USAD: Universal speech and audio representation via distillation. arXiv preprint arXiv:2506.18843

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

fields

cs.SD 2

years

2026 2

verdicts

UNVERDICTED 2

roles

background 1

polarities

unclear 1

representative citing papers

Stage-adaptive audio diffusion modeling

cs.SD · 2026-05-06 · unverdicted · novelty 6.0

A semantic progress signal from SSL discrepancy slope enables three stage-aware mechanisms that improve training efficiency and performance in audio diffusion models over static baselines.

Alethia: A Foundational Encoder for Voice Deepfakes

cs.SD · 2026-04-30 · unverdicted · novelty 6.0

Alethia is a pretrained audio encoder using continuous embedding prediction and generative flow-matching reconstruction that outperforms existing speech foundation models on voice deepfake tasks with better robustness and zero-shot generalization.

citing papers explorer

Showing 2 of 2 citing papers.

  • Stage-adaptive audio diffusion modeling cs.SD · 2026-05-06 · unverdicted · none · ref 1

  • Alethia: A Foundational Encoder for Voice Deepfakes cs.SD · 2026-04-30 · unverdicted · none · ref 7
