A self-supervised prosody encoder with speaker disentanglement strategies outperforms raw prosody and HuBERT baselines on pitch reconstruction and prosodic event detection while achieving strong speaker separation.
IEEE International Conference on Acoustics, Speech, and Signal Processing / sponsored by the Institute of Electrical and Electronics Engineers Signal Processing Society
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
eess.AS 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Privacy-preserving Prosody Representation Learning
A self-supervised prosody encoder with speaker disentanglement strategies outperforms raw prosody and HuBERT baselines on pitch reconstruction and prosodic event detection while achieving strong speaker separation.