CleanCodec reframes audio tokenization as a selective information bottleneck to encode only perceptually important features at 12.5 tokens per second, outperforming prior codecs in efficiency, speaker similarity, and intelligibility.
FSD50K: an open dataset of human-labeled sound events,
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.SD 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
A parameter-efficient dual-encoder model with differentiable Choquet integral fusion improves underwater acoustic classification accuracy over single-encoder baselines on DeepShip and ShipsEar datasets.
citing papers explorer
-
CleanCodec: Efficient and Robust Speech Tokenization via Perceptually Guided Encoding
CleanCodec reframes audio tokenization as a selective information bottleneck to encode only perceptually important features at 12.5 tokens per second, outperforming prior codecs in efficiency, speaker similarity, and intelligibility.
-
Parameter-efficient Dual-encoder Architecture with Differentiable Choquet Integral Fusion for Underwater Acoustic Classification
A parameter-efficient dual-encoder model with differentiable Choquet integral fusion improves underwater acoustic classification accuracy over single-encoder baselines on DeepShip and ShipsEar datasets.