High-Quality, Low-Delay Music Coding in the Opus Codec
The IETF recently standardized the Opus codec as RFC 6716. Opus targets a wide range of real-time Internet applications by combining a linear prediction coder with a transform coder. We describe the transform coder, with particular attention to the psychoacoustic knowledge built into the format. The result outperforms existing audio codecs that do not operate under real-time constraints.
Forward citations
Cited by 4 Pith papers
- Codec-Robust Attacks on Audio LLMs
  CodecAttack perturbs audio in codec latent space with multi-bitrate EoT to achieve 85.5% average ASR on Opus-compressed Audio LLMs versus under 26% for waveform baselines, with transfer to MP3 and AAC.
- AffectCodec: Emotion-Preserving Neural Speech Codec for Expressive Speech Modeling
  AffectCodec is an emotion-guided neural speech codec that preserves emotional cues during quantization while maintaining semantic fidelity and prosodic naturalness.
- Codec-Robust Attacks on Audio LLMs
  CodecAttack optimizes perturbations in neural audio codec latent space to reach 85.5% average target-substring ASR on compressed Opus audio while waveform baselines stay below 26%.
- Benchmarking Audio Deepfake Detection Robustness in Real-world Communication Scenarios
  The paper creates the ADD-C benchmark dataset for audio deepfake detection under codec compression and packet loss, shows that baseline detectors degrade, and demonstrates a data augmentation method that boosts robustness.