Parker, Anton Smirnov, Jordi Pons, C

Julian D · 2024 · arXiv 2411.19842

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

AffectCodec: Emotion-Preserving Neural Speech Codec for Expressive Speech Modeling

cs.SD · 2026-05-11 · unverdicted · novelty 7.0

AffectCodec is an emotion-guided neural speech codec that preserves emotional cues during quantization while maintaining semantic fidelity and prosodic naturalness.

Two-Dimensional Quantization for Geometry-Aware Audio Coding

cs.SD · 2025-12-01 · unverdicted · novelty 6.0

Q2D2 uses 2D geometric grid projections to quantize feature pairs in neural audio codecs, yielding implicit codebooks that improve efficiency and utilization over RVQ, VQ, and FSQ while maintaining reconstruction quality.

Voxtral TTS

cs.AI · 2026-03-26 · unverdicted · novelty 5.0

Voxtral TTS produces expressive multilingual speech from 3-second reference audio with a hybrid autoregressive-plus-flow-matching architecture and a new VQ-FSQ tokenizer, achieving a 68.4% win rate over ElevenLabs in human evaluations.

citing papers explorer

AffectCodec: Emotion-Preserving Neural Speech Codec for Expressive Speech Modeling cs.SD · 2026-05-11 · unverdicted · none · ref 18
Two-Dimensional Quantization for Geometry-Aware Audio Coding cs.SD · 2025-12-01 · unverdicted · none · ref 54
Voxtral TTS cs.AI · 2026-03-26 · unverdicted · none · ref 13
