EnCodec is an end-to-end trained streaming neural audio codec that uses a single multiscale spectrogram discriminator and a gradient-normalizing loss balancer to achieve higher fidelity than prior methods at the same bitrates for 24 kHz mono and 48 kHz stereo audio.
Generative spoken dialogue language modeling
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
representative citing papers
DuplexOmni achieves real-time full-duplex multimodal interaction by separating an interaction layer from a pluggable thinking layer, supported by a Writer-Director pipeline for continuous-interaction training data.
citing papers explorer
No citing papers match the current filters.