In: Interspeech 2022

Takaaki Saeki, Detai Xin, Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari · 2022 · DOI 10.21437/interspeech.2022-439

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Sign-to-Speech Prosody Transfer via Sign Reconstruction-based GAN

cs.SD · 2026-04-12 · unverdicted · novelty 7.0

SignRecGAN trains on separate sign and speech datasets via adversarial and reconstruction objectives to inject sign-derived prosody into TTS output using the S2PFormer model.

CleanCodec: Efficient and Robust Speech Tokenization via Perceptually Guided Encoding

cs.SD · 2026-06-03 · unverdicted · novelty 6.0

CleanCodec reframes audio tokenization as a selective information bottleneck to encode only perceptually important features at 12.5 tokens per second, outperforming prior codecs in efficiency, speaker similarity, and intelligibility.

EntangleCodec: A Unified Discrete Audio Tokenizer via Semantic-Acoustic Entanglement

cs.SD · 2026-06-01 · unverdicted · novelty 4.0

EntangleCodec unifies semantic and acoustic audio tokenization via caption alignment and flow-matching decoding, reporting competitive reconstruction, +7.4% gains on MMAR understanding, and 0.6B-parameter ALMs surpassing 13B-parameter continuous baselines.

citing papers explorer

Sign-to-Speech Prosody Transfer via Sign Reconstruction-based GAN

cs.SD · 2026-04-12 · unverdicted · none · ref 22

CleanCodec: Efficient and Robust Speech Tokenization via Perceptually Guided Encoding

cs.SD · 2026-06-03 · unverdicted · none · ref 30

EntangleCodec: A Unified Discrete Audio Tokenizer via Semantic-Acoustic Entanglement

cs.SD · 2026-06-01 · unverdicted · none · ref 25
