Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs

Wang, D · 2025 · arXiv 2508.17863

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Speech Meets ELF: Audio Conditional Continuous-Target Diffusion for Speech Recognition and Translation

cs.SD · 2026-06-09 · unverdicted · novelty 6.0

ELF-S2T applies audio-conditioned flow-matching on continuous text latents from pre-trained ELF to achieve competitive ASR and S2TT results, with analysis showing shared close-distance confusion in latent space.

HybridCodec: Modeling Discrete and Continuous Representations for Efficient Speech Language Models

cs.LG · 2026-06-26 · unverdicted · novelty 5.0

HybridCodec combines discrete tokens with continuous residuals via a focal modulation codec and hybrid Transformer to improve speaker retention and reduce autoregressive steps in speech language models.

Towards Building Speech Large Language Models for Multitask Understanding in Low-Resource Languages

cs.SD · 2025-09-18 · unverdicted · novelty 5.0

Introduces XLSR-Thai encoder, U-Align alignment, and Thai-SUP data pipeline to enable multitask speech understanding SLLMs for Thai.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Speech Meets ELF: Audio Conditional Continuous-Target Diffusion for Speech Recognition and Translation

cs.SD · 2026-06-09 · unverdicted · none · ref 20

ELF-S2T applies audio-conditioned flow-matching on continuous text latents from pre-trained ELF to achieve competitive ASR and S2TT results, with analysis showing shared close-distance confusion in latent space.

Towards Building Speech Large Language Models for Multitask Understanding in Low-Resource Languages

cs.SD · 2025-09-18 · unverdicted · none · ref 17

Introduces XLSR-Thai encoder, U-Align alignment, and Thai-SUP data pipeline to enable multitask speech understanding SLLMs for Thai.