Title resolution pending

Wu, H · 2024 · arXiv 2411.18803

4 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. This hub is built from the citation graph; the title resolver will retry the DOI and OpenAlex lookups on its next pass.

representative citing papers

FlexiSLM: A Dynamic and Controllable Frame Rate Spoken Language Model

cs.SD · 2026-06-30 · unverdicted · novelty 7.0

FlexiSLM is the first spoken language model supporting dynamic and controllable frame rates on speech input and output, outperforming fixed-rate 7B models at high quality and enabling faster inference at lower rates like 6.25 Hz.

Self-Guidance: Enhancing Neural Codecs via Decoder Manifold Alignment

cs.SD · 2026-06-11 · unverdicted · novelty 6.0

Self-guidance adds a lightweight feature-mapping loss to align decoder manifolds in VQ-VAE speech codecs, raising reconstruction metrics and allowing 4x codebook reduction with no fidelity loss.

Afrispeech Semantics: Evaluating Audio Semantic Reasoning in Spoken Language Models Across Domains and Accents

cs.CL · 2026-05-11 · unverdicted · novelty 4.0

Audio language models are benchmarked on five semantic and paralinguistic reasoning tasks to reveal limitations in handling spoken audio evidence, accent variation, and domain shifts.

From Objectives to Applications: Aligning Architectural Biases in Audio Self-Supervised Learning

eess.AS · 2026-07-01 · unverdicted · novelty 3.0

A survey that organizes audio SSL into five objective paradigms, relates their demands to architectural biases, and interprets downstream applications as tests of generalization.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Afrispeech Semantics: Evaluating Audio Semantic Reasoning in Spoken Language Models Across Domains and Accents

cs.CL · 2026-05-11 · unverdicted · none · ref 51

Audio language models are benchmarked on five semantic and paralinguistic reasoning tasks to reveal limitations in handling spoken audio evidence, accent variation, and domain shifts.
