arXiv preprint arXiv:2305.15255

7 Pith papers cite this work. Polarity classification is still indexing.

Citing papers by year: 2026 (6), 2025 (1)

representative citing papers

Audio Interaction Model

cs.SD · 2026-06-03 · unverdicted · novelty 6.0

Audio-Interaction unifies offline and online audio tasks into one streaming model via the SoundFlow framework and a new 2.6M-item streaming corpus, enabling real-time instruction following and proactive responses.

Learning When to Think While Listening in Large Audio-Language Models

cs.CL · 2026-05-26 · unverdicted · novelty 6.0

A wait-think-answer controller for LALMs is trained via SFT followed by six-reward DAPO, raising row-weighted accuracy from 67.6% to 70.3% and cutting post-endpoint thinking length by 14% on synthetic spoken QA while remaining functional on real recorded audio.

Enhancing Speech Large Language Models through Reinforced Behavior Alignment

cs.CL · 2025-08-25 · unverdicted · novelty 5.0

Reinforced Behavior Alignment (RBA) uses self-synthesized data from a teacher LLM and reinforcement learning to close the instruction-following gap in SpeechLMs, outperforming distillation and reaching SOTA on spoken QA and speech-to-text translation benchmarks.
