In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 606–610

Text-to-audio grounding: Building correspondence between captions and sound events · 2021 · arXiv 2507.18897

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Listen, Pause, and Reason: Toward Perception-Grounded Hybrid Reasoning for Audio Understanding

cs.SD · 2026-04-16 · unverdicted · novelty 6.0

HyPeR is a hybrid perception-reasoning framework that uses a new hierarchical PAQA dataset and PAUSE tokens to improve large audio language models' handling of multi-speaker and ambiguous audio.

