Can Large Language Models Understand Spatial Audio?

Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li + 2 more · 2024 · Interspeech 2024 · DOI 10.21437/interspeech.2024-2419

1 Pith paper cites this work, alongside 8 external citations. Polarity classification is still being indexed.


8 external citations · Crossref


representative citing papers

SpeakerLLM: A Speaker-Specialized Audio-LLM for Speaker Understanding and Verification Reasoning

cs.SD · 2026-05-14 · unverdicted · novelty 6.0

SpeakerLLM unifies speaker profiling, recording-condition understanding, and structured verification reasoning in an audio-LLM via a hierarchical tokenizer and decision traces.

citing papers explorer

Showing 1 of 1 citing paper.

SpeakerLLM: A Speaker-Specialized Audio-LLM for Speaker Understanding and Verification Reasoning

cs.SD · 2026-05-14 · unverdicted · none · ref 6
