We show that the usage of text is higher for multiple-choice questions, aligning with results from Vision LLMs.

Examine how the Audio LLMs use the two modalities

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Investigating Modality Contribution in Audio LLMs for Music

cs.LG · 2025-09-25 · unverdicted · novelty 6.0

Adapts MM-SHAP to quantify modality contributions in two Audio LLMs on MuChoMusic, showing text dominance alongside limited audio localization of key events.

citing papers explorer


Investigating Modality Contribution in Audio LLMs for Music

cs.LG · 2025-09-25 · unverdicted · none · ref 4

Adapts MM-SHAP to quantify modality contributions in two Audio LLMs on MuChoMusic, showing text dominance alongside limited audio localization of key events.
