A training-free GAR pipeline leverages MLLM meta-reasoning for sound source localization by generating bounding boxes, quantifying consistency via role tagging and voting, and applying adaptive refinement, achieving competitive benchmark results.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Generate, Analyze, and Refine: Training-Free Sound Source Localization via MLLM Meta-Reasoning
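The summary of this paper mentions quantifying consistency among generated bounding boxes by voting. The listing gives no algorithmic detail, so the following is only a minimal sketch of one generic way such voting could work, not the paper's method. It assumes several candidate boxes come from repeated generations, that agreement is measured by intersection-over-union (IoU), and that the largest group of agreeing boxes is averaged. The names `iou` and `vote_on_boxes` and the `iou_threshold` value are hypothetical. Role tagging and adaptive refinement are not modeled.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates, with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def vote_on_boxes(
    candidates: List[Box], iou_threshold: float = 0.5
) -> Tuple[Box, float]:
    """Return (consensus_box, consistency_score).

    Assumption (not from the paper): each candidate is supported by every
    candidate whose IoU with it reaches iou_threshold. The largest supporter
    group is averaged into the consensus box, and the score is the fraction
    of candidates that belong to that group.
    """
    if not candidates:
        raise ValueError("need at least one candidate box")

    # Fall back to the first candidate if no box gathers any supporters
    # (this can only happen when every box has zero area).
    best_group: List[Box] = [candidates[0]]
    for c in candidates:
        group = [o for o in candidates if iou(c, o) >= iou_threshold]
        if len(group) > len(best_group):
            best_group = group

    n = len(best_group)
    consensus: Box = (
        sum(b[0] for b in best_group) / n,
        sum(b[1] for b in best_group) / n,
        sum(b[2] for b in best_group) / n,
        sum(b[3] for b in best_group) / n,
    )
    return consensus, n / len(candidates)


if __name__ == "__main__":
    # Two boxes that largely overlap and one outlier (made-up values).
    boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (200, 200, 240, 240)]
    box, score = vote_on_boxes(boxes)
    print(box, round(score, 3))
```

In a pipeline of this kind, a low consistency score could serve as the signal that a prediction needs a further refinement pass, while a high score could let the consensus box be accepted directly. That use of the score is an assumption about the general idea, not a description of the paper.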