MAMMQA is a multi-agent framework that decomposes multimodal queries, retrieves modality-specific answers, performs cross-modal synthesis with VLMs, and integrates results via an LLM to outperform single-model baselines on QA benchmarks.
Preprint, arXiv:2306.16762
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Rethinking Information Synthesis in Multimodal Question Answering: A Multi-Agent Perspective