mKG-RAG constructs multimodal KGs via MLLM-driven extraction and vision-text matching then applies dual-stage query-aware retrieval to achieve new state-of-the-art results on knowledge-based VQA.
Recommender systems in the era of large language models (llms) // IEEE Transactions on Knowledge and Data Engineering
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
mKG-RAG: Leveraging Multimodal Knowledge Graphs in Retrieval-Augmented Generation for Knowledge-intensive VQA
mKG-RAG constructs multimodal KGs via MLLM-driven extraction and vision-text matching then applies dual-stage query-aware retrieval to achieve new state-of-the-art results on knowledge-based VQA.