Language-guided semantic cues from MLLM visual pipelines, steered by text embeddings, refine object semantics and boost grounding accuracy against occlusion and small objects.
These modules derive and integrate linguistic semantic priors, enhancing grounding robustness against occlusion and small objects in crowded scenes.
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Robust Grounding with MLLMs Against Occlusion and Small Objects via Language-Guided Semantic Cues