SimVG: A simple framework for visual grounding with decoupled multi-modal fusion. Advances in Neural Information Processing Systems, 37:121670–121698, 2024

Ming Dai, Lingfeng Yang, Yihao Xu, Zhenhua Feng, Wankou Yang · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

cs.CV · 2025-12-03 · unverdicted · novelty 6.0

ThinkDeeper introduces a world-model-based reasoning step that predicts future spatial states to improve multimodal visual grounding for autonomous vehicles, achieving top results on Talk2Car and other benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

cs.CV · 2025-12-03 · unverdicted · none · ref 10