PointVG-R is a new MLLM that sets a new state of the art on pointing localization, with a gain of 15.86 mIoU points, using a geometric reasoning pipeline, the EgoPoint-CoT dataset, supervised fine-tuning (SFT), reinforcement learning (RL), and variance-based reward weighting.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
VistaRef improves pointing-to-object detection accuracy by 14 points via local hand entity modeling, geometric ray modeling, and an orientation-consistent alignment loss.
Citing papers explorer
- PointVG-R: Internalizing Geometric Reasoning in MLLMs for Precise Pointing Localization via Visual Chain of Thought
  PointVG-R is a new MLLM that sets a new state of the art on pointing localization, with a gain of 15.86 mIoU points, using a geometric reasoning pipeline, the EgoPoint-CoT dataset, supervised fine-tuning (SFT), reinforcement learning (RL), and variance-based reward weighting.
- VistaRef: Boosting Visual Spatial Orientation Awareness for Pointing-to-Object Detection
  VistaRef improves pointing-to-object detection accuracy by 14 points via local hand entity modeling, geometric ray modeling, and an orientation-consistent alignment loss.