3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- OneCanvas: 3D Scene Understanding via Panoramic Reprojection
OneCanvas aggregates multi-view 3D patches onto one panoramic canvas with continuous angular placement and 3D embeddings, enabling pretrained VLMs to achieve SOTA on SQA3D and VSI-Bench with an order of magnitude less compute via a new spatial pretraining curriculum.
- VLM-GLoc: Vision-Language Model Enhanced Monte Carlo Localization for Robust Semantic Global Localization in Cluttered Quasi-Static Environments
VLM-GLoc is a hierarchical semantic Monte Carlo Localization system that uses VLMs for discriminative observations and inverse text-to-map proposals, reporting 70% and 74% success in a grocery store and lab respectively.
- PointVG-R: Internalizing Geometric Reasoning in MLLMs for Precise Pointing Localization via Visual Chain of Thought
PointVG-R is a new MLLM that reaches SOTA on pointing localization by 15.86 mIoU points via a geometric reasoning pipeline, EgoPoint-CoT dataset, SFT, RL, and variance-based reward weighting.