ObjectRelator: Enabling cross-view object relation understanding in ego-centric and exo-centric videos. arXiv preprint arXiv:2411.19083, 2024.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
CrossView Suite supplies a 1.6M-sample dataset, scene-disjoint benchmark, and explicit-alignment framework to advance MLLMs from single-view perception to cross-view spatial intelligence.
citing papers explorer
- V$^{2}$-SAM: Marrying SAM2 with Multi-Prompt Experts for Cross-View Object Correspondence
V2-SAM adapts SAM2 to cross-view object correspondence with geometry-aware and appearance-based prompt generators plus a post-hoc cyclic consistency selector, reporting new state-of-the-art results on Ego-Exo4D, DAVIS-2017, and HANDAL-X.
- CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark
CrossView Suite supplies a 1.6M-sample dataset, scene-disjoint benchmark, and explicit-alignment framework to advance MLLMs from single-view perception to cross-view spatial intelligence.