Segment anything
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- SEDualVLN: A Spatially-Enhanced Dual-System for Vision-Language Navigation
  SEDualVLN proposes a spatially-enhanced dual-system VLN framework that pairs a fast VLM action generator with a slow MLLM waypoint planner and reports state-of-the-art results on VLN-CE benchmarks.
- RAFT-MSF++: Temporal Geometry-Motion Feature Fusion for Self-Supervised Monocular Scene Flow
  RAFT-MSF++ recurrently fuses Geometry-Motion Features across frames with positional attention and occlusion regularization to improve self-supervised monocular scene flow estimation.