MyGO-Splat is a closed-loop RGB-only Gaussian SLAM system that rasterizes depth and normals from the map to supervise pose optimization and align monocular depth priors for scale consistency.
FrameVGGT: Geometry-Aligned Frame-Level Memory for Bounded Streaming VGGT
3 Pith papers cite this work.
abstract
Streaming Visual Geometry Transformers such as StreamVGGT enable strong online 3D perception, but their KV-cache grows unbounded over long streams, limiting practical deployment. We study bounded-memory streaming geometry from the perspective of memory organization: unlike language modeling, where useful information can often be compressed at token level, geometry-driven inference relies on coherent and mutually compatible observations across views. Under fixed memory budgets, retaining history as isolated entries can progressively fragment the geometric context needed for stable long-horizon matching and fusion. We therefore propose FrameVGGT, a bounded-memory framework that maintains a fixed-capacity set of complementary memory units for streaming geometry. In our implementation, each unit is instantiated as a frame-wise KV segment summarized by a compact key-space prototype, together with a sparse anchor tier for persistent long-range references. Across long-sequence 3D reconstruction, video depth estimation, and camera pose estimation, FrameVGGT achieves favorable accuracy-memory trade-offs under bounded budgets while maintaining more stable geometry over long streams.
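To make the memory organization in the abstract concrete, here is a minimal sketch of a fixed-capacity, frame-level memory with a small anchor tier. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify how prototypes are computed, how anchors are chosen, or which unit is evicted, so the mean-key prototype, the every-N-th-frame anchor rule, and the "evict the most redundant frame" policy are all assumed here. The names `FrameUnit`, `BoundedFrameMemory`, `anchor_every`, and `max_anchors` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class FrameUnit:
    """One memory unit: the complete KV segment of a single frame."""
    frame_id: int
    keys: list    # one key vector (list of floats) per token
    values: list  # one value vector per token
    prototype: list = field(init=False)

    def __post_init__(self):
        # Assumption: the key-space prototype is the mean key vector.
        n = len(self.keys)
        self.prototype = [sum(col) / n for col in zip(*self.keys)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class BoundedFrameMemory:
    """Hypothetical bounded memory: a small anchor tier plus a capped frame set."""

    def __init__(self, capacity, anchor_every, max_anchors):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity          # max non-anchor frames kept
        self.anchor_every = anchor_every  # assumed anchor selection rule
        self.max_anchors = max_anchors    # keeps the anchor tier bounded too
        self.anchors = []                 # persistent long-range references
        self.units = []                   # bounded working set of frames

    def add(self, unit):
        is_anchor = unit.frame_id % self.anchor_every == 0
        if is_anchor and len(self.anchors) < self.max_anchors:
            self.anchors.append(unit)
            return
        self.units.append(unit)
        if len(self.units) > self.capacity:
            self._evict_most_redundant()

    def _evict_most_redundant(self):
        # Assumed policy: drop the whole frame whose prototype is closest to
        # another stored prototype, so the survivors stay complementary.
        def redundancy(i):
            return max(
                cosine(self.units[i].prototype, self.units[j].prototype)
                for j in range(len(self.units)) if j != i
            )
        # max() returns the first maximum, so ties evict the older frame.
        victim = max(range(len(self.units)), key=redundancy)
        del self.units[victim]

    def context(self):
        """Frames whose KV segments would be attended over, in frame order."""
        return sorted(self.anchors + self.units, key=lambda u: u.frame_id)
```

Under these assumptions, eviction always removes an entire frame rather than individual tokens, so every retained entry remains a self-consistent view. That is the property the abstract contrasts with token-level compression.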
years
2026 (3)
verdicts
UNVERDICTED (3)
representative citing papers
A two-stage diversity-plus-entropy token selection framework speeds up visual geometry transformers by over 85% on 500-image scenes while preserving baseline accuracy.
PanoImager is an SfM-free pipeline combining feed-forward priors, geometry-conditioned diffusion view completion, and depth-guided 3DGS optimization to reconstruct from sparse panoramic images.
citing papers explorer
- Good Token Hunting: A Hitchhiker's Guide to Token Selection for Visual Geometry Transformers
  A two-stage diversity-plus-entropy token selection framework speeds up visual geometry transformers by over 85% on 500-image scenes while preserving baseline accuracy.
- PanoImager: Geometry-Guided Novel View Synthesis and Reconstruction from Sparse Panoramic Views
  PanoImager is an SfM-free pipeline combining feed-forward priors, geometry-conditioned diffusion view completion, and depth-guided 3DGS optimization to reconstruct from sparse panoramic images.