GlobalSplat: Efficient Feed-Forward 3D Gaussian Splatting via Global Scene Tokens

2026 · cs.CV · arXiv 2604.15284

3 Pith papers cite this work. Polarity classification is still indexing.


abstract

The efficient spatial allocation of primitives serves as the foundation of 3D Gaussian Splatting, as it directly dictates the synergy between representation compactness, reconstruction speed, and rendering fidelity. Previous solutions, whether based on iterative optimization or feed-forward inference, suffer from significant trade-offs between these goals, mainly due to their reliance on local, heuristic-driven allocation strategies that lack global scene awareness. Specifically, current feed-forward methods are largely pixel-aligned or voxel-aligned. By unprojecting pixels into dense, view-aligned primitives, they bake redundancy into the 3D asset. As more input views are added, the representation size increases and global consistency becomes fragile. To this end, we introduce GlobalSplat, a framework built on the principle of align first, decode later. Our approach learns a compact, global, latent scene representation that encodes multi-view input and resolves cross-view correspondences before decoding any explicit 3D geometry. Crucially, this formulation enables compact, globally consistent reconstructions without relying on pretrained pixel-prediction backbones or reusing latent features from dense baselines. Utilizing a coarse-to-fine training curriculum that gradually increases decoded capacity, GlobalSplat natively prevents representation bloat. On RealEstate10K and ACID, our model achieves competitive novel-view synthesis performance while utilizing as few as 16K Gaussians, significantly fewer than dense pipelines require, obtaining a light 4MB footprint. Further, GlobalSplat enables significantly faster inference than the baselines, running in under 78 milliseconds in a single forward pass. The project page is available at https://r-itk.github.io/globalsplat/
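The abstract's point about representation growth can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from the paper: the 256x256 input resolution, the one-Gaussian-per-pixel rule, the reading of "16K" as 16,384, and the 59-float-per-Gaussian layout (3 position, 3 scale, 4 rotation, 1 opacity, 48 spherical-harmonic coefficients, stored as 32-bit floats) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def pixel_aligned_count(num_views: int, height: int, width: int,
                        per_pixel: int = 1) -> int:
    """Gaussians emitted when every input pixel is unprojected (assumed rule)."""
    return num_views * height * width * per_pixel


def footprint_bytes(num_gaussians: int, floats_per_gaussian: int,
                    bytes_per_float: int = 4) -> int:
    """Raw storage for a set of Gaussians, ignoring any compression."""
    return num_gaussians * floats_per_gaussian * bytes_per_float


# Assumption: "16K" is read as 16 * 1024 Gaussians.
FIXED_BUDGET = 16 * 1024

# Assumption: a common uncompressed 3DGS layout, not confirmed for GlobalSplat.
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48

if __name__ == "__main__":
    for views in (2, 4, 8):
        dense = pixel_aligned_count(views, 256, 256)
        ratio = dense // FIXED_BUDGET
        print(f"{views} views: {dense} pixel-aligned Gaussians, "
              f"{ratio}x the fixed budget")

    size = footprint_bytes(FIXED_BUDGET, FLOATS_PER_GAUSSIAN)
    print(f"fixed budget footprint: {size / 1e6:.2f} MB")
```

Under these assumptions the pixel-aligned count grows linearly with the number of views while the fixed budget stays constant, and the fixed budget's raw size comes out in the same range as the 4MB figure quoted in the abstract. The paper's actual per-Gaussian parameterization is not stated in this listing, so the match should be read as a plausibility check only.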

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

AdaptSplat: Adapting Vision Foundation Models for Feed-Forward 3D Gaussian Splatting

cs.CV · 2026-05-11 · unverdicted · novelty 7.0 · 2 refs

AdaptSplat adds a Frequency-Preserving Adapter to vision foundation models to boost high-frequency fidelity and cross-domain performance in feed-forward 3D Gaussian Splatting.

PRISM: Feed-Forward Single-Image 3D Reconstruction via Geometric Warp-Residual Modeling

cs.CV · 2026-06-24 · unverdicted · novelty 6.0

PRISM is a feed-forward framework that decomposes single-image 3D reconstruction into a geometric warp prior plus residual correction, claiming competitive quality at 36-second inference.

Learning Stable Canonical Worlds for Novel View Synthesis and Beyond

cs.CV · 2026-06-22 · unverdicted · novelty 4.0

CanonicalGS aggregates view-centric evidence into a canonical latent world with uncertainty-aware fusion to improve novel view synthesis and downstream perception tasks.

