Storm: Spatio-temporal reconstruction model for large-scale outdoor scenes

2024 · arXiv 2501.00602

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 2 · baseline: 1

citation-polarity summary

background: 2 · baseline: 1

representative citing papers

PointForward: Feedforward Driving Reconstruction through Point-Aligned Representations

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

PointForward uses sparse world-space 3D queries and scene graphs to deliver consistent single-pass reconstruction of dynamic driving scenes via point-aligned representations.

ConFixGS: Learning to Fix Feedforward 3D Gaussian Splatting with Confidence-Aware Diffusion Priors in Driving Scenes

cs.CV · 2026-05-10 · unverdicted · novelty 7.0

ConFixGS repairs feedforward 3D Gaussian Splatting with confidence-aware diffusion priors, delivering up to 3.68 dB PSNR gains and halved FID scores on Waymo, nuScenes, and KITTI novel view synthesis tasks.

Ground4D: Spatially-Grounded Feedforward 4D Reconstruction for Unstructured Off-Road Scenes

cs.CV · 2026-05-06 · unverdicted · novelty 7.0

Ground4D resolves temporal conflicts in feedforward 4D Gaussian reconstruction for off-road scenes via voxel-grounded temporal aggregation with intra-voxel softmax and surface normal regularization, outperforming prior methods on ORAD-3D and RELLIS-3D while generalizing zero-shot.

TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens

cs.CV · 2026-04-16 · unverdicted · novelty 7.0

TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.

FFAvatar: Feed-Forward 4D Head Avatar Reconstruction from Sparse Portrait Images

cs.CV · 2026-06-29 · unverdicted · novelty 6.0

FFAvatar uses a Transformer-based 3D Gaussian model with alternating attention and sparse-to-dense learning to enable feed-forward, incremental reconstruction of animatable 4D head avatars from sparse portrait images.

Envision4D: Envisioning Visual Futures via Feed-forward 4D Gaussian Splatting for Autonomous Driving

cs.CV · 2026-06-09 · unverdicted · novelty 6.0

Envision4D presents a feed-forward 4D Gaussian Splatting framework with future pose prediction, temporal attention, and conditioned motion lifting for pose-free extrapolation in autonomous driving scenes.

EnerGS: Energy-Based Gaussian Splatting with Partial Geometric Priors

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

EnerGS introduces an energy-based soft guidance mechanism for partial geometry in 3D Gaussian Splatting to improve reconstruction quality and reduce overfitting in sparse outdoor settings.

GaussianDWM: 3D Gaussian Driving World Model for Unified Scene Understanding and Multi-Modal Generation

cs.CV · 2025-12-29 · unverdicted · novelty 6.0

GaussianDWM uses 3D Gaussians with embedded linguistic features, language-guided sampling, and dual-condition generation for unified scene understanding and multi-modal output in driving world models.

Flux4D: Flow-based Unsupervised 4D Reconstruction

cs.CV · 2025-12-02 · unverdicted · novelty 6.0

Flux4D reconstructs large-scale dynamic 4D scenes unsupervised by predicting moving 3D Gaussians from photometric losses and static regularization when trained across multiple scenes.

SimScale: Learning to Drive via Real-World Simulation at Scale

cs.CV · 2025-11-28 · conditional · novelty 6.0

SimScale synthesizes unseen driving states from real logs via neural rendering and reactive environments, generates pseudo-expert trajectories, and shows that co-training on real plus simulated data improves planning robustness and generalization on real benchmarks, with gains scaling by simulation

L2D2-GS: Learning to Densify for Feedforward Dynamic Gaussian Scene Reconstruction

cs.CV · 2026-06-28 · unverdicted · novelty 5.0

L2D2-GS reformulates generalizable dynamic Gaussian reconstruction as iterative optimization with a self-supervised densification policy and geometric regularization, claiming SOTA fidelity and zero-shot generalization on PandaSet and Waymo with fewer primitives.

Xiaomi Auto World Model: A Joint World Model Integrating Reconstruction and Generation for Autonomous Driving

cs.CV · 2026-05-18 · unverdicted · novelty 5.0 · 2 refs

A unified system integrating sparse-query 3D Gaussian reconstruction with multi-stage causal video generation for autonomous driving world models.

citing papers explorer

Showing 12 of 12 citing papers.

PointForward: Feedforward Driving Reconstruction through Point-Aligned Representations
cs.CV · 2026-05-12 · unverdicted · none · ref 36

ConFixGS: Learning to Fix Feedforward 3D Gaussian Splatting with Confidence-Aware Diffusion Priors in Driving Scenes
cs.CV · 2026-05-10 · unverdicted · none · ref 62

Ground4D: Spatially-Grounded Feedforward 4D Reconstruction for Unstructured Off-Road Scenes
cs.CV · 2026-05-06 · unverdicted · none · ref 49

TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
cs.CV · 2026-04-16 · unverdicted · none · ref 49

FFAvatar: Feed-Forward 4D Head Avatar Reconstruction from Sparse Portrait Images
cs.CV · 2026-06-29 · unverdicted · none · ref 67

Envision4D: Envisioning Visual Futures via Feed-forward 4D Gaussian Splatting for Autonomous Driving
cs.CV · 2026-06-09 · unverdicted · none · ref 14

EnerGS: Energy-Based Gaussian Splatting with Partial Geometric Priors
cs.CV · 2026-04-29 · unverdicted · none · ref 44

GaussianDWM: 3D Gaussian Driving World Model for Unified Scene Understanding and Multi-Modal Generation
cs.CV · 2025-12-29 · unverdicted · none · ref 57

Flux4D: Flow-based Unsupervised 4D Reconstruction
cs.CV · 2025-12-02 · unverdicted · none · ref 60

SimScale: Learning to Drive via Real-World Simulation at Scale
cs.CV · 2025-11-28 · conditional · none · ref 80

L2D2-GS: Learning to Densify for Feedforward Dynamic Gaussian Scene Reconstruction
cs.CV · 2026-06-28 · unverdicted · none · ref 7

Xiaomi Auto World Model: A Joint World Model Integrating Reconstruction and Generation for Autonomous Driving
cs.CV · 2026-05-18 · unverdicted · none · ref 13 · 2 links
