arXiv preprint arXiv:2412.10371 (2024) DVGT-2 21

Wenzhao Zheng, Junjie Wu, Yao Zheng, Sicheng Zuo, Zixun Xie, Longchao Yang, Yong Pan, Zhihui Hao, Peng Jia, Xianpeng Lang, et al · 2024 · arXiv 2412.10371

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 1 baseline 1

citation-polarity summary

background 1 baseline 1

representative citing papers

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

cs.CV · 2026-04-01 · unverdicted · novelty 6.0

DVGT-2 is a streaming vision-geometry-action model that jointly reconstructs dense 3D geometry and plans trajectories online, achieving better reconstruction than prior batch methods while transferring directly to planning benchmarks without fine-tuning.

ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding

cs.CV · 2025-12-03 · unverdicted · novelty 6.0

ShelfGaussian achieves state-of-the-art zero-shot semantic occupancy prediction on Occ3D-nuScenes by jointly supervising Gaussian representations with vision foundation model features at 2D image and 3D scene levels.

PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving

cs.CV · 2025-07-23 · unverdicted · novelty 6.0

PRIX presents an efficient camera-only planner with a novel CaRT module that matches larger multimodal models on NavSim and nuScenes while reducing model size and inference time.

AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning

cs.CV · 2025-06-16 · unverdicted · novelty 6.0

AutoVLA unifies semantic reasoning and trajectory planning in one autoregressive VLA model for end-to-end autonomous driving by tokenizing trajectories into discrete actions and using GRPO reinforcement fine-tuning to adaptively reduce unnecessary reasoning.

SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model

cs.CV · 2025-11-27 · unverdicted · novelty 5.0

A sparse transformer predicts multi-frame 3D occupancy from images without BEV or VAE tokenization and reports SOTA results on nuScenes for 1-3s forecasting under arbitrary trajectories.

citing papers explorer

Showing 5 of 5 citing papers.

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale cs.CV · 2026-04-01 · unverdicted · none · ref 87
DVGT-2 is a streaming vision-geometry-action model that jointly reconstructs dense 3D geometry and plans trajectories online, achieving better reconstruction than prior batch methods while transferring directly to planning benchmarks without fine-tuning.
ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding cs.CV · 2025-12-03 · unverdicted · none · ref 95
ShelfGaussian achieves state-of-the-art zero-shot semantic occupancy prediction on Occ3D-nuScenes by jointly supervising Gaussian representations with vision foundation model features at 2D image and 3D scene levels.
PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving cs.CV · 2025-07-23 · unverdicted · none · ref 61
PRIX presents an efficient camera-only planner with a novel CaRT module that matches larger multimodal models on NavSim and nuScenes while reducing model size and inference time.
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning cs.CV · 2025-06-16 · unverdicted · none · ref 63
AutoVLA unifies semantic reasoning and trajectory planning in one autoregressive VLA model for end-to-end autonomous driving by tokenizing trajectories into discrete actions and using GRPO reinforcement fine-tuning to adaptively reduce unnecessary reasoning.
SparseWorld-TC: Trajectory-Conditioned Sparse Occupancy World Model cs.CV · 2025-11-27 · unverdicted · none · ref 55
A sparse transformer predicts multi-frame 3D occupancy from images without BEV or VAE tokenization and reports SOTA results on nuScenes for 1-3s forecasting under arbitrary trajectories.

arXiv preprint arXiv:2412.10371 (2024) DVGT-2 21

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer