GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

Barbara Rössle; Katharina Schmid; Matthias Nießner; Nicolas von Lützow

arxiv: 2603.26661 · v2 · pith:56LUWHQ6new · submitted 2026-03-27 · 💻 cs.CV

Nicolas von Lützow, Barbara Rössle, Katharina Schmid, Matthias Nießner

classification 💻 cs.CV

keywords: generation, autoregressive, controllable, formulation, gaussian, gaussiangpt, modeling, scene

Most recent advances in 3D generative modeling rely on diffusion or flow-matching formulations. We instead explore a fully autoregressive alternative and introduce GaussianGPT, a transformer-based model that directly generates 3D Gaussians via next-token prediction, thus facilitating full 3D scene generation. We first compress Gaussian primitives into a discrete latent grid using a sparse 3D convolutional autoencoder with vector quantization. The resulting tokens are serialized and modeled using a causal transformer with 3D rotary positional embedding, enabling sequential generation of spatial structure and appearance. Unlike diffusion-based methods that refine scenes holistically, our formulation constructs scenes step-by-step, naturally supporting completion, outpainting, controllable sampling via temperature, and flexible generation horizons. This formulation leverages the compositional inductive biases and scalability of autoregressive modeling while operating on explicit representations compatible with modern neural rendering pipelines, positioning autoregressive transformers as a complementary paradigm for controllable and context-aware 3D generation.
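
The abstract names temperature as one of the sampling controls that next-token prediction offers. The sketch below shows the general technique only and is not the authors' implementation: the function names, the four-entry vocabulary, and the logit values are all hypothetical, and a real model would produce logits over its learned codebook. Dividing the logits by a temperature below 1 concentrates probability on the most likely token, while a temperature above 1 flattens the distribution.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits over a 4-entry vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.5, -1.0]
sharp = softmax_with_temperature(logits, temperature=0.5)  # more peaked
flat = softmax_with_temperature(logits, temperature=2.0)   # more uniform
token = sample_next_token(logits, temperature=1.0, rng=random.Random(0))
```

In an autoregressive generator, a step like `sample_next_token` would presumably run once per position, with each sampled token appended to the context before the next prediction.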

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

BrickAnything: Geometry-Conditioned Buildable Brick Generation with Structure-Aware Tokenization
cs.AI 2026-05 unverdicted novelty 6.0

BrickAnything generates buildable brick structures from 3D point clouds via geometry-conditioned autoregressive prediction with structure-aware tree tokenization and post-training for stability.