arxiv: 2602.23172 · v2 · pith:U7FI6JM3new · submitted 2026-02-26 · 💻 cs.CV · cs.AI · cs.RO

Latent Gaussian Splatting for 4D Panoptic Occupancy Tracking

keywords gaussian · occupancy · tracking · coarse · d-pot · dynamic · enables · features

Capturing 4D spatiotemporal scene structure is crucial for the safe and reliable operation of robots in dynamic environments. However, existing approaches typically address only part of the problem: they either provide coarse geometric tracking via bounding boxes or detailed 3D occupancy estimates that lack explicit temporal association and instance-level reasoning. In this work, we present Latent Gaussian Splatting (LaGS) for 4D Panoptic Occupancy Tracking (4D-POT). We revisit the underlying representation and model 3D features as a sparse set of feature-bearing Gaussians. These act as dynamic, volume-oriented keypoints that enable spatially continuous, distance-weighted aggregation of multi-view features before being splatted into a voxel grid for decoding. This point-centric formulation enables flexible, data-dependent receptive fields and long-range spatial interactions that are difficult to capture with local and dense voxel-based operators. A hierarchical Gaussian representation further enables multi-scale reasoning by combining global context from coarse super-points with fine-grained detail from higher-resolution streams. Extensive experiments on Occ3D nuScenes and Waymo demonstrate state-of-the-art performance for 4D-POT. We provide code and models at https://lags.cs.uni-freiburg.de/.
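The splatting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes isotropic Gaussians, a dense axis-aligned grid, and weight-normalized feature averaging, none of which the abstract specifies. The function name `splat_gaussians` and all of its parameters are hypothetical.

```python
import math


def splat_gaussians(means, sigmas, feats, grid_shape, voxel_size=1.0, eps=1e-8):
    """Splat feature-bearing Gaussians into a dense voxel grid.

    Assumptions (not taken from the paper): isotropic Gaussians with one
    scale each, and each voxel storing the Gaussian-weighted mean feature.
    Returns a nested list indexed as grid[ix][iy][iz] -> feature vector.
    """
    nx, ny, nz = grid_shape
    dim = len(feats[0])
    grid = [[[None for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]
    for ix in range(nx):
        for iy in range(ny):
            for iz in range(nz):
                # Voxel center in the same coordinate frame as the means.
                center = (
                    (ix + 0.5) * voxel_size,
                    (iy + 0.5) * voxel_size,
                    (iz + 0.5) * voxel_size,
                )
                acc = [0.0] * dim
                weight_sum = 0.0
                for mu, sigma, feat in zip(means, sigmas, feats):
                    # Squared distance from the voxel center to the Gaussian mean.
                    d2 = sum((c - m) ** 2 for c, m in zip(center, mu))
                    # Distance-based weight: closer Gaussians contribute more.
                    w = math.exp(-0.5 * d2 / (sigma * sigma))
                    weight_sum += w
                    for k in range(dim):
                        acc[k] += w * feat[k]
                # Normalize so the voxel holds a weighted average of features.
                grid[ix][iy][iz] = [a / (weight_sum + eps) for a in acc]
    return grid


# Two Gaussians with scalar features 1.0 and 3.0, splatted into a 2x1x1 grid.
grid = splat_gaussians(
    means=[(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)],
    sigmas=[1.0, 1.0],
    feats=[[1.0], [3.0]],
    grid_shape=(2, 1, 1),
)
```

In this toy case each voxel blends both features, leaning toward the nearer Gaussian. A practical system would presumably restrict each Gaussian to nearby voxels and run on a GPU rather than looping over the whole grid.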

This paper has not been read by Pith yet.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Streaming Gaussian Encoding for 4D Panoptic Occupancy Tracking

    cs.CV 2026-06 unverdicted novelty 7.0

    Introduces a streaming Gaussian encoder maintaining persistent volumetric representations via ego-motion compensation and confidence-guided updates for improved 4D panoptic occupancy tracking from cameras.

  2. Hyp2Former: Hierarchy-Aware Hyperbolic Embeddings for Open-Set Panoptic Segmentation

    cs.CV 2026-05 unverdicted novelty 6.0

    Hyp2Former learns hierarchical semantic similarities in hyperbolic space among known categories so that unknown objects remain close to higher-level concepts and can be detected reliably.