LangScene-X: Reconstruct generalizable 3D language-embedded scenes with TriMap video diffusion. arXiv preprint arXiv:2507.02813
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Verdicts: unverdicted 3
Roles: background 1
Polarities: background 1
Citing papers
- VistaBot: View-Robust Robot Manipulation via Spatiotemporal-Aware View Synthesis
  VistaBot integrates 4D geometry estimation and spatiotemporal view synthesis into action policies to improve cross-view generalization by 2.6-2.8x on a new VGS metric in simulation and real tasks.
- NG-GS: NeRF-Guided 3D Gaussian Splatting Segmentation
  NG-GS uses NeRF guidance and RBF interpolation on 3DGS to produce smoother, higher-quality object segmentation boundaries.
- FLEG: Feed-Forward Language Embedded Gaussian Splatting from Any Views via Compact Semantic Representation
  FLEG reconstructs language-embedded 3D Gaussians from arbitrary input views using a dual-branch distillation framework and a sparse set of semantic Gaussians that requires only 5% of prior embeddings.