3DReflecNet is a 22 TB+ dataset of over 120,000 synthetic and 1,000 real objects with millions of multi-view frames for benchmarking 3D reconstruction on reflective, transparent, and low-texture surfaces.
Canonical reference
3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Trans
Canonical reference. 71% of citing Pith papers cite this work as background.
fields: cs.CV 58
representative citing papers
ULF-Loc removes bias from 3DGS landmark features via geometry-weighted fusion and consistency checks, cutting median translation error by 17% while using 1/10 of the training time and 1/6 of the GPU memory of the prior state of the art.
PAGaS refines multi-view stereo depths by optimizing 1DoF Gaussians whose positions and sizes are fixed by back-projected pixel volumes, producing detailed depth maps that outperform reference baselines on 3D reconstruction benchmarks.
TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.
ClipGStream enables scalable flicker-free reconstruction of long dynamic multi-view videos by performing stream optimization at the clip level with clip-independent spatio-temporal fields, residual anchor compensation, and inter-clip inherited anchors.
A novel explicit neural height field method for descent-phase wide-angle imagery achieves greater spatial coverage than multi-view stereo while preserving estimation accuracy on simulated planetary terrains.
DreamStereo uses GAPW, PBDP, and SASI to enable real-time stereo video inpainting at 25 FPS for HD videos, cutting over 70% of redundant computation while maintaining quality.
AnchorSplat uses anchor-aligned 3D Gaussians guided by geometric priors for feed-forward scene reconstruction, achieving SOTA novel view synthesis on ScanNet++ with fewer primitives and better view consistency.
AvatarPointillist uses a Transformer to autoregressively generate adaptive 3D point clouds for photorealistic 4D Gaussian avatars from one image, jointly predicting animation bindings and using a conditioned Gaussian decoder.
Test-time constrained optimization incorporates priors into pre-trained multiview transformers via self-supervised losses and penalty terms to improve 3D reconstruction accuracy.
THOM is a training-free two-stage framework that generates physically plausible hand-object 3D meshes directly from text by combining text-guided Gaussians with contact-aware physics optimization and VLM refinement.
ProDiG progressively transforms aerial Gaussian splats into coherent ground-level 3D reconstructions via diffusion guidance and specialized attention modules.
MoGaF groups Gaussians by motion in 4D splatting representations to enable stable long-term forecasting of dynamic scenes.
PerpetualWonder introduces a closed-loop generative simulator with a unified physical-visual representation for long-horizon action-conditioned 4D scene generation from one image.
AGILE generates complete object meshes via VLM-guided synthesis and tracks poses with anchor-and-track plus contact-aware optimization to achieve robust hand-object reconstruction from video.
ART is a category-agnostic transformer that maps sparse multi-state RGB images to per-part 3D geometry, texture, and articulation parameters via learnable part slots.
RDSplat is the first 3D Gaussian Splatting watermarking method that maintains 0.701 bit accuracy against both 2D and 3D diffusion editing by embedding only in low-frequency primitives selected via FAPS.
A Z-order transformer organizes unstructured Gaussians for sparse attention, enabling feed-forward prediction of high-quality 3D splats with fewer primitives.
HumanSplatHMR jointly refines 3D human poses and learns Gaussian Splatting avatars by backpropagating photometric, segmentation, and depth losses through a differentiable renderer to improve novel-view and novel-pose synthesis from in-the-wild video.
Pruned local linear blendshapes on Gaussians capture pose-dependent appearance changes to deliver high-quality mobile avatars at 120 FPS from multi-view video without pretrained models.
GLMap combines explicit 3D Gaussians with multi-scale language semantics in a dual-modality structure and uses an analytical Gaussian Estimator for incremental map building, improving zero-shot performance on navigation and reasoning tasks.
GenWildSplat is a feed-forward model that reconstructs 3D Gaussians from sparse unposed unconstrained images by predicting depth and poses with learned priors, an appearance adapter, and semantic segmentation for transients.
Color-encoded illumination combined with dynamic Gaussian Splatting enables first-of-a-kind high-speed volumetric reconstruction from unaugmented low-speed multi-view cameras.
Unprojecting latent embeddings via depth maps and recalibrating with cross-view attention improves 3D Gaussian localization for generalizable sparse-view human rendering.
citing papers explorer
Turbo-GS: Accelerating 3D Gaussian Fitting for High-Quality Radiance Fields
Turbo-GS accelerates 3D Gaussian Splatting training via dilated rendering of pixel subsets, convergence-aware Gaussian budget allocation, and combined positional-appearance error densification to enable faster 4K fitting with preserved or improved rendering quality.