hub
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
26 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 3
citation-polarity summary: background 3
citing papers explorer
- Generative Motion In-betweening by Diffusion over Continuous Implicit Representations
A latent diffusion model over continuous implicit neural representations samples INR parameters from sparse keyframes to reconstruct plausible, smooth, and diverse motions while preserving keyframe accuracy.
- SplatWeaver: Learning to Allocate Gaussian Primitives for Generalizable Novel View Synthesis
SplatWeaver uses cardinality Gaussian experts and pixel-level routing to dynamically allocate varying numbers of Gaussian primitives for generalizable novel view synthesis.
- RadTwin: Generalizable Wireless Digital Twin for Dynamic Environments
RadTwin conditions a neural radio-propagation model on scene point clouds via physics-informed sparse attention, achieving 0.846 SSIM and 0.023 LPIPS on dynamic indoor scenes without retraining.
- CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation
CDPR integrates polarization priors into a diffusion-based monocular depth estimator via shared latent space and adaptive gating, outperforming RGB-only methods in challenging scenes.
- Realizing Immersive Volumetric Video: A Multimodal Framework for 6-DoF VR Engagement
The paper presents a multimodal framework, dataset, and reconstruction pipeline to create immersive volumetric videos supporting large 6-DoF audiovisual interaction from real multi-view captures.
- Appearance Decomposition Gaussian Splatting for Multi-Traversal Reconstruction
ADM-GS decomposes static background appearance into traversal-invariant material and traversal-dependent illumination via a frequency-separated neural light field, yielding +0.98 dB PSNR gains and better cross-traversal consistency on Argoverse 2 and Waymo data.
- Immunizing 3D Gaussian Generative Models Against Unauthorized Fine-Tuning via Attribute-Space Traps
GaussLock embeds traps targeting position, scale, rotation, opacity, and color in 3D Gaussian models to degrade unauthorized fine-tunes while preserving authorized performance.
- High Volume Rate 3D Ultrasound Reconstruction with Diffusion Models
Diffusion models reconstruct high-resolution 3D cardiac ultrasound volumes from heavily undersampled elevation planes and outperform traditional interpolation and supervised deep learning baselines.
- AIR: Amortized Image Reconstruction Framework for Self-Supervised Feed-Forward 2D Gaussian Splatting
AIR amortizes 2D Gaussian splatting into a self-supervised feed-forward network via residual stages, explicit stage control, and Predict-Optimize-Distill training.
- LANCE: Locally Adaptive Neural Context Estimation for Overfitted Image Compression
LANCE extends OIC frameworks with a spatial hyperprior and predictive coding scheme, reporting BD-rate gains of 1.4-3% over Cool-Chic 4.0 on Kodak and CLIC.
- Spark3R: Asymmetric Token Reduction Makes Fast Feed-Forward 3D Reconstruction
Spark3R achieves up to 28x speedup on 1000-frame 3D reconstruction inputs by asymmetrically reducing query and key-value tokens in Vision Transformers while keeping competitive quality.
- GA-GS: Generation-Assisted Gaussian Splatting for Static Scene Reconstruction
GA-GS uses motion segmentation, diffusion-based inpainting for pseudo-ground-truth, and per-Gaussian authenticity scalars to achieve SOTA static scene reconstruction from videos with dynamic occlusions.
- TFusionOcc: T-Primitive Based Object-Centric Multi-Sensor Fusion Framework for 3D Occupancy Prediction
TFusionOcc uses a family of Student's t-distribution T-primitives and a T-mixture model for multi-sensor 3D occupancy prediction, reporting state-of-the-art results on nuScenes.
- MapRF: Weakly Supervised Online HD Map Construction via NeRF-Guided Self-Training
MapRF reaches about 75% of fully supervised HD map accuracy on Argoverse 2 and nuScenes by generating view-consistent pseudo labels via a NeRF conditioned on map predictions and refining them with Map-to-Ray Matching in self-training.
- Lumos3D: A Single-Forward Framework for Low-Light 3D Scene Restoration
Lumos3D enables pose-free single-forward restoration of low-light 3D scenes via cross-illumination distillation from a teacher network and a custom Lumos loss on 3D Gaussians.
- Ray-driven Spectral CT Reconstruction Based on Neural Base-Material Fields
A ray-driven neural base-material field model parameterizes attenuation coefficients as continuous implicit functions and uses auto-differentiation to solve spectral CT reconstruction.
- Forecast-aware Gaussian Splatting for Predictive 3D Representation in Language-Guided Pick-and-Place Manipulation
Forecast-GS predicts task-completed 3D states via Gaussian splatting to achieve higher success rates than baselines in real-world language-conditioned manipulation tasks.
- A Geometric Algebra-informed NeRF Framework for Generalizable Wireless Channel Prediction
GAI-NeRF combines geometric algebra attention and an adaptive ray tracing module inside a NeRF model to deliver more accurate and generalizable wireless channel predictions across varied indoor environments.
- Interactive Augmented Reality-enabled Outdoor Scene Visualization For Enhanced Real-time Disaster Response
An AR system using 3D Gaussian Splatting, WIM navigation, and semantic POIs enables real-time disaster scene visualization with high usability and acceptance in preliminary user tests.
- LGDWT-GS: Local and Global Discrete Wavelet-Regularized 3D Gaussian Splatting for Sparse-View Scene Reconstruction
LGDWT-GS adds local and global discrete wavelet regularization to 3D Gaussian Splatting for sharper, more stable sparse-view reconstructions and releases a multispectral plant dataset with standardized benchmarks.
- Bridging the Ex-Vivo to In-Vivo Gap: Synthetic Priors for Monocular Depth Estimation in Specular Surgical Environments
Adapting Depth Anything V2 with DV-LORA bridges the ex-vivo to in-vivo gap in monocular depth estimation for specular surgical environments, achieving SOTA on SCARED and superior results on the new ROCAL-T 90 dataset.
- Improving 3D Gaussian Splatting Compression by Scene-Adaptive Lattice Vector Quantization
Scene-adaptive lattice vector quantization improves rate-distortion performance of 3DGS compression over uniform scalar quantization while adding little overhead and supporting multiple bit rates from one trained model.
- AnimateAnyMesh++: A Flexible 4D Foundation Model for High-Fidelity Text-Driven Mesh Animation
AnimateAnyMesh++ animates arbitrary 3D meshes from text using an expanded 300K-identity DyMesh-XL dataset, a power-law topology-aware DyMeshVAE-Flex, and a variable-length rectified-flow generator to produce semantically accurate, temporally coherent animations in seconds.
- Safe Navigation using Neural Radiance Fields via Reachable Sets
Safe robot navigation in obstacle-rich environments is demonstrated in simulation by representing obstacles with NeRFs and enforcing safety via reachable-set constraints inside a linear-matrix-inequality optimal-control formulation.
- Implicit Neural Representations: A Signal Processing Perspective
INRs parameterize signals as neural networks to enable continuous representations, analytical differentiation, and adaptive approximation spaces that address spectral bias through specialized activations and structured encodings.
- A Tutorial on Learning-Based Radio Map Construction: Data, Paradigms, and Physics-Awareness
A tutorial organizes learning-based radio map construction around data sources, neural architectures, and physics-awareness integration for wireless environments.