The unreasonable effectiveness of deep features as a perceptual metric
7 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (7)
verdicts: UNVERDICTED (7)
citation-role summary: method (1)
citation-polarity summary: use method (1)
citing papers explorer
- Realizing Immersive Volumetric Video: A Multimodal Framework for 6-DoF VR Engagement
  The paper presents a multimodal framework, dataset, and reconstruction pipeline to create immersive volumetric videos supporting large 6-DoF audiovisual interaction from real multi-view captures.
- A Causally Grounded Taxonomy for Image Degradation Robustness Evaluation
  A dual-axis taxonomy classifies image degradations by causal source and perceptual effect, with a severity quantification layer using standard quality metrics, demonstrated via a COCO-based object detector robustness benchmark.
- CLIMB: Controllable Longitudinal Brain Image Generation using Mamba-based Latent Diffusion Model and Gaussian-aligned Autoencoder
  CLIMB generates controllable longitudinal brain MRI images from baseline scans using a Mamba-based latent diffusion model and Gaussian-aligned autoencoder, reporting SSIM 0.9433 on the ADNI dataset of 6306 scans.
- Learnable Multi-level Discrete Wavelet Transforms for 3D Gaussian Splatting Frequency Modulation
  Multi-level DWT frequency modulation in 3DGS reduces Gaussian counts by recursive low-frequency decomposition and a single scaling parameter while preserving rendering quality.
- DecoRec: Decomposed 3D Scene Reconstruction from Single-View Images via Object-Level Diffusion
  DecoRec decomposes single-view 3D scene reconstruction into per-object diffusion reconstructions followed by a differentiable rendering and diffusion-guided merging pipeline.
- Disciplined Diffusion: Text-to-Image Diffusion Model against NSFW Generation
  DDiffusion uses semantic retrieval on prompt embeddings and localized editing inside the diffusion process to suppress NSFW content while avoiding binary allow/block signals.
- Rendering Multi-Human and Multi-Object with 3D Gaussian Splatting
  MM-GS combines per-instance multi-view fusion with scene-level interaction modeling on 3D Gaussians to render high-fidelity multi-human multi-object scenes from sparse views.