A single-stage pixel-space diffusion model that generates 3D Gaussian Splats directly, bypassing latent compression and adding geometric supervision to outperform prior multi-stage methods.
MVDiffusion: Enabling holistic multi-view image generation with correspondence-aware diffusion. arXiv preprint arXiv:2307.01097
9 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (9)
representative citing papers
TextHOI-3D generates text-conditioned 3D hand-object meshes using a VQ token space and CLIP-conditioned autoregressive multi-view prediction followed by joint mesh optimization, reporting large reductions in object CD and penetration volume versus single-view baselines on HO3D-derived data.
Viewpoint tokens learned on a mixed 3D-rendered and photorealistic dataset enable precise camera control in text-to-image generation while factorizing geometry from appearance and transferring to unseen object categories.
BoostDream refines coarse feed-forward text-to-3D assets via 3D distillation, multi-view SDS loss from a 2D diffusion model, and prompt-consistent normal maps to produce higher-quality results more efficiently than standard SDS.
SyncDreamer produces multiview-consistent images from a single input image by jointly modeling their distribution and synchronizing intermediate diffusion states via 3D-aware attention.
MVDream is a multi-view diffusion model that functions as a generalizable 3D prior, enabling more consistent text-to-3D generation and few-shot 3D concept learning from 2D examples.
Restore3D restores shape and texture of broken 3D objects via multi-view image refinement with a Mask Self-Perceiver and coarse-to-fine mesh reconstruction, outperforming baselines on synthetic and real benchmarks.
Native3D introduces a direct 3D scene generation method using unified mesh-texture representation and 3D REPA Loss for semantic alignment, claimed to outperform prior 2D-dependent approaches.
DecoRec decomposes single-view 3D scene reconstruction into per-object diffusion reconstructions followed by a differentiable rendering and diffusion-guided merging pipeline.