Voxify3D generates voxel art from 3D meshes via orthographic pixel supervision, patch-based CLIP alignment, and palette-constrained Gumbel-Softmax quantization, achieving 37.12 CLIP-IQA and 77.90% user preference.
arXiv preprint arXiv:2405.20343 (2024)
9 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV 9
verdicts: UNVERDICTED 9
roles: dataset 1
polarities: use dataset 1
representative citing papers
Materialist performs single-image inverse rendering via neural-initialized progressive differentiable rendering to enable physically consistent material editing, object insertion, relighting, and transparency edits without full scene geometry.
HiFiVe is a training-free framework using an auto-regressive texture refinement pipeline with depth-based warping, multi-view fusion, and symmetry to enhance both texture and geometry fidelity in vehicle generation from 2D priors.
ROAR-3D adds a token-wise view router and dual-stream attention to pretrained single-view 3D generators so they can use arbitrary unposed images for higher-fidelity output.
SegviGen shows pretrained 3D generative models can be repurposed for part segmentation via voxel colorization, beating prior methods by 40% on interactive segmentation and 15% on full segmentation using only 0.32% of labeled data.
MV-SAM3D adds multi-view fusion via multi-diffusion with attention-entropy and visibility weighting plus physics-aware optimization to improve fidelity and physical plausibility in layout-aware 3D generation.
TripoSG generates high-fidelity 3D meshes from input images via a large-scale rectified flow transformer and hybrid-trained 3D VAE on a custom 2-million-sample dataset, claiming state-of-the-art fidelity and generalization.
3DCarGen synthesizes 3D-consistent multi-view images from one input photo, builds a coarse 3D Gaussian representation, then generates arbitrary views and recovers detailed meshes with color-normal optimization for real-world car images.
Dual-stream EEG decoder separates identity and orientation to support 3D reconstruction from neural signals via circular regression and conditioned diffusion.
citing papers explorer
- HiFiVe: High-Fidelity Vehicle Generation Leveraging Auto-Regressive 2D Generative Priors
  HiFiVe is a training-free framework using an auto-regressive texture refinement pipeline with depth-based warping, multi-view fusion, and symmetry to enhance both texture and geometry fidelity in vehicle generation from 2D priors.
- ROAR-3D: Routing Arbitrary Views for High-Fidelity 3D Generation
  ROAR-3D adds a token-wise view router and dual-stream attention to pretrained single-view 3D generators so they can use arbitrary unposed images for higher-fidelity output.
- SegviGen: Repurposing 3D Generative Model for Part Segmentation
  SegviGen shows pretrained 3D generative models can be repurposed for part segmentation via voxel colorization, beating prior methods by 40% on interactive segmentation and 15% on full segmentation using only 0.32% of labeled data.
- MV-SAM3D: Adaptive Multi-View Fusion for Layout-Aware 3D Generation
  MV-SAM3D adds multi-view fusion via multi-diffusion with attention-entropy and visibility weighting plus physics-aware optimization to improve fidelity and physical plausibility in layout-aware 3D generation.
- 3DCarGen: Scalable 3D Car Generation via 3D-consistent Multi-view Synthesis
  3DCarGen synthesizes 3D-consistent multi-view images from one input photo, builds a coarse 3D Gaussian representation, then generates arbitrary views and recovers detailed meshes with color-normal optimization for real-world car images.
- Dual-Stream EEG Decoding for 3D Visual Perception
  Dual-stream EEG decoder separates identity and orientation to support 3D reconstruction from neural signals via circular regression and conditioned diffusion.