{"paper":{"title":"Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"Targeted conditioning on Stable Diffusion lets a model turn one image into geometrically consistent multi-view outputs.","cross_cats":["cs.GR"],"primary_cat":"cs.CV","authors_text":"Chao Xu, Chong Zeng, Hansheng Chen, Hao Su, Linghao Chen, Minghua Liu, Ruoxi Shi, Xinyue Wei, Zhuoyang Zhang","submitted_at":"2023-10-23T17:18:59Z","abstract_excerpt":"We report Zero123++, an image-conditioned diffusion model for generating 3D-consistent multi-view images from a single input view. To take full advantage of pretrained 2D generative priors, we develop various conditioning and training schemes to minimize the effort of finetuning from off-the-shelf image diffusion models such as Stable Diffusion. Zero123++ excels in producing high-quality, consistent multi-view images from a single image, overcoming common issues like texture degradation and geometric misalignment. 
Furthermore, we showcase the feasibility of training a ControlNet on Zero123++ f"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Zero123++ excels in producing high-quality, consistent multi-view images from a single image, overcoming common issues like texture degradation and geometric misalignment.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the proposed conditioning and training schemes applied to off-the-shelf Stable Diffusion will reliably produce 3D geometric and texture consistency without new failure modes not captured in the reported examples.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Zero123++ produces high-quality 3D-consistent multi-view images from a single input by fine-tuning Stable Diffusion with targeted conditioning and training methods.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Targeted conditioning on Stable Diffusion lets a model turn one image into geometrically consistent multi-view outputs.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"27309c20f661e36843a3a6d0bd4f0488b55ad7edcb966c6323c76780c4d6b861"},"source":{"id":"2310.15110","kind":"arxiv","version":1},"verdict":{"id":"6ccedf79-be9d-4db0-a380-f29a2dd2850f","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T22:02:36.780458Z","strongest_claim":"Zero123++ excels in producing high-quality, consistent multi-view images from a single image, overcoming common issues like texture degradation and geometric misalignment.","one_line_summary":"Zero123++ produces high-quality 3D-consistent multi-view images from a single input by fine-tuning Stable Diffusion with targeted conditioning and 
training methods.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the proposed conditioning and training schemes applied to off-the-shelf Stable Diffusion will reliably produce 3D geometric and texture consistency without new failure modes not captured in the reported examples.","pith_extraction_headline":"Targeted conditioning on Stable Diffusion lets a model turn one image into geometrically consistent multi-view outputs."},"references":{"count":26,"sample":[{"doi":"","year":2015,"title":"ShapeNet: An Information-Rich 3D Model Repository","work_id":"b2ac5b60-daa9-435b-9369-12271e126edd","ref_index":1,"cited_arxiv_id":"1512.03012","is_internal_anchor":true},{"doi":"","year":2023,"title":"On the importance of noise scheduling for diffusion models","work_id":"1783cbc7-505c-4ebc-82f3-86877ef131b3","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Objaverse-XL: A Universe of 10M+ 3D Objects","work_id":"1c5475ad-d1ec-4de1-8670-b8cd5a4c85d3","ref_index":3,"cited_arxiv_id":"2307.05663","is_internal_anchor":true},{"doi":"","year":2023,"title":"Objaverse: A universe of annotated 3d objects","work_id":"d953dbea-284d-4b7a-a33b-2454de5cc2e6","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Efficient diffusion training via min-snr weighting strategy","work_id":"f908ce8d-6164-45c4-afa4-11df4ae81e42","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":26,"snapshot_sha256":"e3d112cbf41fd9359e870781ecffd0218dafb2e0d8a945725ff827a0bbf9648e","internal_anchors":12},"formal_canon":{"evidence_count":2,"snapshot_sha256":"fdadcfd782e68c99c15a1ec9bee7e9910e1b542ce7634f07aaa399adaa604555"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}