{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2023:FMXLIZQA2ART2NQOKAFIIS2ZCQ","short_pith_number":"pith:FMXLIZQA","schema_version":"1.0","canonical_sha256":"2b2eb46600d0233d360e500a844b5914142a957fcb9bd7bee406437a9fd86469","source":{"kind":"arxiv","id":"2308.16512","version":4},"attestation_state":"computed","paper":{"title":"MVDream: Multi-view Diffusion for 3D Generation","license":"http://creativecommons.org/licenses/by/4.0/","headline":"A multi-view diffusion model trained on both 2D and 3D data acts as a generalizable 3D prior that improves consistency in text-to-3D generation.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Jianglong Ye, Kejie Li, Mai Long, Peng Wang, Xiao Yang, Yichun Shi","submitted_at":"2023-08-31T07:49:06Z","abstract_excerpt":"We introduce MVDream, a diffusion model that is able to generate consistent multi-view images from a given text prompt. Learning from both 2D and 3D data, a multi-view diffusion model can achieve the generalizability of 2D diffusion models and the consistency of 3D renderings. We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior agnostic to 3D representations. It can be applied to 3D generation via Score Distillation Sampling, significantly enhancing the consistency and stability of existing 2D-lifting methods. It can also learn new concepts from a few 2"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2308.16512","kind":"arxiv","version":4},"metadata":{"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.CV","submitted_at":"2023-08-31T07:49:06Z","cross_cats_sorted":[],"title_canon_sha256":"27270046251efa8990a7ca4ed2d33c762601425c210d993eb51c6e174db65e07","abstract_canon_sha256":"86d2bbfa11e682f3b0b6d39426f8406902084f3e197a7e80234ff1f0c8640bde"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:53.011208Z","signature_b64":"icATlX3ryduVpX3fyvxz30MrCzJWvfjHksSUJ6zj9YIvzaiNi2LlXk3YLHTPeasyHc24dAZdPrSE645rZSIdCg==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"2b2eb46600d0233d360e500a844b5914142a957fcb9bd7bee406437a9fd86469","last_reissued_at":"2026-05-17T23:38:53.010570Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:53.010570Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"MVDream: Multi-view Diffusion for 3D Generation","license":"http://creativecommons.org/licenses/by/4.0/","headline":"A multi-view diffusion model trained on both 2D and 3D data acts as a generalizable 3D prior that improves consistency in text-to-3D generation.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Jianglong Ye, Kejie Li, Mai Long, Peng Wang, Xiao Yang, Yichun Shi","submitted_at":"2023-08-31T07:49:06Z","abstract_excerpt":"We introduce MVDream, a diffusion model that is able to generate consistent multi-view images from a given text prompt. Learning from both 2D and 3D data, a multi-view diffusion model can achieve the generalizability of 2D diffusion models and the consistency of 3D renderings. We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior agnostic to 3D representations. It can be applied to 3D generation via Score Distillation Sampling, significantly enhancing the consistency and stability of existing 2D-lifting methods. It can also learn new concepts from a few 2"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior agnostic to 3D representations. It can be applied to 3D generation via Score Distillation Sampling, significantly enhancing the consistency and stability of existing 2D-lifting methods.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That joint training on 2D and 3D data produces a prior that remains generalizable to novel text prompts and 3D shapes without overfitting to the specific 3D renderings used or sacrificing single-view quality.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"MVDream is a multi-view diffusion model that functions as a generalizable 3D prior, enabling more consistent text-to-3D generation and few-shot 3D concept learning from 2D examples.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A multi-view diffusion model trained on both 2D and 3D data acts as a generalizable 3D prior that improves consistency in text-to-3D generation.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"4d472f64f3c14e59151f739b6333b0fef61973fa7492b6460feae3086ce71405"},"source":{"id":"2308.16512","kind":"arxiv","version":4},"verdict":{"id":"0a18a511-cce3-431b-b005-758833fd1b28","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T08:31:50.509985Z","strongest_claim":"We demonstrate that such a multi-view diffusion model is implicitly a generalizable 3D prior agnostic to 3D representations. It can be applied to 3D generation via Score Distillation Sampling, significantly enhancing the consistency and stability of existing 2D-lifting methods.","one_line_summary":"MVDream is a multi-view diffusion model that functions as a generalizable 3D prior, enabling more consistent text-to-3D generation and few-shot 3D concept learning from 2D examples.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That joint training on 2D and 3D data produces a prior that remains generalizable to novel text prompts and 3D shapes without overfitting to the specific 3D renderings used or sacrificing single-view quality.","pith_extraction_headline":"A multi-view diffusion model trained on both 2D and 3D data acts as a generalizable 3D prior that improves consistency in text-to-3D generation."},"references":{"count":161,"sample":[{"doi":"","year":2023,"title":"https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0","work_id":"bca32d14-ca0e-45d5-aabb-29735f777c42","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"https://sketchfab.com/3d-models/popular","work_id":"b2699ee7-f789-4b50-968f-62e0fbf6f7a7","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"https://huggingface.co/DeepFloyd","work_id":"0804d3f6-1aaa-4ec4-a0fc-cbc41b1f14d6","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"https://lumalabs.ai/dashboard/imagine","work_id":"8ceb3039-f241-407f-b9d8-ce4b2e13ee17","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"https://huggingface.co/spaces/lambdalabs/stable-diffusion-image-variations","work_id":"5faab264-b3c9-45c5-80eb-71dc59ff0c18","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":161,"snapshot_sha256":"e6b46909af3c518bf1f71bab7612cb73cd6eb27d5d6f6c85a75393bc270793e4","internal_anchors":10},"formal_canon":{"evidence_count":2,"snapshot_sha256":"9909097488ee0cf5b31d5b3c6738472d570a3600b5c45167e05e0998051ec760"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2308.16512","created_at":"2026-05-17T23:38:53.010669+00:00"},{"alias_kind":"arxiv_version","alias_value":"2308.16512v4","created_at":"2026-05-17T23:38:53.010669+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2308.16512","created_at":"2026-05-17T23:38:53.010669+00:00"},{"alias_kind":"pith_short_12","alias_value":"FMXLIZQA2ART","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_16","alias_value":"FMXLIZQA2ART2NQO","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_8","alias_value":"FMXLIZQA","created_at":"2026-05-18T12:33:33.725879+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":44,"internal_anchor_count":44,"sample":[{"citing_arxiv_id":"2605.21489","citing_title":"Variance Reduction for Expectations with Diffusion Teachers","ref_index":70,"is_internal_anchor":true},{"citing_arxiv_id":"2401.16764","citing_title":"BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion","ref_index":31,"is_internal_anchor":true},{"citing_arxiv_id":"2412.13111","citing_title":"Motion-2-To-3: Leveraging 2D Motion Data for 3D Motion Generations","ref_index":64,"is_internal_anchor":true},{"citing_arxiv_id":"2504.02316","citing_title":"ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2605.21472","citing_title":"Stream3D: Sequential Multi-View 3D Generation via Evidential Memory","ref_index":57,"is_internal_anchor":true},{"citing_arxiv_id":"2605.21489","citing_title":"Variance Reduction for Expectations with Diffusion Teachers","ref_index":70,"is_internal_anchor":true},{"citing_arxiv_id":"2605.18010","citing_title":"Functionalization via Structure Completion and Motion Rectification","ref_index":49,"is_internal_anchor":true},{"citing_arxiv_id":"2605.18132","citing_title":"Who Generated This 3D Asset? Learning Source Attribution for Generative 3D Models","ref_index":51,"is_internal_anchor":true},{"citing_arxiv_id":"2605.18052","citing_title":"Efficient 3D Content Reconstruction and Generation","ref_index":222,"is_internal_anchor":true},{"citing_arxiv_id":"2605.18365","citing_title":"GeoFlow: Enforcing Implicit Geometric Consistency in Video Generation","ref_index":59,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16807","citing_title":"DecoRec: Decomposed 3D Scene Reconstruction from Single-View Images via Object-Level Diffusion","ref_index":71,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16937","citing_title":"DEVIS-GRPO: Unleashing GRPO on Dynamic Extreme View Synthesis","ref_index":25,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16990","citing_title":"DreamEdit3D: Personalization of Multi-View Diffusion Models for 3D Editing","ref_index":39,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16873","citing_title":"HAD: Hallucination-Aware Diffusion Priors for 3D Reconstruction","ref_index":36,"is_internal_anchor":true},{"citing_arxiv_id":"2509.07435","citing_title":"DreamLifting: A Plug-in Module Lifting MV Diffusion Models for 3D Asset Generation","ref_index":19,"is_internal_anchor":true},{"citing_arxiv_id":"2310.15110","citing_title":"Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model","ref_index":20,"is_internal_anchor":true},{"citing_arxiv_id":"2502.06608","citing_title":"TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models","ref_index":187,"is_internal_anchor":true},{"citing_arxiv_id":"2601.00285","citing_title":"SV-GS: Sparse View 4D Reconstruction with Skeleton-Driven Gaussian Splatting","ref_index":45,"is_internal_anchor":true},{"citing_arxiv_id":"2403.02151","citing_title":"TripoSR: Fast 3D Object Reconstruction from a Single Image","ref_index":21,"is_internal_anchor":true},{"citing_arxiv_id":"2601.09211","citing_title":"Affostruction: 3D Affordance Grounding with Generative Reconstruction","ref_index":37,"is_internal_anchor":true},{"citing_arxiv_id":"2601.11194","citing_title":"ATATA: One Algorithm to Align Them All","ref_index":49,"is_internal_anchor":true},{"citing_arxiv_id":"2309.03453","citing_title":"SyncDreamer: Generating Multiview-consistent Images from a Single-view Image","ref_index":21,"is_internal_anchor":true},{"citing_arxiv_id":"2309.16653","citing_title":"DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation","ref_index":141,"is_internal_anchor":true},{"citing_arxiv_id":"2603.00918","citing_title":"Improving Text-to-Image Generation with Intrinsic Self-Confidence Rewards","ref_index":61,"is_internal_anchor":true},{"citing_arxiv_id":"2605.13838","citing_title":"R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow","ref_index":108,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ","json":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ.json","graph_json":"https://pith.science/api/pith-number/FMXLIZQA2ART2NQOKAFIIS2ZCQ/graph.json","events_json":"https://pith.science/api/pith-number/FMXLIZQA2ART2NQOKAFIIS2ZCQ/events.json","paper":"https://pith.science/paper/FMXLIZQA"},"agent_actions":{"view_html":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ","download_json":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ.json","view_paper":"https://pith.science/paper/FMXLIZQA","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2308.16512&json=true","fetch_graph":"https://pith.science/api/pith-number/FMXLIZQA2ART2NQOKAFIIS2ZCQ/graph.json","fetch_events":"https://pith.science/api/pith-number/FMXLIZQA2ART2NQOKAFIIS2ZCQ/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ/action/timestamp_anchor","attest_storage":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ/action/storage_attestation","attest_author":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ/action/author_attestation","sign_citation":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ/action/citation_signature","submit_replication":"https://pith.science/pith/FMXLIZQA2ART2NQOKAFIIS2ZCQ/action/replication_record"}},"created_at":"2026-05-17T23:38:53.010669+00:00","updated_at":"2026-05-17T23:38:53.010669+00:00"}