{"paper":{"title":"DreamEdit3D: Personalization of Multi-View Diffusion Models for 3D Editing","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Personalizing multi-view diffusion models enables text-guided 3D editing with object-level control and preserved consistency.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Jinxin Ai, Matthias Nießner, Ziya Erkoç","submitted_at":"2026-05-16T13:21:22Z","abstract_excerpt":"While 2D diffusion models have achieved remarkable success in identity-preserving personalization, extending this capability to 3D assets remains a significant challenge due to the complexities of multi-view consistency and spatial control. Inspired by these 2D advancements, we present a novel personalization method for text-guided 3D editing that enables compositional, object-level control through natural language. Given a 3D input, we render orthogonal views and extract object-level segmentation masks to isolate semantic components. 
We then learn distinct token embeddings for each component "},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Extensive evaluations across diverse editing scenarios demonstrate that our method successfully transfers the flexibility of 2D personalization to 3D, achieving state-of-the-art edit faithfulness and identity preservation compared to existing baselines.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The approach assumes that rendering orthogonal views and extracting object-level segmentation masks will allow learning of distinct, composable token embeddings that preserve multi-view consistency when combined with editing prompts (stated in the abstract description of the input processing and two-phase optimization).","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"DreamEdit3D learns separate token embeddings for segmented object components via two-phase multi-view optimization to enable text-guided 3D editing with consistent image generation and mesh reconstruction.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Personalizing multi-view diffusion models enables text-guided 3D editing with object-level control and preserved consistency.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"783ed69174db829ca1bea16a8d4122d83c8a1b6f47cf9296e5eac67cf85bee83"},"source":{"id":"2605.16990","kind":"arxiv","version":1},"verdict":{"id":"17c95cdb-2d62-405d-8cf6-72f8fcc2194e","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T20:24:04.921343Z","strongest_claim":"Extensive evaluations across diverse editing scenarios demonstrate that our method successfully transfers the flexibility of 2D 
personalization to 3D, achieving state-of-the-art edit faithfulness and identity preservation compared to existing baselines.","one_line_summary":"DreamEdit3D learns separate token embeddings for segmented object components via two-phase multi-view optimization to enable text-guided 3D editing with consistent image generation and mesh reconstruction.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The approach assumes that rendering orthogonal views and extracting object-level segmentation masks will allow learning of distinct, composable token embeddings that preserve multi-view consistency when combined with editing prompts (stated in the abstract description of the input processing and two-phase optimization).","pith_extraction_headline":"Personalizing multi-view diffusion models enables text-guided 3D editing with object-level control and preserved consistency."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.16990/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"doi_compliance","ran_at":"2026-05-19T20:31:33.526693Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-19T20:31:19.032114Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"citation_quote_validity","ran_at":"2026-05-19T19:50:03.341353Z","status":"skipped","version":"0.1.0","findings_count":0},{"name":"cited_work_retraction","ran_at":"2026-05-19T19:23:44.291164Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"claim_evidence","ran_at":"2026-05-19T18:41:56.207514Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T18:33:26.296066Z","status":"skipped","version":"1.0.0","findings_count":0}],"snapshot_sha256":"5500fbc624935198dcf6643432d3e8659cc99d55986c70f526b865ddd6ead77c"},"references":{"count":52,"sample":[{"doi":"10.1145/3610548.3618154"
,"year":2023,"title":"In: SIGGRAPH Asia 2023 Conference Papers","work_id":"e9011604-8e04-4abb-8699-4ea5e5063195","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"arXiv preprint arXiv:2408.07009 (2024)","work_id":"a1dd317f-8300-4a79-a1d0-92ddd93fa983","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"Instant3dit: Multiview inpainting for fast editing of 3d objects","work_id":"334cb1af-d967-4b42-8a43-91d705b2480a","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Betker, J., Goh, G., Jing, L., Tim Brooks, Wang, J., Li, L., Long Ouyang, Juntang Zhuang, Joyce Lee, Yufei Guo, Wesam Manassra, Prafulla Dhariwal, Casey Chu, Yunxin Jiao, Ramesh, A.: Improving image generati","work_id":"dc86f290-6860-40cf-a624-8dc47ef83d80","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2022,"title":"In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition","work_id":"3abb7fd7-a4d5-4eec-a516-50482221a431","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":52,"snapshot_sha256":"2cdab04c5b2f980fcaa4eabb9b17030964c0d2a067b70a5579e437a16b0a5bf7","internal_anchors":14},"formal_canon":{"evidence_count":1,"snapshot_sha256":"19c9629b3e309564290db3bb6e7b9889a9f1c661d62965a2ddae0d50a0a8e56a"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}