{"paper":{"title":"ClickRemoval: An Interactive Open-Source Tool for Object Removal in Diffusion Models","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Clicks alone let users remove objects from images in pretrained diffusion models","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Ledun Zhang, Xinying Yao, Xufei Zhuang, Yatu Ji","submitted_at":"2026-05-14T06:56:53Z","abstract_excerpt":"Existing object removal tools often rely on manual masks or text prompts, making precise removal difficult for non-expert users in complex scenes and often leading to incomplete removal or unnatural background completion. To address this issue, we present ClickRemoval, an open-source interactive object removal tool built on pretrained Stable Diffusion models and driven solely by user clicks. Without additional training, hand-drawn masks, or text descriptions, ClickRemoval localizes target objects and restores the background through self-attention modulation during denoising. Experiments show t"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"ClickRemoval localizes target objects and restores the background through self-attention modulation during denoising without additional training, hand-drawn masks, or text descriptions, achieving competitive results across quantitative metrics and user studies.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That user clicks alone, combined with self-attention modulation in a pretrained Stable Diffusion model, are sufficient to accurately localize objects and produce natural background completions in complex scenes.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"ClickRemoval delivers click-driven object removal and background restoration in diffusion models through self-attention modulation without additional 
training or inputs.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Clicks alone let users remove objects from images in pretrained diffusion models","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"870d9af08fee772987712d274b31fdafe7ba6f975cf91bb75c7a8b5064492bbc"},"source":{"id":"2605.14461","kind":"arxiv","version":1},"verdict":{"id":"deeaec04-eef5-47b2-bbd3-1e3816176e42","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T02:44:52.808238Z","strongest_claim":"ClickRemoval localizes target objects and restores the background through self-attention modulation during denoising without additional training, hand-drawn masks, or text descriptions, achieving competitive results across quantitative metrics and user studies.","one_line_summary":"ClickRemoval delivers click-driven object removal and background restoration in diffusion models through self-attention modulation without additional training or inputs.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That user clicks alone, combined with self-attention modulation in a pretrained Stable Diffusion model, are sufficient to accurately localize objects and produce natural background completions in complex scenes.","pith_extraction_headline":"Clicks alone let users remove objects from images in pretrained diffusion models"},"references":{"count":15,"sample":[{"doi":"","year":2024,"title":"Aditya Chandrasekar, Goirik Chakrabarty, Jai Bardhan, Ramya Hebbalaguppe, and Prathosh AP. 2024. Remove: A reference-free metric for object erasure. In Proceedings of the IEEE/CVF Conference on Compu","work_id":"142283c7-6131-42ba-bd0c-904cc7032508","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Nima Fathi, Amar Kumar, and Tal Arbel. 2025. 
Aura: A multi-modal medical agent for understanding, reasoning and annotation. In International Workshop on Agentic AI for Medicine. 105–114","work_id":"6d1b0e57-1367-48e5-a973-7ce0dda59672","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Tariq Berrada Ifriqi, Adriana Romero-Soriano, Michal Drozdzal, Jakob Verbeek, and Karteek Alahari. 2025. Entropy Rectifying Guidance for Diffusion and Flow Models. In NeurIPS 2025-Thirty-ninth Conferen","work_id":"5fa38a74-a531-4809-8028-57a347b870a8","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, and Qiang Xu. 2024. Brushnet: A plug-and-play image inpainting model with decomposed dual-branch diffusion. In European Conference on Computer V","work_id":"84fd47ed-f9ff-462b-b1ef-3e85422662af","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Markus Karmann and Onay Urfalioglu. 2025. Repurposing stable diffusion attention for training-free unsupervised interactive segmentation. In Proceedings of the Computer Vision and Pattern Recognition C","work_id":"9b4198be-ddba-4818-bef6-7dd84e3edb02","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":15,"snapshot_sha256":"a7125be0ef49186b070aa97230f10c53d7ea48d1fccfdb8e6413570f5e4c34dc","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}