{"paper":{"title":"ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Vision-language models can unlearn specific concepts by decomposing images into sparse semantic combinations and suppressing only the targets.","cross_cats":["cs.AI","cs.LG"],"primary_cat":"cs.CV","authors_text":"Jing Lin, Junhao Dong, Li Xu, Piotr Koniusz, Shen Lin","submitted_at":"2026-05-14T03:22:12Z","abstract_excerpt":"Machine unlearning in Vision-Language Models (VLMs) is typically performed at the image or instance level, making it difficult to precisely remove target knowledge without affecting unrelated semantics. This issue is especially pronounced since a single image often contains multiple entangled concepts, including both target concepts to be forgotten and contextual information that should be preserved. In this paper, we propose an interpretable concept-level unlearning framework for VLMs, which constructs a compact task-specific concept vocabulary from the forgetting set using a multimodal large"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Based on this decomposition, our method formulates unlearning as concept-level optimization, where target concepts are selectively suppressed while intra-instance non-target semantics and global cross-modal knowledge are preserved. Extensive experiments across both in-domain and out-of-domain forgetting settings demonstrate that our method enables more comprehensive target forgetting, better preserves non-target knowledge within the same image, and maintains competitive model utility compared with existing VLM unlearning methods.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Visual representations can be decomposed into sparse, nonnegative combinations of semantic concepts from a compact task-specific vocabulary, providing an explicit interface for fine-grained knowledge manipulation without affecting unrelated semantics.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"ICED decomposes visual features into interpretable concepts to enable selective unlearning of target knowledge in VLMs while preserving non-target semantics and model utility.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Vision-language models can unlearn specific concepts by decomposing images into sparse semantic combinations and suppressing only the targets.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"db919c7e5a568620f817eb08713d7c12cba22118ba29b425831a2cb1ceaf87ca"},"source":{"id":"2605.14309","kind":"arxiv","version":1},"verdict":{"id":"23c34ff8-855c-49c4-8a73-2293c2ef291b","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T02:32:10.764681Z","strongest_claim":"Based on this decomposition, our method formulates unlearning as concept-level optimization, where target concepts are selectively suppressed while intra-instance non-target semantics and global cross-modal knowledge are preserved. Extensive experiments across both in-domain and out-of-domain forgetting settings demonstrate that our method enables more comprehensive target forgetting, better preserves non-target knowledge within the same image, and maintains competitive model utility compared with existing VLM unlearning methods.","one_line_summary":"ICED decomposes visual features into interpretable concepts to enable selective unlearning of target knowledge in VLMs while preserving non-target semantics and model utility.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Visual representations can be decomposed into sparse, nonnegative combinations of semantic concepts from a compact task-specific vocabulary, providing an explicit interface for fine-grained knowledge manipulation without affecting unrelated semantics.","pith_extraction_headline":"Vision-language models can unlearn specific concepts by decomposing images into sparse semantic combinations and suppressing only the targets."},"references":{"count":36,"sample":[{"doi":"","year":2021,"title":"Learning transferable visual models from natural language supervi- sion,","work_id":"07a68397-0cae-4b42-869b-feb434d4e372","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Trustworthy ai: From principles to practices,","work_id":"9cd10867-cc25-4e72-9c4b-3fda01197bb4","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"Allies teach better than enemies: Inverse adversaries for robust knowledge distillation,","work_id":"99e74045-d1e0-4c83-bb1b-2126f4f62a06","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"Tug-of-war no more: Harmonizing accuracy and robustness in vision-language models via stability-aware task vector merging,","work_id":"60817445-b214-4c28-94e2-95d4fe5a95af","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Can bad teaching induce forgetting? unlearning in deep networks using an incompetent teacher,","work_id":"27395386-03c5-4bf5-a86e-72ffb49e0842","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":36,"snapshot_sha256":"16be0492b35438dfacab3605b2dddac45c89e69c539b310d148263b0ac3a14b2","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}