{"paper":{"title":"AIM: Adversarial Information Masking for Faithfulness Evaluation of Saliency Maps","license":"http://creativecommons.org/licenses/by-nc-nd/4.0/","headline":"AIM uses adversarial feature replacement to evaluate saliency map faithfulness with less masking bias.","cross_cats":["cs.CV"],"primary_cat":"cs.LG","authors_text":"Chia-Ying Hsieh, Chun-Shu Wei, Hsin-Yuan Fang","submitted_at":"2026-05-16T09:36:58Z","abstract_excerpt":"Post-hoc saliency methods are widely used to interpret deep neural networks, but their faithfulness is difficult to evaluate reliably. Existing evaluations mask features according to saliency-induced feature ordering and measure performance degradation, but this degradation can be confounded by the masking operator: zero masking may create out-of-distribution artifacts, while interpolation-based masking may preserve residual predictive information. We propose Adversarial Information Masking (AIM), a saliency-guided adversarial feature replacement framework for evaluating both saliency-map fait"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Experiments on image, audio, and EEG tasks suggest that AIM reduces masking-induced bias compared with zero and interpolation-based masking, while revealing modality-dependent differences between signed and unsigned attributions.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The adversarial counterpart of the input can be generated such that feature replacement removes predictive information without introducing new confounding artifacts or residual signals that affect the faithfulness measurement.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"AIM is a new saliency-guided adversarial feature replacement method to evaluate faithfulness of saliency maps and reliability of masking operators on image, audio, and EEG tasks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"AIM uses adversarial feature replacement to evaluate saliency map faithfulness with less masking bias.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"8e398304c0617e6acfa117b36375ffb39ae9eabc6965f480be21a32d909f4449"},"source":{"id":"2605.16905","kind":"arxiv","version":1},"verdict":{"id":"7859b897-c28b-48d3-89c8-e33c7e5bf22b","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T20:20:05.738400Z","strongest_claim":"Experiments on image, audio, and EEG tasks suggest that AIM reduces masking-induced bias compared with zero and interpolation-based masking, while revealing modality-dependent differences between signed and unsigned attributions.","one_line_summary":"AIM is a new saliency-guided adversarial feature replacement method to evaluate faithfulness of saliency maps and reliability of masking operators on image, audio, and EEG tasks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The adversarial counterpart of the input can be generated such that feature replacement removes predictive information without introducing new confounding artifacts or residual signals that affect the faithfulness measurement.","pith_extraction_headline":"AIM uses adversarial feature replacement to evaluate saliency map faithfulness with less masking bias."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.16905/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"cited_work_retraction","ran_at":"2026-05-19T20:52:07.546928Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T20:31:40.108583Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-19T20:31:19.106381Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"claim_evidence","ran_at":"2026-05-19T18:41:56.273609Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T18:33:26.352970Z","status":"skipped","version":"1.0.0","findings_count":0}],"snapshot_sha256":"57e21711284585bd058094597733eabbda66c701fa15a67054d52bf8ac51a12a"},"references":{"count":76,"sample":[{"doi":"","year":2014,"title":"Visualizing and understanding convolutional networks","work_id":"80d9e54b-65a8-44c0-9742-a5474f5e7f8f","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2016,"title":"Evaluating the visualization of what a deep neural network has learned.IEEE Transactions on Neural Networks and Learning Systems, 28(11):2660–2673, 2016","work_id":"2543e86a-2c10-4595-a293-9e4da0fa05ec","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"A unified approach to interpreting model predictions","work_id":"bcddc314-fe48-4632-bd5c-a5d30175a310","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2017,"title":"Towards better understanding of gradient-based attribution methods for deep neural networks.arXiv preprint arXiv:1711.06104","work_id":"be0ee150-9b01-455f-8764-dd5b18aee12b","ref_index":4,"cited_arxiv_id":"1711.06104","is_internal_anchor":true},{"doi":"","year":2017,"title":"right to explanation","work_id":"7e743ce2-8e35-43a8-89c7-92e4ee23ea5e","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":76,"snapshot_sha256":"4d978ea3cd6294f26d116f46910302600a34a0b5661a5c694ace3d36bf3bbb0f","internal_anchors":5},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}