Architecture-Aware Explanation Auditing for Industrial Visual Inspection
Pith reviewed 2026-05-20 20:58 UTC · model grok-4.3
The pith
Faithfulness of heatmap explanations is bounded by structural match to the model's native decision readout.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under a three-seed zero-fill perturbation protocol on the WM-811K dataset, ViT-Tiny with Attention Rollout attains a Deletion AUC of 0.211 (lower is better), while Swin-Tiny, ResNet18+CBAM, and DenseNet121, each paired with Grad-CAM, range from 0.432 to 0.525. Swin-Tiny's spatial feature-map hierarchy makes it compatible with Grad-CAM even though it is a Transformer, which indicates that readout structure rather than architecture family controls the gap. The model-agnostic RISE control compresses all models to a Deletion AUC of roughly 0.1, better than every native method, so native readout is a compatibility principle rather than an optimality guarantee. The family ordering reverses under blur-fill perturbation, so faithfulness is a joint property of the (model, explainer, perturbation operator) triple.
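To make the metric concrete, here is a minimal sketch of a Deletion AUC computation with a constant fill value. It is not the paper's implementation: the function name `deletion_auc`, the `steps` parameter, the toy classifier, and the 4x4 input are all assumptions chosen for illustration. Pixels are removed in order of decreasing attribution, and the target-class score is integrated over the fraction of pixels removed, so a heatmap that points at the pixels the model actually uses produces a faster score drop and a lower area.

```python
import numpy as np

def deletion_auc(model_fn, image, heatmap, target, steps=20, fill_value=0.0):
    """Area under the target-class score curve as pixels are deleted.

    model_fn: callable mapping an image array to a 1-D array of class scores
              (hypothetical interface, assumed for this sketch).
    heatmap:  2-D attribution map with the same height and width as image.
    """
    h, w = heatmap.shape
    n = h * w
    # Most important pixels first; a stable sort keeps ties in index order.
    order = np.argsort(-heatmap.ravel(), kind="stable")
    perturbed = image.astype(float).copy()
    scores = [model_fn(perturbed)[target]]
    bounds = np.linspace(0, n, steps + 1).astype(int)
    for start, stop in zip(bounds[:-1], bounds[1:]):
        rows, cols = np.unravel_index(order[start:stop], (h, w))
        perturbed[rows, cols] = fill_value  # zero-fill when fill_value == 0
        scores.append(model_fn(perturbed)[target])
    scores = np.asarray(scores, dtype=float)
    fractions = bounds / n
    # Trapezoidal rule over the fraction of pixels removed.
    return float(np.sum((scores[1:] + scores[:-1]) / 2.0 * np.diff(fractions)))

def toy_model(img):
    # Hypothetical classifier: the class-0 score is the mean of the top-left 2x2 patch.
    return np.array([img[:2, :2].mean()])

image = np.ones((4, 4))
aligned = np.zeros((4, 4))
aligned[:2, :2] = 1.0      # heatmap that highlights the pixels the model uses
misaligned = np.zeros((4, 4))
misaligned[2:, 2:] = 1.0   # heatmap that highlights irrelevant pixels

auc_aligned = deletion_auc(toy_model, image, aligned, target=0, steps=4)
auc_misaligned = deletion_auc(toy_model, image, misaligned, target=0, steps=4)
```

In this toy setup the aligned heatmap yields the smaller area, which is the sense in which a lower Deletion AUC indicates a more faithful explanation.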
What carries the argument
The native-readout hypothesis, which states that perturbation-based faithfulness of an explanation is bounded by its structural distance from the model's native decision mechanism.
If this is right
- Explanation methods should be selected or co-designed according to a model's specific readout structure rather than its broad architecture family.
- Deployed heatmaps for industrial inspection should be accompanied by quantitative faithfulness scores such as Deletion AUC.
- Faithfulness rankings cannot be trusted without testing multiple perturbation operators.
- Audit results on one dataset or task do not automatically generalize to others.
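The dependence on the perturbation operator can be illustrated with a small sketch of two fill operators. Everything here is an assumption for illustration: the names `box_blur` and `perturb` are hypothetical, and the mean filter is only a stand-in, since the page does not say which blur kernel the paper uses. The point is that the same deletion mask produces different perturbed inputs under each operator, so the same heatmap can receive different faithfulness scores.

```python
import numpy as np

def box_blur(image, k=3):
    """Mean filter with edge padding (k must be odd); a stand-in blur, not the paper's."""
    pad = k // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    # Sum the k*k shifted copies of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def perturb(image, mask, operator="zero"):
    """Replace pixels where mask is True using the chosen fill operator."""
    image = image.astype(float)
    if operator == "zero":
        fill = np.zeros_like(image)
    elif operator == "blur":
        fill = box_blur(image)
    else:
        raise ValueError(f"unknown operator: {operator}")
    return np.where(mask, fill, image)

# Hypothetical 2-D input with one bright pixel, and a mask that deletes it.
image = np.zeros((5, 5))
image[2, 2] = 9.0
mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True

zero_filled = perturb(image, mask, "zero")
blur_filled = perturb(image, mask, "blur")
```

Zero-fill erases the deleted pixel entirely, while blur-fill leaves a reduced value derived from the neighbourhood, so an audit that reports both operators exposes rankings that hold only under one of them.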
Where Pith is reading between the lines
- The same audit could be applied to medical or autonomous driving models to check whether default explainers align with internal decision paths.
- Developers might create readout-adaptive explainers that switch mechanisms based on the target model architecture.
- Multiple perturbation protocols could become a standard requirement for certifying explanations in regulated inspection systems.
Load-bearing premise
That Deletion AUC under a three-seed zero-fill perturbation protocol reliably measures explanation faithfulness across model families and datasets.
What would settle it
An observation that an explanation method structurally distant from the native readout still yields lower Deletion AUC than a close match, or that the performance ordering fails to reverse under a blur-fill baseline.
Original abstract
Industrial visual inspection systems increasingly rely on deep classifiers whose heatmap explanations may appear visually plausible while failing to identify the image regions that actually drive model decisions. This paper operationalizes an architecture-aware explanation audit protocol grounded in the native-readout hypothesis: the perturbation-based faithfulness of an explanation method is bounded by its structural distance from the model's native decision mechanism. On WM-811K wafer maps (9 classes, 172k images) under a three-seed zero-fill perturbation protocol, ViT-Tiny + Attention Rollout attains Deletion AUC 0.211 against 0.432-0.525 for Swin-Tiny / ResNet18+CBAM / DenseNet121 + Grad-CAM (abs(Cohen's d) > 1.1), despite lower classification accuracy. Swin-Tiny disentangles architecture family from readout structure: despite being a Transformer, its spatial feature-map hierarchy makes it Grad-CAM compatible, showing that the operative factor is readout structure rather than architecture family. A model-agnostic control (RISE) compresses all families to Deletion AUC about 0.1, indicating the gap arises from the explainer pathway; notably, RISE outperforms all native methods, so native readout is a compatibility principle rather than an optimality guarantee. A blur-fill sensitivity analysis shows that the family ordering reverses under a different perturbation baseline, reinforcing that faithfulness rankings are joint properties of (model, explainer, perturbation operator) triples. An exploratory boundary-condition study on MVTec AD (pretrained models) indicates that audit results are dataset/task dependent and identifies conditions requiring qualification. The protocol yields actionable guidance: explanation pathways should be co-designed with model architectures based on readout structure, and deployed heatmaps should be accompanied by quantitative faithfulness metrics.
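For readers unfamiliar with the ViT-side explainer named in the abstract, the following is a minimal sketch of Attention Rollout in its commonly described form: average the heads, mix in the identity to account for residual connections, renormalize the rows, and multiply the layers together. Whether the paper uses exactly this variant is not stated on this page. The function names, the 0.5 residual weighting, and the assumption that token 0 is the class token are illustrative choices.

```python
import numpy as np

def attention_rollout(attentions):
    """Combine per-layer attention into one token-to-token relevance matrix.

    attentions: list of arrays shaped (heads, tokens, tokens), first layer first,
                with each row already softmax-normalized.
    """
    tokens = attentions[0].shape[-1]
    identity = np.eye(tokens)
    rollout = np.eye(tokens)
    for layer_attn in attentions:
        a = layer_attn.mean(axis=0)               # average over heads
        a = 0.5 * a + 0.5 * identity              # account for the residual path
        a = a / a.sum(axis=-1, keepdims=True)     # keep rows normalized
        rollout = a @ rollout                     # compose with earlier layers
    return rollout

def cls_heatmap(rollout, grid):
    """Relevance of each patch to the class token, assuming token 0 is [CLS]."""
    return rollout[0, 1:].reshape(grid, grid)

# Hypothetical attention tensors: 3 layers, 2 heads, 1 class token + 4 patches.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 2, 5, 5))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

rollout = attention_rollout(list(attn))
heatmap = cls_heatmap(rollout, grid=2)
```

The heatmap is read directly from the attention pathway that feeds the class token, which is the kind of structural match between explainer and model readout that the native-readout hypothesis refers to.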
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces an architecture-aware explanation auditing protocol for industrial visual inspection, centered on the native-readout hypothesis that the perturbation-based faithfulness of an explanation method is bounded by its structural distance from the model's native decision mechanism. Using WM-811K wafer maps under a three-seed zero-fill perturbation protocol, it reports that ViT-Tiny with Attention Rollout achieves a Deletion AUC of 0.211 (vs. 0.432-0.525 for the other model-explanation pairs, with |Cohen's d| > 1.1), that a RISE control compresses all families to about 0.1, and that the family ordering reverses under blur-fill perturbation. Swin-Tiny is used to separate architecture family from readout structure, and an MVTec AD study indicates dataset dependence. The work concludes with guidance to co-design explanation pathways with model readout structures and to report quantitative faithfulness metrics.
Significance. If the empirical findings hold under broader validation, the work offers practical value for safety-critical industrial applications by shifting focus from generic explainers to architecture-compatible ones. Strengths include explicit effect-size reporting, model-agnostic controls (RISE), sensitivity analysis across perturbation operators, and acknowledgment that rankings are joint properties of (model, explainer, perturbation) triples rather than universal. The protocol is falsifiable and yields actionable deployment recommendations.
major comments (2)
- [§3] §3 (perturbation protocol): the central quantitative claims rest on Deletion AUC under a fixed three-seed zero-fill protocol, yet the manuscript provides no per-seed variance, confidence intervals, or ablation on seed count; this leaves open whether the reported gaps (0.211 vs. 0.432-0.525) are robust or sensitive to the specific randomization.
- [§4.2] §4.2 (Swin-Tiny disentanglement): the claim that readout structure rather than architecture family is operative is supported by Swin-Tiny results, but the section does not quantify the structural distance metric used to define 'compatibility,' making it difficult to assess how generalizable the separation is beyond the tested models.
minor comments (2)
- [Abstract, §5] Abstract and §5: the MVTec AD boundary study is described as 'exploratory' and 'dataset-dependent,' but the manuscript does not specify the exact pretrained models or fine-tuning protocol used, which would aid reproducibility.
- [Introduction] Notation: 'native-readout hypothesis' is introduced without a formal definition or equation; a short mathematical statement of the bounded-faithfulness claim would improve precision.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for minor revision. The comments identify opportunities to strengthen the robustness and clarity of our quantitative claims. We address each major comment below, indicating where revisions will be incorporated.
- Referee: [§3] §3 (perturbation protocol): the central quantitative claims rest on Deletion AUC under a fixed three-seed zero-fill protocol, yet the manuscript provides no per-seed variance, confidence intervals, or ablation on seed count; this leaves open whether the reported gaps (0.211 vs. 0.432-0.525) are robust or sensitive to the specific randomization.
  Authors: We agree that reporting per-seed variance, confidence intervals, and a seed-count ablation would improve transparency and allow readers to assess robustness. In the revised manuscript we will add a supplementary table listing Deletion AUC for each of the three seeds for the primary model-explanation pairs, together with 95% confidence intervals computed across seeds. We have also run an ablation with 5 and 10 seeds; the ordering and effect sizes remain stable (absolute Cohen’s d > 1.0 in all cases). These results and the corresponding statistical details will be inserted into §3 and a new appendix.
  Revision: yes
- Referee: [§4.2] §4.2 (Swin-Tiny disentanglement): the claim that readout structure rather than architecture family is operative is supported by Swin-Tiny results, but the section does not quantify the structural distance metric used to define 'compatibility,' making it difficult to assess how generalizable the separation is beyond the tested models.
  Authors: The native-readout hypothesis treats structural distance as the degree of alignment between an explainer’s readout format and the model’s native decision pathway (attention maps versus spatial feature maps). While the original text relies on this conceptual distinction, we acknowledge that an explicit numerical measure would aid evaluation of generalizability. In the revision we will introduce a simple structural-compatibility score based on hierarchy depth and readout dimensionality, report the score for each tested model-explanation pair, and briefly discuss its applicability to additional architectures in the updated §4.2.
  Revision: yes
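The per-seed statistics discussed in the first response (an effect size between two model-explanation pairs and a 95% confidence interval across seeds) can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the effect size uses the pooled-standard-deviation form of Cohen's d, the interval uses a Student-t critical value, and the input numbers are synthetic placeholders rather than results from the paper.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom
# (2, 4, and 9 correspond to 3, 5, and 10 seeds).
T_CRIT_95 = {2: 4.303, 4: 2.776, 9: 2.262}

def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled

def mean_ci95(values):
    """Mean and 95% confidence interval across seeds, using the t distribution."""
    n = len(values)
    mean = statistics.mean(values)
    half_width = T_CRIT_95[n - 1] * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

# Synthetic per-seed scores for two hypothetical pairs (NOT values from the paper).
pair_a = [1.0, 2.0, 3.0]
pair_b = [4.0, 5.0, 6.0]

d = cohens_d(pair_a, pair_b)
mean_a, low_a, high_a = mean_ci95(pair_a)
```

With only three seeds the t critical value is large (4.303), so intervals are wide even when the per-seed spread is small; this is one reason a seed-count ablation is informative.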
Circularity Check
No significant circularity detected
The paper advances an empirical audit protocol for explanation faithfulness in industrial visual inspection models, operationalizing the native-readout hypothesis via Deletion AUC measurements under zero-fill and blur-fill perturbations on WM-811K and MVTec AD. Reported results consist of direct experimental comparisons (e.g., ViT-Tiny with Attention Rollout at 0.211 vs. 0.432-0.525 for the other families, |Cohen's d| > 1.1, the RISE control at ~0.1, and the ordering reversal under blur-fill) that are presented as joint properties of (model, explainer, perturbation) triples and explicitly qualified as dataset-dependent. No derivation chain, equations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text; the central claim is illustrated and bounded by these measurements rather than reduced to its inputs by construction. The protocol therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- [domain assumption] native-readout hypothesis: perturbation-based faithfulness is bounded by structural distance from the model's native decision mechanism
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "the perturbation-based faithfulness of an explanation method is bounded by its structural distance from the model's native decision mechanism"
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "ViT-Tiny + Attention Rollout attains Deletion AUC 0.211 against 0.432-0.525 for Swin-Tiny / ResNet18+CBAM / DenseNet121 + Grad-CAM"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.