Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
11 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- From Local to Global to Mechanistic: An iERF-Centered Unified Framework for Interpreting Vision Models
  An iERF-centric framework unifies local, global, and mechanistic interpretability in vision models via SRD for saliency, CAFE for concept anchoring, and ICAT for interlayer attribution.
- Physically-Induced Atmospheric Adversarial Perturbations: Enhancing Transferability and Robustness in Remote Sensing Image Classification
  FogFool creates fog-based adversarial perturbations using Perlin noise optimization to achieve high black-box transferability (83.74% TASR) and robustness to defenses in remote sensing classification.
- DiCLIP: Diffusion Model Enhances CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation
  DiCLIP uses diffusion-based visual correlation enhancement and text semantic augmentation to improve CLIP-generated class activation maps for weakly supervised semantic segmentation, outperforming prior methods on PASCAL VOC and MS COCO.
- AttriBE: Quantifying Attribute Expressivity in Body Embeddings for Recognition and Identification
  Transformer-based ReID embeddings encode BMI most strongly in deeper layers, followed by pitch, gender, and yaw, with pose peaking in middle layers and BMI increasing with depth; cross-spectral settings shift reliance toward structural cues.
- Hierarchical, Interpretable, Label-Free Concept Bottleneck Model
  HIL-CBM is a hierarchical label-free concept bottleneck model that improves classification accuracy and explanation quality over prior single-level CBMs using a visual consistency loss and dual heads.
- PhiNet: Speaker Verification with Phonetic Interpretability
  PhiNet adds phonetic interpretability to speaker verification while matching the accuracy of standard black-box models on VoxCeleb, SITW, and LibriSpeech.
- Only Train Once: Uncertainty-Aware One-Class Learning for Face Authenticity Detection
  FADNet reformulates face forgery detection as one-class learning on real faces only, using EDL uncertainty and a PFIG to achieve 96.63% average accuracy and 98.83% precision on DF40 and ASFD benchmarks.
- I see artifacts: ICA-based EEG artifact removal does not improve deep network decoding across three BCI tasks
  ICA-based artifact removal does not consistently improve deep network decoding performance on EEG data across three BCI tasks and multiple models.
- EV-CLIP: Efficient Visual Prompt Adaptation for CLIP in Few-shot Action Recognition under Visual Challenges
  EV-CLIP introduces mask and context visual prompts to adapt CLIP for improved few-shot video action recognition under visual challenges such as low light and egocentric views, outperforming other efficient methods with backbone-scale-independent efficiency.
- Explainable Human Activity Recognition: A Unified Review of Concepts and Mechanisms
  The paper delivers a mechanism-centric taxonomy and unified perspective on explainable human activity recognition methods across sensing modalities.
- Analysis of Invasive Breast Cancer in Mammograms Using YOLO, Explainability, and Domain Adaptation
  A ResNet50 OOD filter plus YOLOv8/11/12 pipeline reaches 99.77% OOD rejection accuracy and 0.947 mAP on mammograms while blocking irrelevant imaging inputs.