Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

Karen Simonyan, Andrea Vedaldi, Andrew Zisserman · 2013 · cs.CV · arXiv 1312.6034

Canonical reference. 82% of citing Pith papers cite this work as background.

86 Pith papers citing it

abstract

This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets). We consider two visualisation techniques, based on computing the gradient of the class score with respect to the input image. The first one generates an image, which maximises the class score [Erhan et al., 2009], thus visualising the notion of the class, captured by a ConvNet. The second technique computes a class saliency map, specific to a given image and class. We show that such maps can be employed for weakly supervised object segmentation using classification ConvNets. Finally, we establish the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks [Zeiler et al., 2013].
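
The second technique in the abstract, a class saliency map computed from the gradient of the class score with respect to the input image, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny random two-layer ReLU network stands in for a trained ConvNet, the names `make_toy_net`, `class_score`, `score_gradient`, and `saliency_map` are hypothetical, and reducing the gradient to one value per pixel by taking the maximum absolute value across colour channels is one common choice rather than something the abstract specifies.

```python
import random

def make_toy_net(n_in, n_hidden, n_classes, seed=0):
    """Random two-layer ReLU network (hypothetical stand-in for a trained ConvNet)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)] for _ in range(n_classes)]
    return w1, w2

def class_score(net, x, c):
    """Unnormalised score S_c(x) for class c on a flattened input x."""
    w1, w2 = net
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return sum(w * h for w, h in zip(w2[c], hidden))

def score_gradient(net, x, c):
    """Gradient dS_c/dx via one manual backward pass through the ReLU layer."""
    w1, w2 = net
    grad = [0.0] * len(x)
    for j, row in enumerate(w1):
        pre = sum(w * xi for w, xi in zip(row, x))
        if pre > 0.0:  # ReLU passes gradient only where the unit is active
            for i, w in enumerate(row):
                grad[i] += w2[c][j] * w
    return grad

def saliency_map(net, image, c):
    """image: H x W x C nested lists. Returns an H x W map of max over channels of |dS_c/dI|."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    # Flatten in (row, column, channel) order so pixel (r, col) starts at (r*width + col)*channels.
    flat = [v for row in image for pixel in row for v in pixel]
    grad = score_gradient(net, flat, c)
    sal = []
    for r in range(height):
        sal_row = []
        for col in range(width):
            start = (r * width + col) * channels
            sal_row.append(max(abs(g) for g in grad[start:start + channels]))
        sal.append(sal_row)
    return sal

# Toy usage: a 4x4 RGB "image" with random pixel values (illustrative data only).
H, W, C = 4, 4, 3
net = make_toy_net(H * W * C, n_hidden=8, n_classes=5, seed=0)
rng = random.Random(1)
image = [[[rng.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]
sal = saliency_map(net, image, c=2)
for row in sal:
    print(" ".join(f"{v:.3f}" for v in row))
```

In practice the gradient would come from an automatic-differentiation framework applied to a trained classifier; the manual backward pass above only keeps the sketch dependency-free.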

citation-role summary

background: 10 · method: 1

citation-polarity summary

background: 9 · unclear: 1 · use method: 1

representative citing papers

Measuring Cross-Modal Synergy: A Benchmark for VLM Explainability

cs.AI · 2026-05-21 · unverdicted · novelty 7.0

Introduces Synergistic Faithfulness metric based on Shapley Interaction Index to evaluate cross-modal synergy in VLM explainers, revealing over-reliance on visual salience in existing methods.

Spectral Integrated Gradients for Coarse-to-Fine Feature Attribution

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

Spectral Integrated Gradients constructs SVD-based integration paths that activate singular components from largest to smallest, producing cleaner attribution maps and better quantitative scores than standard Integrated Gradients on image classification tasks.

Toy Combinatorial Interpretability Models Reveal Lottery Tickets in Early Feature Space

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

In a combinatorial toy setting, winning lottery tickets preserve families of compatible feature locations in early feature space that balance proximity to final codes with low interference, rather than specific weight subnetworks.

AIM: Adversarial Information Masking for Faithfulness Evaluation of Saliency Maps

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

AIM is a new saliency-guided adversarial feature replacement method to evaluate faithfulness of saliency maps and reliability of masking operators on image, audio, and EEG tasks.

$\alpha$-TCAV: A Unified Framework for Testing with Concept Activation Vectors

stat.ML · 2026-05-15 · unverdicted · novelty 7.0

α-TCAV replaces TCAV's hard indicator with a tunable smooth function to create a unified probabilistic framework with lower variance and guidance for parameter choice or Bayes-optimal scoring.

How to Evaluate and Refine your CAM

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

Introduces synthetic ground-truth dataset for CAM evaluation, proposes ARCC composite metric, and RefineCAM method that aggregates layers for higher-resolution maps outperforming baselines.

From Mechanistic to Compositional Interpretability

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

Compositional interpretability defines explanations as commuting syntactic-semantic mapping pairs grounded in compositionality and minimum description length, with compressive refinement and a parsimony theorem guaranteeing concise human-aligned decompositions.

SeBA: Semi-supervised few-shot learning via Separated-at-Birth Alignment for tabular data

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

SeBA is a joint-embedding framework that separates tabular data into two complementary views and aligns one view's representations to the nearest-neighbor structure of the other, improving feature-label relationships and achieving SOTA results in most benchmarks without relying on augmentations.

GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation

cs.LG · 2026-05-06 · unverdicted · novelty 7.0 · 2 refs

GRALIS unifies linear XAI attribution methods via a Riesz Representation Theorem-derived canonical form (Q, w, Delta), delivering seven theorems on completeness, convergence, interactions, and multi-scale extensions.

Manifold-Aligned Guided Integrated Gradients for Reliable Feature Attribution

cs.LG · 2026-05-04 · unverdicted · novelty 7.0 · 2 refs

MA-GIG uses VAE latent space to align Integrated Gradients paths with the data manifold for more faithful feature attributions in deep neural networks.

Mapping data sensitivities in global QCD analysis with linear response and influence functions

hep-ph · 2026-04-30 · unverdicted · novelty 7.0

A framework based on linear response and influence functions maps data sensitivities in global QCD analyses to show how experiments determine central values, uncertainties, and correlations of non-perturbative functions.

Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning

astro-ph.GA · 2026-04-28 · unverdicted · novelty 7.0

A single end-to-end Transformer model unifies stellar labels from heterogeneous spectroscopic surveys into a self-consistent scale without post-hoc recalibration.

Benchmarking bandgap prediction in semiconductors under experimental and realistic evaluation settings

cond-mat.mtrl-sci · 2026-04-28 · unverdicted · novelty 7.0

Introduces the RealMat-BaG benchmark showing fundamental generalization limits of ML models when predicting experimental bandgaps from DFT-trained data.

TRANSPORTER: Transferring Visual Semantics from VLM Manifolds

cs.CV · 2025-11-23 · unverdicted · novelty 7.0

TRANSPORTER generates videos from VLM logits using optimal transport to interpret model predictions on object attributes, actions, and scenes.

Human-Centered Supervision for Sentiment Analysis in Telugu: A Systematic Inquiry Beyond Accuracy

cs.CL · 2025-08-02 · unverdicted · novelty 7.0

Human rationales in supervision for Telugu sentiment analysis improve model alignment with human reasoning and often produce gains in predictive performance.

Scaling and evaluating sparse autoencoders

cs.LG · 2024-06-06 · unverdicted · novelty 7.0

K-sparse autoencoders with dead-latent fixes produce clean scaling laws and better feature quality metrics that improve with size, shown by training a 16-million-latent model on GPT-4 activations.

Improving Dictionary Learning with Gated Sparse Autoencoders

cs.LG · 2024-04-24 · unverdicted · novelty 7.0

Gated SAEs decouple which features to use from how large their activations should be, applying the L1 penalty only to selection and thereby eliminating shrinkage while halving the number of firing features needed for good fidelity.

ISAAC: Auditing Causal Reasoning in Deep Models for Drug-Target Interaction

cs.LG · 2026-05-03 · unverdicted · novelty 7.0

ISAAC auditing applied to three DTI models on the Davis benchmark finds 25% relative differences in causal reasoning scores despite nearly identical AUROC values.

From Local to Global to Mechanistic: An iERF-Centered Unified Framework for Interpreting Vision Models

cs.CV · 2026-05-01 · unverdicted · novelty 7.0

An iERF-centric framework unifies local, global, and mechanistic interpretability in vision models via SRD for saliency, CAFE for concept anchoring, and ICAT for interlayer attribution.

Mamba-SSM with LLM Reasoning for Feature Selection: Faithfulness-Aware Biomarker Discovery

q-bio.QM · 2026-04-15 · unverdicted · novelty 7.0

LLM chain-of-thought filtering of Mamba saliency features on TCGA-BRCA data produces a 17-gene set with AUC 0.927 that beats both the raw 50-gene saliency list and a 5000-gene baseline while using far fewer features, though it misses many known BRCA genes.

Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

Cross-Layer Transcoders decompose ViT activations into sparse, depth-aware layer contributions that maintain zero-shot accuracy and enable faithful attribution of the final representation.

Transcoders Trace Visual Grounding and Hallucinations in Vision-Language Models

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

Transcoders decompose MLP layers in Gemma 3-4B-IT to trace visual grounding more effectively than SAEs and predict hallucinations from circuit graph features at AUC 0.68.

Rethinking Visual Attribution for Chest X-ray Reasoning in Large Vision Language Models

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

Existing visual attribution methods often fail to identify the visual evidence used by LVLMs in chest X-ray reasoning, while MedFocus using unbalanced optimal transport and targeted interventions substantially outperforms them across multiple models and settings.

OCCAM: Open-set Causal Concept explAnation and Ontology induction for black-box vision Models

cs.AI · 2026-05-18 · unverdicted · novelty 6.0

OCCAM discovers open-set visual concepts, estimates causal contributions via object-level interventions on black-box vision models, and induces a global concept ontology from aggregated dataset evidence.

citing papers explorer

Showing 50 of 86 citing papers.

Measuring Cross-Modal Synergy: A Benchmark for VLM Explainability cs.AI · 2026-05-21 · unverdicted · none · ref 20 · internal anchor
Introduces Synergistic Faithfulness metric based on Shapley Interaction Index to evaluate cross-modal synergy in VLM explainers, revealing over-reliance on visual salience in existing methods.
Spectral Integrated Gradients for Coarse-to-Fine Feature Attribution cs.CV · 2026-05-19 · unverdicted · none · ref 42 · internal anchor
Spectral Integrated Gradients constructs SVD-based integration paths that activate singular components from largest to smallest, producing cleaner attribution maps and better quantitative scores than standard Integrated Gradients on image classification tasks.
Toy Combinatorial Interpretability Models Reveal Lottery Tickets in Early Feature Space cs.LG · 2026-05-18 · unverdicted · none · ref 24 · internal anchor
In a combinatorial toy setting, winning lottery tickets preserve families of compatible feature locations in early feature space that balance proximity to final codes with low interference, rather than specific weight subnetworks.
AIM: Adversarial Information Masking for Faithfulness Evaluation of Saliency Maps cs.LG · 2026-05-16 · unverdicted · none · ref 54 · internal anchor
AIM is a new saliency-guided adversarial feature replacement method to evaluate faithfulness of saliency maps and reliability of masking operators on image, audio, and EEG tasks.
$\alpha$-TCAV: A Unified Framework for Testing with Concept Activation Vectors stat.ML · 2026-05-15 · unverdicted · none · ref 140 · internal anchor
α-TCAV replaces TCAV's hard indicator with a tunable smooth function to create a unified probabilistic framework with lower variance and guidance for parameter choice or Bayes-optimal scoring.
How to Evaluate and Refine your CAM cs.CV · 2026-05-14 · unverdicted · none · ref 25 · internal anchor
Introduces synthetic ground-truth dataset for CAM evaluation, proposes ARCC composite metric, and RefineCAM method that aggregates layers for higher-resolution maps outperforming baselines.
From Mechanistic to Compositional Interpretability cs.LG · 2026-05-09 · unverdicted · none · ref 212 · internal anchor
Compositional interpretability defines explanations as commuting syntactic-semantic mapping pairs grounded in compositionality and minimum description length, with compressive refinement and a parsimony theorem guaranteeing concise human-aligned decompositions.
SeBA: Semi-supervised few-shot learning via Separated-at-Birth Alignment for tabular data cs.LG · 2026-05-08 · unverdicted · none · ref 192 · internal anchor
SeBA is a joint-embedding framework that separates tabular data into two complementary views and aligns one view's representations to the nearest-neighbor structure of the other, improving feature-label relationships and achieving SOTA results in most benchmarks without relying on augmentations.
GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation cs.LG · 2026-05-06 · unverdicted · none · ref 7 · 2 links · internal anchor
GRALIS unifies linear XAI attribution methods via a Riesz Representation Theorem-derived canonical form (Q, w, Delta), delivering seven theorems on completeness, convergence, interactions, and multi-scale extensions.
Manifold-Aligned Guided Integrated Gradients for Reliable Feature Attribution cs.LG · 2026-05-04 · unverdicted · none · ref 8 · 2 links · internal anchor
MA-GIG uses VAE latent space to align Integrated Gradients paths with the data manifold for more faithful feature attributions in deep neural networks.
Mapping data sensitivities in global QCD analysis with linear response and influence functions hep-ph · 2026-04-30 · unverdicted · none · ref 48 · internal anchor
A framework based on linear response and influence functions maps data sensitivities in global QCD analyses to show how experiments determine central values, uncertainties, and correlations of non-perturbative functions.
Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning astro-ph.GA · 2026-04-28 · unverdicted · none · ref 66 · internal anchor
A single end-to-end Transformer model unifies stellar labels from heterogeneous spectroscopic surveys into a self-consistent scale without post-hoc recalibration.
Benchmarking bandgap prediction in semiconductors under experimental and realistic evaluation settings cond-mat.mtrl-sci · 2026-04-28 · unverdicted · none · ref 45 · internal anchor
Introduces the RealMat-BaG benchmark showing fundamental generalization limits of ML models when predicting experimental bandgaps from DFT-trained data.
TRANSPORTER: Transferring Visual Semantics from VLM Manifolds cs.CV · 2025-11-23 · unverdicted · none · ref 73 · internal anchor
TRANSPORTER generates videos from VLM logits using optimal transport to interpret model predictions on object attributes, actions, and scenes.
Human-Centered Supervision for Sentiment Analysis in Telugu: A Systematic Inquiry Beyond Accuracy cs.CL · 2025-08-02 · unverdicted · none · ref 48 · internal anchor
Human rationales in supervision for Telugu sentiment analysis improve model alignment with human reasoning and often produce gains in predictive performance.
Scaling and evaluating sparse autoencoders cs.LG · 2024-06-06 · unverdicted · none · ref 58 · internal anchor
K-sparse autoencoders with dead-latent fixes produce clean scaling laws and better feature quality metrics that improve with size, shown by training a 16-million-latent model on GPT-4 activations.
Improving Dictionary Learning with Gated Sparse Autoencoders cs.LG · 2024-04-24 · unverdicted · none · ref 126 · internal anchor
Gated SAEs decouple which features to use from how large their activations should be, applying the L1 penalty only to selection and thereby eliminating shrinkage while halving the number of firing features needed for good fidelity.
ISAAC: Auditing Causal Reasoning in Deep Models for Drug-Target Interaction cs.LG · 2026-05-03 · unverdicted · none · ref 12
ISAAC auditing applied to three DTI models on the Davis benchmark finds 25% relative differences in causal reasoning scores despite nearly identical AUROC values.
From Local to Global to Mechanistic: An iERF-Centered Unified Framework for Interpreting Vision Models cs.CV · 2026-05-01 · unverdicted · none · ref 3
An iERF-centric framework unifies local, global, and mechanistic interpretability in vision models via SRD for saliency, CAFE for concept anchoring, and ICAT for interlayer attribution.
Mamba-SSM with LLM Reasoning for Feature Selection: Faithfulness-Aware Biomarker Discovery q-bio.QM · 2026-04-15 · unverdicted · none · ref 7
LLM chain-of-thought filtering of Mamba saliency features on TCGA-BRCA data produces a 17-gene set with AUC 0.927 that beats both the raw 50-gene saliency list and a 5000-gene baseline while using far fewer features, though it misses many known BRCA genes.
Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision cs.CV · 2026-04-14 · unverdicted · none · ref 31
Cross-Layer Transcoders decompose ViT activations into sparse, depth-aware layer contributions that maintain zero-shot accuracy and enable faithful attribution of the final representation.
Transcoders Trace Visual Grounding and Hallucinations in Vision-Language Models cs.LG · 2026-05-21 · unverdicted · none · ref 16 · internal anchor
Transcoders decompose MLP layers in Gemma 3-4B-IT to trace visual grounding more effectively than SAEs and predict hallucinations from circuit graph features at AUC 0.68.
Rethinking Visual Attribution for Chest X-ray Reasoning in Large Vision Language Models cs.CV · 2026-05-19 · unverdicted · none · ref 64 · internal anchor
Existing visual attribution methods often fail to identify the visual evidence used by LVLMs in chest X-ray reasoning, while MedFocus using unbalanced optimal transport and targeted interventions substantially outperforms them across multiple models and settings.
OCCAM: Open-set Causal Concept explAnation and Ontology induction for black-box vision Models cs.AI · 2026-05-18 · unverdicted · none · ref 42 · internal anchor
OCCAM discovers open-set visual concepts, estimates causal contributions via object-level interventions on black-box vision models, and induces a global concept ontology from aggregated dataset evidence.
GCE-MIL: Faithful and Recoverable Evidence for Multiple Instance Learning in Whole-Slide Imaging cs.CV · 2026-05-17 · unverdicted · none · ref 64 · internal anchor
GCE-MIL is a backbone-agnostic wrapper that directly optimizes MIL evidence for sufficiency, necessity, and recoverability, yielding modest gains in Macro-F1 and C-index plus more faithful patch selection across many backbones and datasets.
From Weight Perturbation to Feature Attribution for Explaining Fully Connected Neural Networks cs.LG · 2026-05-14 · unverdicted · none · ref 24 · internal anchor
XWP and XWP_c are novel attribution methods for FCNNs that estimate feature importance by perturbing attached weights to avoid added bias and out-of-distribution issues in occlusion approaches.
Feature Visualization Recovers Known Cortical Selectivity from TRIBE v2 q-bio.NC · 2026-05-13 · unverdicted · none · ref 19 · internal anchor
Feature visualization on TRIBE v2 brain encoders recovers the known ventral visual hierarchy from V1 to V4 and produces distinctive patterns for MT, FFA, and PPA, with optimized stimuli driving ~4x higher activation than natural images.
APEX: Audio Prototype EXplanations for Classification Tasks cs.SD · 2026-05-11 · unverdicted · none · ref 16 · internal anchor
APEX generates four types of prototype-based explanations for pre-trained audio classifiers that preserve output invariance and target acoustic properties better than gradient methods applied to spectrograms.
Scaling Vision Models Does Not Consistently Improve Localisation-Based Explanation Quality cs.CV · 2026-05-11 · accept · none · ref 4 · internal anchor
Scaling vision models by depth and parameter count does not consistently improve localisation-based explanation quality across architectures, datasets, and post-hoc methods; smaller models often perform comparably or better.
Evaluating the Alignment Between GeoAI Explanations and Domain Knowledge in Satellite-Based Flood Mapping cs.CV · 2026-04-28 · unverdicted · none · ref 21 · internal anchor
ADAGE uses Channel-Group SHAP to quantify alignment between GeoAI model explanations and domain knowledge references in satellite-based flood mapping.
Hierarchical, Interpretable, Label-Free Concept Bottleneck Model cs.CV · 2026-04-02 · unverdicted · none · ref 26 · internal anchor
HIL-CBM is a hierarchical label-free concept bottleneck model that improves classification accuracy and explanation quality over prior single-level CBMs using a visual consistency loss and dual heads.
UNBOX: Unveiling Black-box visual models with Natural-language cs.CV · 2026-03-09 · unverdicted · none · ref 17 · internal anchor
UNBOX recovers interpretable text concepts that maximally activate classes in black-box vision models by recasting activation maximization as semantic search with LLMs and diffusion models.
MaskDiME: Adaptive Masked Diffusion for Precise and Efficient Visual Counterfactual Explanations cs.CV · 2026-02-21 · unverdicted · none · ref 37 · internal anchor
MaskDiME uses adaptive masked diffusion to produce 30x faster, localized, and semantically consistent visual counterfactual explanations without training, matching or exceeding prior performance on five datasets.
Faster Verified Explanations for Neural Networks cs.LG · 2025-11-28 · unverdicted · none · ref 27 · internal anchor
FaVeX accelerates verified explanations for neural networks via dynamic batch-sequential processing and query reuse while introducing verifier-optimal robust explanations that incorporate verifier incompleteness.
How to Use Deep Learning to Identify Sufficient Conditions: A Case Study on Stanley's $e$-Positivity math.CO · 2025-11-25 · unverdicted · none · ref 17 · internal anchor
Deep learning identifies co-triangle-free graphs as e-positive and proves e-positivity for claw-free claw-contractible-free graphs on 10 and 11 vertices, resolving an open conjecture.
AttnTrace: Contextual Attribution of Prompt Injection and Knowledge Corruption cs.CL · 2025-08-05 · unverdicted · none · ref 53 · internal anchor
AttnTrace is an attention-weight-based context traceback method for LLMs that claims higher accuracy and efficiency than prior art like TracLLM while aiding prompt injection detection.
Boosting Team Modeling through Tempo-Relational Representation Learning cs.LG · 2025-07-17 · unverdicted · none · ref 128 · internal anchor
A tempo-relational neural architecture jointly models temporal and relational aspects of team interactions to outperform prior approaches on team performance prediction and enable efficient multi-task prediction of team constructs.
Why Do Class-Dependent Evaluation Effects Occur with Time Series Feature Attributions? A Synthetic Data Investigation cs.LG · 2025-06-13 · unverdicted · none · ref 17 · internal anchor
Synthetic experiments reveal that class-dependent effects appear in both perturbation-based and ground-truth evaluations of time series feature attributions, often producing contradictory rankings of attribution quality due to differences in feature amplitude or temporal extent between classes.
UntrustVul: An Automated Approach for Identifying Untrustworthy Alerts in Vulnerability Detection Models cs.SE · 2025-03-19 · unverdicted · none · ref 53 · internal anchor
UntrustVul identifies untrustworthy vulnerability predictions by marking lines that neither match historical vulnerability patterns nor influence vulnerable lines through dependencies, reporting AUC 70-88% and F1 82-94% on 115K predictions.
ExPath: Targeted Pathway Inference for Biological Knowledge Bases via Graph Learning and Explanation cs.LG · 2025-02-25 · unverdicted · none · ref 42 · internal anchor
ExPath is a subgraph inference framework that classifies bio-networks with experimental data and uses explanations to identify targeted pathways, reporting up to 4.5x higher Fidelity+ and 14x lower Fidelity- than baselines on 301 networks.
Explaining Object Detectors via Collective Contribution of Pixels cs.CV · 2024-12-01 · unverdicted · none · ref 43 · internal anchor
A Shapley-value method with interaction terms that explains object detector decisions by capturing collective pixel contributions for localization and classification.
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation cs.LG · 2023-10-19 · conditional · none · ref 87 · internal anchor
SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
Multi-task Self-Supervised Learning for Human Activity Detection cs.LG · 2019-07-27 · unverdicted · none · ref 63 · internal anchor
A multi-task self-supervised approach trains a temporal CNN to detect transformations on sensory data, yielding features that match or exceed fully supervised performance in semi-supervised and transfer settings for smartphone-based HAR.
Interpretability Beyond Classification Output: Semantic Bottleneck Networks cs.CV · 2019-07-25 · unverdicted · none · ref 23 · internal anchor
Semantic Bottleneck Networks add interpretable semantic concept layers to deep networks, recovering SOTA segmentation performance with drastic channel reduction and enabling failure interpretation at over 99% accuracy for most outputs.
Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications cs.LG · 2019-07-19 · unverdicted · none · ref 32 · internal anchor
A scalable framework combining streaming graphs, topology computation, and topology-aware datacubes enables interactive analysis of high-dimensional functions in scientific ML applications.
Saliency-driven Word Alignment Interpretation for Neural Machine Translation cs.CL · 2019-06-25 · unverdicted · none · ref 35 · internal anchor
Saliency-driven interpretation methods reveal that NMT models learn word alignments of better quality than fast-align under force decoding and consistent with automatic tools under free decoding.
H-Sets: Hessian-Guided Discovery of Set-Level Feature Interactions in Image Classifiers cs.CV · 2026-04-23 · unverdicted · none · ref 34
H-Sets detects higher-order feature interactions in image classifiers via Hessian-guided pair merging and attributes them with IDG-Vis to generate more interpretable saliency maps than existing marginal or coarse methods.
On the Importance and Evaluation of Narrativity in Natural Language AI Explanations cs.CL · 2026-04-20 · unverdicted · none · ref 14
XAI explanations should be narratives with continuous structure, cause-effect, fluency and diversity, and new metrics are needed to evaluate this better than standard NLP scores.
Contrastive Attribution in the Wild: An Interpretability Analysis of LLM Failures on Realistic Benchmarks cs.AI · 2026-04-20 · conditional · none · ref 56
Token-level contrastive attribution yields informative signals for some LLM benchmark failures but is not universally applicable across datasets and models.
Potential of Gaia XP Spectra in Red Giant Star Asteroseismology: A Deep-Learning Approach astro-ph.SR · 2026-04-18 · unverdicted · none · ref 69
Hybrid deep learning models recover large frequency separation, frequency of maximum power, and dipole period spacing from low-resolution Gaia XP spectra with accuracy comparable to moderate-resolution spectroscopy.
