FaCT: Faithful Concept Traces for Explaining Neural Network Decisions

2025 · cs.LG · arXiv 2510.25512

1 Pith paper cites this work. Polarity classification is still being indexed.

abstract

Deep networks have shown remarkable performance across a wide range of tasks, yet getting a global concept-level understanding of how they function remains a key challenge. Many post-hoc concept-based approaches have been introduced to understand their workings, yet they are not always faithful to the model. Further, they make restrictive assumptions on the concepts a model learns, such as class-specificity, small spatial extent, or alignment to human expectations. In this work, we put emphasis on the faithfulness of such concept-based explanations and propose a new model with model-inherent mechanistic concept-explanations. Our concepts are shared across classes and, from any layer, their contribution to the logit and their input-visualization can be faithfully traced. We also leverage foundation models to propose a new concept-consistency metric, C$^2$-Score, that can be used to evaluate concept-based methods. We show that, compared to prior work, our concepts are quantitatively more consistent and users find our concepts to be more interpretable, all while retaining competitive ImageNet performance.

representative citing papers

OCCAM: Open-set Causal Concept explAnation and Ontology induction for black-box vision Models

cs.AI · 2026-05-18 · unverdicted · novelty 6.0

OCCAM discovers open-set visual concepts, estimates causal contributions via object-level interventions on black-box vision models, and induces a global concept ontology from aggregated dataset evidence.

citing papers explorer

Showing 1 of 1 citing paper.

OCCAM: Open-set Causal Concept explAnation and Ontology induction for black-box vision Models

cs.AI · 2026-05-18 · unverdicted · none · ref 30 · internal anchor