pith. sign in

arxiv: 2504.19774 · v2 · submitted 2025-04-28 · 💻 cs.LG

If Concept Bottlenecks are the Question, are Foundation Models the Answer?

Pith reviewed 2026-05-22 17:47 UTC · model grok-4.3

classification 💻 cs.LG
keywords concept bottleneck modelsvision-language modelsweak supervisionconcept qualityinterpretabilityfoundation modelsempirical evaluation
0
0 comments X

The pith

Vision-language models supply concept labels for bottleneck models that often differ from expert annotations and whose accuracy does not track their actual quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Concept bottleneck models first map inputs to human-understandable concepts and then use those concepts to make the final prediction. This design promises both accuracy and built-in interpretability, yet it has long depended on costly expert labels for the concepts. Recent work replaces those labels with weak supervision drawn from foundation models. The paper tests whether this substitution preserves concept quality by running state-of-the-art VLM-CBMs through a battery of quality metrics on multiple tasks. The results indicate that VLM supervision produces noticeably different concepts from expert supervision and that simple accuracy on the concepts is only weakly related to their usefulness for the downstream task or for interpretability.

Core claim

The authors evaluate several VLM-CBM architectures by comparing the concepts they learn under VLM supervision against those learned under expert supervision. They measure concept quality with a selection of established metrics and find that the two sources of supervision diverge substantially on some tasks while agreeing on others. In addition, they report that concept accuracy and the chosen quality metrics are not strongly correlated, implying that accuracy alone is an unreliable indicator of whether a concept will support good performance or clear explanations.

What carries the argument

The empirical comparison of VLM-supervised CBMs against expert-supervised CBMs using a battery of concept quality metrics.

If this is right

  • Concept accuracy should not be used as the sole criterion when selecting or validating concepts in bottleneck models.
  • VLM supervision may be adequate for some tasks but requires task-specific validation before it replaces expert labels.
  • Quality metrics beyond accuracy become necessary to decide whether a VLM-CBM meets interpretability goals.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Designers of future CBMs may need to combine VLM and expert signals rather than treat them as interchangeable.
  • The gap between VLM and expert concepts could be narrowed by more targeted prompting or fine-tuning of the foundation model on the target domain.

Load-bearing premise

The chosen metrics for concept quality give a valid and complete picture of what makes concepts useful for both task performance and interpretability.

What would settle it

A follow-up study in which the same models are evaluated with a different set of quality metrics or with direct human judgments of interpretability, and the new metrics reverse the ranking of VLM versus expert concepts.

Figures

Figures reproduced from arXiv: 2504.19774 by Andrea Passerini, Emanuele Marconato, Francesco Giannini, Nicola Debole, Pietro Barbiero, Stefano Teso.

Figure 1
Figure 1. Figure 1: A prototypical VLM-CBM. Left: given textual descriptions T of visual concepts, a VLM is used to annotate a training set D with per-concept labels or activation scores and (optionally) masks indicating where each concept activates. Right: the supervision is typically used to fine-tune a backbone f that extracts concepts from new inputs and a linear layer g for inferring predictions from the concepts. See Se… view at source ↗
Figure 2
Figure 2. Figure 2: Issues with CBM concepts. An illustration of the four issues with concept quality and usage affecting CBMs on the CelebA task (cf. Section 4): (1) Attaining high label accuracy does not prevent learning inaccurate concepts. (2) CBMs can maximize label performance by learning leaky concepts that include irrelevant contextual cues. (3) Entangled concepts encode unwanted information about one another, affecti… view at source ↗
Figure 3
Figure 3. Figure 3: Gaps of the concept leakage test in CelebA and CUB. The higher the gap, the more the subset of learned concepts leaks information for label prediction. (Left) Gaps on CelebA upon varying the subset of concepts from 1 to 39. All model concepts always display a non-negative gain in label F1 compared to ground-truth concepts. (Right) Gaps on CUB upon varying the subset of concepts from 1 to 112. Only CBM is a… view at source ↗
Figure 4
Figure 4. Figure 4: Gaps in concept leakage test on Shapes3d dataset. Negative values indicate that the classifier using predicted concepts has lower F1 score than the same which has access to ground-truth concept annotations. B.1 Disentanglement We provide in [PITH_FULL_IMAGE:figures/full_fig_p019_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Annotations obtained using G-DINO on the 3 different datasets. On CelebA (left) and Shapes3d (center) the VLM makes several mistakes highlighted by red crosses. On the other hand, the model does a good job annotating CUB (right). CBM Naïve LF-CBM LaBo VLG-CBM [PITH_FULL_IMAGE:figures/full_fig_p020_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Importance matrices in DCI on Shapes3d. High disentanglement is achieved when, for each row, only one square is active. For reference, a perfect DCI matrix is a diagonal matrix. 20 [PITH_FULL_IMAGE:figures/full_fig_p020_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Importance matrices in DCI on CelebA. High disentanglement is achieved when, for each row, only one square is active. For reference, a perfect DCI matrix is a diagonal matrix. 21 [PITH_FULL_IMAGE:figures/full_fig_p021_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Importance matrices in DCI on CUB. High disentanglement is achieved when, for each row, only one square is active. For reference, a perfect DCI matrix is a diagonal matrix. 22 [PITH_FULL_IMAGE:figures/full_fig_p022_8.png] view at source ↗
read the original abstract

Concept Bottleneck Models (CBMs) are neural networks designed to conjoin high performance with ante-hoc interpretability. CBMs work by first mapping inputs (e.g., images) to high-level concepts (e.g., visible objects and their properties) and then use these to solve a downstream task (e.g., tagging or scoring an image) in an interpretable manner. Their performance and interpretability, however, hinge on the quality of the concepts they learn. The go-to strategy for ensuring good quality concepts is to leverage expert annotations, which are expensive to collect and seldom available in applications. Researchers have recently addressed this issue by introducing "VLM-CBM" architectures that replace manual annotations with weak supervision from foundation models. It is however unclear what is the impact of doing so on the quality of the learned concepts. To answer this question, we put state-of-the-art VLM-CBMs to the test, analyzing their learned concepts empirically using a selection of significant metrics. Our results show that, depending on the task, VLM supervision can sensibly differ from expert annotations, and that concept accuracy and quality are not strongly correlated. Our code is available at https://github.com/debryu/CQA.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents an empirical study on the use of Vision-Language Models (VLMs) for supervising Concept Bottleneck Models (CBMs) in place of expert annotations. The authors evaluate state-of-the-art VLM-CBM architectures across different tasks using multiple metrics for concept accuracy and quality. Key findings indicate that VLM supervision can differ from expert annotations in a task-dependent manner and that concept accuracy does not strongly correlate with the assessed concept quality.

Significance. Should the empirical results prove robust, this work underscores the challenges in replacing expert supervision with VLMs for CBMs and questions the adequacy of accuracy as a measure of concept quality. It contributes to the field by providing a comparative analysis and releasing code, which aids in assessing the practical utility of VLM-CBMs for interpretable machine learning.

major comments (1)
  1. [Empirical Analysis] The conclusion that concept accuracy and quality are not strongly correlated hinges on the chosen metrics providing a valid assessment of concept usefulness for CBM performance and interpretability. The manuscript should include analysis showing how these metrics correlate with downstream task accuracy or human-judged interpretability to rule out the possibility that the observed lack of correlation is due to metric selection rather than an inherent property of VLM supervision.
minor comments (1)
  1. [Abstract] The phrase 'a selection of significant metrics' is vague; specifying the metrics or pointing to the relevant section would improve clarity for readers.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We are grateful to the referee for the positive assessment of our work and the detailed feedback. Below we provide a point-by-point response to the major comment.

read point-by-point responses
  1. Referee: [Empirical Analysis] The conclusion that concept accuracy and quality are not strongly correlated hinges on the chosen metrics providing a valid assessment of concept usefulness for CBM performance and interpretability. The manuscript should include analysis showing how these metrics correlate with downstream task accuracy or human-judged interpretability to rule out the possibility that the observed lack of correlation is due to metric selection rather than an inherent property of VLM supervision.

    Authors: We appreciate this suggestion, which helps strengthen the validity of our conclusions. In the original manuscript, we selected metrics based on established practices in the CBM literature to assess concept quality in terms of their alignment with semantic meaning and contribution to interpretability. To address the referee's concern directly, we will add a new analysis in the revised manuscript that examines the correlation between our quality metrics and the downstream task accuracy for each VLM-CBM architecture and task. This will provide evidence that the observed weak correlation between concept accuracy and quality metrics reflects a genuine property rather than a limitation of the metrics. We note that conducting human-judged interpretability studies would require additional resources and participant recruitment, which we did not perform in the current work; however, we will expand the discussion to acknowledge this and suggest it as future work. revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical comparison against external annotations

full rationale

The paper performs direct empirical evaluation of VLM-CBM concept quality against expert annotations and independent metrics on multiple tasks. No derivations, fitted parameters renamed as predictions, self-citation load-bearing premises, or ansatzes appear in the reported results. All claims rest on observable differences between VLM outputs and held-out expert labels, making the analysis self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper rests on standard machine-learning assumptions about the validity of expert concept annotations as a benchmark and the relevance of the chosen quality metrics for assessing CBM interpretability.

axioms (1)
  • domain assumption Expert annotations represent a reliable ground-truth benchmark for evaluating the quality of learned concepts.
    Used as the reference against which VLM supervision is compared throughout the empirical analysis.

pith-pipeline@v0.9.0 · 5765 in / 1132 out tokens · 134539 ms · 2026-05-22T17:47:01.283716+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

92 extracted references · 92 canonical work pages · 4 internal anchors

  1. [1]

    To Believe or Not to Believe Your LLM: Iterative Prompting for Estimating Epistemic Uncertainty

    Yasin Abbasi Yadkori et al. To Believe or Not to Believe Your LLM: Iterative Prompting for Estimating Epistemic Uncertainty. NeurIPS, 2024

  2. [2]

    Towards robust interpretability with self-explaining neural networks

    David Alvarez Melis and Tommi Jaakkola. Towards robust interpretability with self-explaining neural networks. NeurIPS, 2018

  3. [3]

    PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation

    Jason Ansel et al. PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation. In ASPLOS, 2024

  4. [4]

    Debiasing concept-based explanations with causal analysis

    Mohammad Taha Bahadori and David Heckerman. Debiasing concept-based explanations with causal analysis. In ICLR, 2021

  5. [5]

    Entropy-based logic explanations of neural networks

    Pietro Barbiero et al. Entropy-based logic explanations of neural networks. In AAAI, 2022

  6. [6]

    Interpretable neural-symbolic concept reasoning

    Pietro Barbiero et al. Interpretable neural-symbolic concept reasoning. In ICML, 2023

  7. [7]

    Relational concept bottleneck models

    Pietro Barbiero et al. Relational concept bottleneck models. NeurIPS, 2024

  8. [8]

    Neural interpretable reasoning

    Pietro Barbiero et al. Neural interpretable reasoning. arXiv:2502.11639, 2025

  9. [9]

    Concept-level debugging of part-prototype networks

    Andrea Bontempelli et al. Concept-level debugging of part-prototype networks. In ICLR, 2023

  10. [10]

    Shortcuts and identifiability in concept-based models from a neuro-symbolic lens

    Samuele Bortolotti et al. Shortcuts and identifiability in concept-based models from a neuro-symbolic lens. arXiv:2502.11245, 2025

  11. [11]

    Logically consistent language models via neuro-symbolic integration

    Diego Calanzone et al. Logically consistent language models via neuro-symbolic integration. In ICLR, 2025

  12. [12]

    Interactive concept bottleneck models

    Kushal Chauhan et al. Interactive concept bottleneck models. In AAAI, 2023

  13. [13]

    This looks like that: Deep learning for interpretable image recognition

    Chaofan Chen et al. This looks like that: Deep learning for interpretable image recognition. NeurIPS, 2019

  14. [14]

    Concept whitening for interpretable image recognition

    Zhi Chen et al. Concept whitening for interpretable image recognition. Nature Machine Intelligence, 2020

  15. [15]

    Xtuner: A toolkit for efficiently fine-tuning llm

    XTuner Contributors. Xtuner: A toolkit for efficiently fine-tuning llm. https://github.com/InternLM/ xtuner, 2023

  16. [16]

    Support-vector networks

    Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine learning, 1995

  17. [17]

    Causally reliable concept bottleneck models

    Giovanni De Felice et al. Causally reliable concept bottleneck models. arXiv:2503.04363, 2025

  18. [18]

    Interpretable concept-based memory reasoning

    David Debot et al. Interpretable concept-based memory reasoning. arXiv:2407.15527, 2024

  19. [19]

    Anycbms: How to turn any black box into a concept bottleneck model, 2024

    Gabriele Dominici et al. Anycbms: How to turn any black box into a concept bottleneck model, 2024

  20. [20]

    Causal concept graph models: Beyond causal opacity in deep learning.arXiv:2405.16507, 2024

    Gabriele Dominici et al. Causal concept graph models: Beyond causal opacity in deep learning.arXiv:2405.16507, 2024

  21. [21]

    Counterfactual concept bottleneck models

    Gabriele Dominici et al. Counterfactual concept bottleneck models. arXiv:2402.01408, 2024

  22. [22]

    A framework for the quantitative evaluation of disentangled representations

    Cian Eastwood and Christopher KI Williams. A framework for the quantitative evaluation of disentangled representations. In ICLR, 2018

  23. [23]

    Learning to receive help: Intervention-aware concept embedding models

    Mateo Espinosa Zarlenga et al. Learning to receive help: Intervention-aware concept embedding models. NeurIPS, 2023

  24. [24]

    Learning to receive help: Intervention-aware concept embedding models

    Mateo Espinosa Zarlenga et al. Learning to receive help: Intervention-aware concept embedding models. NeurIPS, 2024

  25. [25]

    Bayesian concept bottleneck models with llm priors

    Jean Feng et al. Bayesian concept bottleneck models with llm priors. arXiv:2410.15555, 2024

  26. [26]

    Sample-efficient learning of concepts with theoretical guarantees: from data to concepts without interventions

    Hidde Fokkema et al. Sample-efficient learning of concepts with theoretical guarantees: from data to concepts without interventions. arXiv:2502.06536, 2025

  27. [27]

    Towards a deeper understanding of concept bottleneck models through end-to-end explanation

    Jack Furby et al. Towards a deeper understanding of concept bottleneck models through end-to-end explanation. In Workshop on Representation Learning for Responsible Human-Centric AI @ AAAI, 2023

  28. [28]

    The Llama 3 Herd of Models

    Aaron Grattafiori et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024

  29. [29]

    Addressing leakage in concept bottleneck models

    Marton Havasi et al. Addressing leakage in concept bottleneck models. In NeurIPS, 2022

  30. [30]

    Deep residual learning for image recognition

    Kaiming He et al. Deep residual learning for image recognition. In CVPR, 2016

  31. [31]

    Towards a Definition of Disentangled Representations

    Irina Higgins et al. Towards a definition of disentangled representations. arXiv:1812.02230, 2018

  32. [32]

    A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

    Lei Huang et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM TOIS, 2023

  33. [33]

    GPT-4o System Card

    Aaron Hurst et al. Gpt-4o system card. arXiv:2410.21276, 2024. 11 A PREPRINT - A PRIL 30, 2025

  34. [34]

    Concept bottleneck generative models

    Aya Abdelsalam Ismail et al. Concept bottleneck generative models. In ICLR, 2023

  35. [35]

    A comprehensive survey on self-interpretable neural networks

    Yang Ji et al. A comprehensive survey on self-interpretable neural networks. arXiv:2501.15638, 2025

  36. [36]

    Is disentanglement all you need? comparing concept-based & disentanglement approaches

    Dmitry Kazhdan et al. Is disentanglement all you need? comparing concept-based & disentanglement approaches. arXiv:2104.06917, 2021

  37. [37]

    Interpretability beyond feature attribution: Quantitative testing with concept activation vectors

    Been Kim et al. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors. In ICML, 2018

  38. [38]

    Probabilistic concept bottleneck models

    Eunji Kim et al. Probabilistic concept bottleneck models. In ICML, 2023

  39. [39]

    Disentangling by factorising

    Hyunjik Kim and Andriy Mnih. Disentangling by factorising. In ICML, 2018

  40. [40]

    Adam: A Method for Stochastic Optimization

    Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv:1412.6980, 2014

  41. [41]

    Concept bottleneck models

    Pang Wei Koh et al. Concept bottleneck models. In ICML, 2020

  42. [42]

    Beyond concept bottleneck models: How to make black boxes intervenable? NeurIPS, 2024

    Sonia Laguna et al. Beyond concept bottleneck models: How to make black boxes intervenable? NeurIPS, 2024

  43. [43]

    Faithful vision-language interpretation via concept bottleneck models

    Songning Lai et al. Faithful vision-language interpretation via concept bottleneck models. In ICLR, 2024

  44. [44]

    Find: human-in-the-loop debugging deep text classifiers

    Piyawat Lertvittayakumjorn et al. Find: human-in-the-loop debugging deep text classifiers. In EMNLP, 2020

  45. [45]

    Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions

    Oscar Li et al. Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions. In AAAI, 2018

  46. [46]

    On erroneous agreements of clip image embeddings

    Siting Li et al. On erroneous agreements of clip image embeddings. arXiv:2411.05195, 2024

  47. [47]

    Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

    Shilong Liu et al. Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection. In ECCV, 2024

  48. [48]

    Deep learning face attributes in the wild

    Ziwei Liu et al. Deep learning face attributes in the wild. In ICCV, 2015

  49. [49]

    Towards learning to explain with concept bottleneck models: mitigating information leakage

    Joshua Lockhart et al. Towards learning to explain with concept bottleneck models: mitigating information leakage. arXiv:2211.03656, 2022

  50. [50]

    Promises and pitfalls of black-box concept learning models

    Anita Mahinpei et al. Promises and pitfalls of black-box concept learning models. In Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI @ ICML, 2021

  51. [51]

    GlanceNets: Interpretabile, Leak-proof Concept-based Models

    Emanuele Marconato et al. GlanceNets: Interpretabile, Leak-proof Concept-based Models. In NeurIPS, 2022

  52. [52]

    Interpretability is in the mind of the beholder: A causal framework for human- interpretable representation learning

    Emanuele Marconato et al. Interpretability is in the mind of the beholder: A causal framework for human- interpretable representation learning. Entropy, 2023

  53. [53]

    Not all neuro-symbolic concepts are created equal: Analysis and mitigation of reasoning shortcuts

    Emanuele Marconato et al. Not all neuro-symbolic concepts are created equal: Analysis and mitigation of reasoning shortcuts. NeurIPS, 2024

  54. [54]

    Do concept bottleneck models learn as intended? arXiv:2105.04289, 2021

    Andrei Margeloiu et al. Do concept bottleneck models learn as intended? arXiv:2105.04289, 2021

  55. [55]

    Evaluating the stability of semantic concept representations in CNNs for robust explainability

    Georgii Mikriukov et al. Evaluating the stability of semantic concept representations in CNNs for robust explainability. In World Conference on Explainable Artificial Intelligence, 2023

  56. [56]

    Lost in latent space: Examining failures of disentangled models at combinatorial generalisa- tion

    Milton Montero et al. Lost in latent space: Examining failures of disentangled models at combinatorial generalisa- tion. NeurIPS, 2022

  57. [57]

    Diconstruct: Causal concept-based explanations through black-box distillation

    Ricardo Moreira et al. Diconstruct: Causal concept-based explanations through black-box distillation. arXiv:2401.08534, 2024

  58. [58]

    Label-free concept bottleneck models

    Tuomas Oikarinen et al. Label-free concept bottleneck models. In ICLR, 2023

  59. [59]

    Concept-based explainable artificial intelligence: A survey

    Eleonora Poeta et al. Concept-based explainable artificial intelligence: A survey. arXiv:2312.12936, 2023

  60. [60]

    Learning transferable visual models from natural language supervision

    Alec Radford et al. Learning transferable visual models from natural language supervision. In ICML, 2021

  61. [61]

    From causal to concept-based representation learning

    Goutham Rajendran et al. From causal to concept-based representation learning. NeurIPS, 2024

  62. [62]

    Do concept bottleneck models obey locality? In XAI in Action: Past, Present, and Future Applications, 2023

    Naveen Raman et al. Do concept bottleneck models obey locality? In XAI in Action: Past, Present, and Future Applications, 2023

  63. [63]

    Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery, 2024

    Sukrut Rao et al. Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery, 2024

  64. [64]

    Unpacking large language models with conceptual consistency

    Pritish Sahu et al. Unpacking large language models with conceptual consistency. arXiv:2209.15093, 2022

  65. [65]

    C-senn: Contrastive self-explaining neural network

    Yoshihide Sawada and Keigo Nakamura. C-senn: Contrastive self-explaining neural network. arXiv:2206.09575, 2022

  66. [66]

    Concept bottleneck model with additional unsupervised concepts

    Yoshihide Sawada and Keigo Nakamura. Concept bottleneck model with additional unsupervised concepts. IEEE Access, 2022. 12 A PREPRINT - A PRIL 30, 2025

  67. [67]

    New support vector algorithms

    Bernhard Schölkopf et al. New support vector algorithms. Neural computation, 2000

  68. [68]

    Toward causal representation learning

    Bernhard Schölkopf et al. Toward causal representation learning. IEEE, 2021

  69. [69]

    Concept bottleneck models without predefined concepts

    Simon Schrodi et al. Concept bottleneck models without predefined concepts. arXiv:2407.03921, 2024

  70. [70]

    Concept embedding analysis: A review

    Gesina Schwalbe. Concept embedding analysis: A review. arXiv:2203.13909, 2022

  71. [71]

    A closer look at the intervention procedure of concept bottleneck models

    Sungbin Shin et al. A closer look at the intervention procedure of concept bottleneck models. In ICML, 2023

  72. [72]

    VLG-CBM: Training Concept Bottleneck Models with Vision-Language Guidance

    Divyansh Srivastava et al. VLG-CBM: Training Concept Bottleneck Models with Vision-Language Guidance. In NeurIPS, 2024

  73. [73]

    Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations

    Wolfgang Stammer et al. Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations. In CVPR, 2021

  74. [74]

    Learning to intervene on concept bottlenecks

    David Steinmann et al. Learning to intervene on concept bottlenecks. In ICML, 2024

  75. [75]

    Robustly disentangled causal mechanisms: Validating deep representations for interventional robustness

    Raphael Suter et al. Robustly disentangled causal mechanisms: Validating deep representations for interventional robustness. In ICML, 2019

  76. [76]

    Leveraging explanations in interactive machine learning: An overview

    Stefano Teso et al. Leveraging explanations in interactive machine learning: An overview. Frontiers in Artificial Intelligence, 2023

  77. [77]

    Stochastic concept bottleneck models

    Moritz Vandenhirtz et al. Stochastic concept bottleneck models. arXiv:2406.19272, 2024

  78. [78]

    The caltech-ucsd birds-200-2011 dataset

    Catherine Wah et al. The caltech-ucsd birds-200-2011 dataset. 2011

  79. [79]

    Leveraging sparse linear layers for debuggable deep networks

    Eric Wong et al. Leveraging sparse linear layers for debuggable deep networks. In ICML, 2021

  80. [80]

    Language in a bottle: Language model guided concept bottlenecks for interpretable image classification

    Yue Yang et al. Language in a bottle: Language model guided concept bottlenecks for interpretable image classification. In CVPR, 2023

Showing first 80 references.