A Framework for Evaluating Zero-Shot Image Generation in Concept-based Explainability
Pith reviewed 2026-05-20 05:58 UTC · model grok-4.3
The pith
Zero-shot text-to-image models can generate synthetic concept datasets for explainable AI, yet four analyses reveal they fall short of real images in faithfulness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We investigate the use of zero-shot Text-to-Image (T2I) generative models as a source of synthetic concept datasets for concept-based XAI methods. We generate concepts using predefined prompts and evaluate their faithfulness to real ones through four complementary analyses: comparing synthetic versus real concept images via concept representation similarity; evaluating their intra-similarity by comparing pairs of subsets of the same concept with progressively increasing size; evaluating their performance for downstream explanation tasks using relevant class images; and evaluating how removing a concept from tested class images affects explanations of generated concepts.
What carries the argument
Four complementary analyses that measure representation similarity, intra-similarity scaling, downstream explanation performance, and concept-removal effects between synthetic and real concept images.
If this is right
- Synthetic concepts could replace large real datasets when linking model features to class predictions.
- Explanation quality would remain consistent if the synthetic images pass the similarity and performance checks.
- Concept removal experiments would produce matching shifts in explanations for both synthetic and real data.
- Scalability gains would appear once the faithfulness gaps are closed by better generation methods.
Where Pith is reading between the lines
- The same four-analysis template could be reused to test other forms of synthetic data in interpretability research.
- Hybrid datasets that mix a small number of real images with generated ones might close the observed gaps more quickly.
- Prompt design choices could be studied as a controllable variable to improve concept fidelity in future pipelines.
Load-bearing premise
That success across the four analyses together is enough to decide whether the synthetic concepts are faithful enough to replace real ones in XAI work.
What would settle it
If removing a generated concept from class images changes the model's explanations in a clearly different way than removing the matching real concept, the synthetic data would fail the faithfulness test.
Figures
read the original abstract
Concept-based Explainable Artificial Intelligence (XAI) interprets deep learning models using human-understandable visual features (e.g., textures or object parts) by linking internal representations to class predictions, thereby bridging the gap between low-level image data and high-level semantics. A major challenge, however, is the reliance on large sets of labeled images to represent each concept, which limits scalability. In this work, we investigate the use of zero-shot Text-to-Image (T2I) generative models as a source of synthetic concept datasets for concept-based XAI methods. Specifically, we generate concepts using predefined prompts and evaluate their faithfulness to real ones through four complementary analyses: (1) comparing synthetic vs. real concept images via concept representation similarity; (2) evaluating their intra-similarity by comparing pairs of subsets of the same concept with progressively increasing size; (3) evaluating their performance for downstream explanation tasks using relevant class images; (4) evaluating how removing a concept from tested class images affects explanations of generated concepts. While current T2I generative models promise a shortcut to concept-based XAI, our study highlights challenges and raises open questions about the use of synthetic data generated by zero-shot pipelines in model analyses. The resulting dataset is available at https://github.com/DataSciencePolimi/ZeroShot-T2I-Concepts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript investigates the use of zero-shot Text-to-Image (T2I) generative models to produce synthetic concept datasets for concept-based XAI. It generates concepts via predefined prompts and evaluates faithfulness to real concepts through four complementary analyses: (1) concept representation similarity between synthetic and real images, (2) intra-similarity scaling by comparing subsets of increasing size within the same concept, (3) downstream explanation performance when using synthetic concepts on relevant class images, and (4) effects of removing a concept from tested class images on explanations of generated concepts. The work emphasizes surfacing challenges and open questions about zero-shot synthetic data pipelines rather than claiming they are ready for reliable XAI deployment, and releases the resulting dataset.
Significance. If the empirical outcomes hold, the proposed evaluation framework could help address scalability limitations in concept-based XAI by providing diagnostic tools for synthetic data quality. The release of the dataset supports reproducibility and further community investigation. The exploratory framing, which prioritizes raising open questions over definitive certification, is a measured contribution to the field.
major comments (2)
- [§4.3] §4.3 (downstream explanation performance analysis): the evaluation uses synthetic concepts for explanation tasks but does not report a direct baseline comparison against explanations derived from real concept datasets on the same class images, which is load-bearing for quantifying the practical impact of any observed performance differences.
- [§4.4] §4.4 (concept-removal effects): the analysis of how concept removal affects explanations lacks a control condition using real concepts, making it difficult to isolate whether observed changes are due to synthetic data properties or inherent to the removal procedure itself.
minor comments (3)
- [Abstract and §3] The abstract and §3 could more explicitly state the criteria or thresholds used to interpret 'challenges' from the four analyses, to avoid ambiguity in how mixed results are classified.
- [Figures 1-3] Figure captions for generated image examples should include the exact prompt templates and any post-processing steps applied to the T2I outputs.
- [Related Work] A small number of recent references on T2I model biases and concept fidelity (e.g., works from 2023-2024 on prompt engineering for semantic consistency) are absent from the related work section.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and recommendation for minor revision. The comments highlight opportunities to strengthen the quantitative interpretation of our downstream analyses, which we address below.
read point-by-point responses
-
Referee: [§4.3] §4.3 (downstream explanation performance analysis): the evaluation uses synthetic concepts for explanation tasks but does not report a direct baseline comparison against explanations derived from real concept datasets on the same class images, which is load-bearing for quantifying the practical impact of any observed performance differences.
Authors: We agree that including a direct baseline using real concept datasets on the same class images would better quantify the practical impact of any performance differences observed with synthetic concepts. While §4.1 already compares synthetic and real concept representations and §4.3 focuses on the usability of synthetic concepts for downstream tasks, we did not report this specific real-concept baseline. In the revised manuscript we will add the requested comparison to provide clearer context for the results in §4.3. revision: yes
-
Referee: [§4.4] §4.4 (concept-removal effects): the analysis of how concept removal affects explanations lacks a control condition using real concepts, making it difficult to isolate whether observed changes are due to synthetic data properties or inherent to the removal procedure itself.
Authors: We concur that a control condition with real concepts is necessary to isolate whether the observed changes stem from properties of the synthetic data or from the removal procedure. The current analysis in §4.4 evaluates the effect of concept removal on explanations generated from synthetic concepts, but does not include the parallel real-concept control. We will incorporate this control analysis in the revised version of §4.4. revision: yes
Circularity Check
No significant circularity
full rationale
The paper is an empirical evaluation study that generates synthetic concept images via zero-shot T2I prompts and applies four diagnostic analyses (representation similarity, intra-similarity scaling, downstream explanation performance, and concept-removal effects) to compare against real data. No mathematical derivations, first-principles predictions, fitted parameters renamed as outputs, or self-citation chains are present; the work explicitly frames its contribution as surfacing challenges and open questions rather than certifying faithfulness through any closed loop. All steps rely on external models and real-world benchmarks, making the evaluation self-contained against independent data.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
we generate concepts using predefined prompts and evaluate their faithfulness to real ones through four complementary analyses: (1) comparing synthetic vs. real concept images via concept representation similarity; (2) evaluating their intra-similarity...
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
RQ1Alignment of Concept Representations... RQ4Counterfactual Testing on Downstream Task
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Alejandro Barredo Arrieta, Natalia D ´ıaz-Rodr´ıguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador Garcia, Sergio Gil-Lopez, Daniel Molina, Richard Benjamins, Raja Chatila, and Francisco Herrera. Explainable artificial intelligence (xai): Concepts, taxonomies, opportu- nities and challenges toward responsible ai.Information Fu- ...
work page 2020
-
[2]
To- wards synthetic concept activation vectors via generative models
Riccardo Campi, Santiago Borrego, Antonio De Santis, Mat- teo Bianchi, Andrea Tocchetti, and Marco Brambilla. To- wards synthetic concept activation vectors via generative models. In2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 2711– 2719, 2025. 2, 3
work page 2025
- [3]
-
[4]
2 - foundational approaches to post-hoc explainability for image classifica- tion
Antonio De Santis, Riccardo Campi, Matteo Bianchi, An- drea Tocchetti, and Marco Brambilla. 2 - foundational approaches to post-hoc explainability for image classifica- tion. InBi-directionality in Human-AI Collaborative Sys- tems, pages 23–54. Academic Press, 2025. 1, 2
work page 2025
-
[5]
Imagenet: A large-scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009. 4, 5
work page 2009
-
[6]
Scaling rec- tified flow transformers for high-resolution image synthesis
Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas M ¨uller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, and Robin Rombach. Scaling rec- tified flow transformers for high-resolution image synthesis. InProceedings of the 41st International Conference on Ma- chine Learning...
work page 2024
-
[7]
Identity mappings in deep residual networks
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. InComputer Vision – ECCV 2016, pages 630–645, Cham, 2016. Springer International Publishing. 5
work page 2016
-
[8]
Lg-cav: Train any concept activation vector with lan- guage guidance
Qihan Huang, Jie Song, Mengqi Xue, Haofei Zhang, Bingde Hu, Huiqiong Wang, Hao Jiang, Xingen Wang, and Mingli Song. Lg-cav: Train any concept activation vector with lan- guage guidance. InAdvances in Neural Information Process- ing Systems, pages 39522–39551. Curran Associates, Inc.,
-
[9]
Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viegas, and Rory sayres. Inter- pretability beyond feature attribution: Quantitative testing with concept activation vectors (TCA V). InProceedings of the 35th International Conference on Machine Learning, pages 2668–2677. PMLR, 2018. 1, 2, 5
work page 2018
-
[10]
Flux.1 kontext: Flow matching for in-context image generation and editing in latent space,
Black Forest Labs, Stephen Batifol, Andreas Blattmann, Frederic Boesel, Saksham Consul, Cyril Diagne, Tim Dock- horn, Jack English, Zion English, Patrick Esser, Sumith Ku- lal, Kyle Lacey, Yam Levi, Cheng Li, Dominik Lorenz, Jonas M¨uller, Dustin Podell, Robin Rombach, Harry Saini, Axel Sauer, and Luke Smith. Flux.1 kontext: Flow matching for in-context i...
-
[11]
Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feicht- enhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. In2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11966–11976, 2022. 5
work page 2022
-
[12]
Tyler Martin and Adrian Weller.Interpretable Machine Learning. M.Phil. diss., Dept. of Engineering, University of Cambridge, 2019. 2
work page 2019
-
[13]
Text2concept: Concept activation vectors di- rectly from text
Mazda Moayeri, Keivan Rezaei, Maziar Sanjabi, and So- heil Feizi. Text2concept: Concept activation vectors di- rectly from text. InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 3744–3749, 2023. 2
work page 2023
-
[14]
Indepen- dently published, 2022
Christoph Molnar.Interpretable Machine Learning: A Guide For Making Black Box Models Explainable. Indepen- dently published, 2022. 2
work page 2022
-
[15]
Anders, Thomas Wiegand, Wo- jciech Samek, and Sebastian Lapuschkin
Frederik Pahde, Maximilian Dreyer, Moritz Weckbecker, Le- ander Weber, Christopher J. Anders, Thomas Wiegand, Wo- jciech Samek, and Sebastian Lapuschkin. Navigating neural space: Revisiting concept activation vectors to overcome di- rectional divergence. InThe Thirteenth International Con- ference on Learning Representations, 2025. 2
work page 2025
-
[16]
Learning transferable visual models from natural language supervision
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. InProceedings of the 38th International Conference on Machine Learning, pages 8748–8763. PMLR, 2021. 2, 5
work page 2021
-
[17]
Antonio De Santis, Riccardo Campi, Matteo Bianchi, and Marco Brambilla. Visual-TCA V: Concept-based Attribu- tion and Saliency Maps for Post-hoc Explainability in Image Classification, 2025. arXiv:2411.05698 [cs]. 1, 2, 5
-
[18]
Lavanya Sharan, Ruth Rosenholtz, and Edward H. Adelson. Accuracy and speed of material categorization in real-world images.Journal of Vision, 14(10), 2014. 4
work page 2014
-
[19]
Very deep convo- lutional networks for large-scale image recognition
Karen Simonyan and Andrew Zisserman. Very deep convo- lutional networks for large-scale image recognition. InIn- ternational Conference on Learning Representations, 2015. 5
work page 2015
-
[20]
Rethinking the inception ar- chitecture for computer vision
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception ar- chitecture for computer vision. In2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2818–2826, 2016. 5
work page 2016
-
[21]
Efros, Eli Shecht- man, and Oliver Wang
Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shecht- man, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 586–595, 2018. 3 9 A Framework for Evaluating Zero-Shot Image Generation in Concept-based Explainability Supplementary Material...
work page 2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.