On the Semantic Interpretability of Artificial Intelligence Models

André Freitas; Siegfried Handschuh; Vivian S. Silva

REVIEW 2 major objections 2 minor 14 references

AI models are classified by their nature and how they embed interpretability features to reveal gaps in human-centered explanations.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.

T0 review · grok-4.3

2026-05-25 00:33 UTC pith:WI3V5ENE

load-bearing objection This is a survey that organizes interpretability work across ML, semantics and fuzzy logic but introduces no new methods or results.

arxiv 1907.04105 v1 pith:WI3V5ENE submitted 2019-07-09 cs.AI cs.CL

classification cs.AI cs.CL

keywords semantic interpretability · AI models · model classification · distributional semantics · fuzzy logic · human-centered explanations · explainable AI

verification ladder T0 review T1 audit T2 compute T3 formal T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reviews semantic interpretability across multiple AI fields, including machine learning, distributional semantics, and fuzzy logic. It groups models first by their basic character and second by the specific mechanisms they use to make predictions understandable. The review then examines the practical effects of each group on end users and identifies shortcomings that block more intuitive, human-aligned solutions. A reader would care because AI systems now support or replace human decisions, so explanations must match how people actually reason and justify choices.

Core claim

We examine and classify the models according to their nature and also based on how they introduce interpretability features, analyzing how each approach affects the final users and pointing to gaps that still need to be addressed to provide more human-centered interpretability solutions.

What carries the argument

Dual classification of models by foundational nature and by the type of interpretability features they add.

Load-bearing premise

The chosen categories capture the main models without leaving out important approaches or applying biased groupings.

What would settle it

A significant AI model that fits none of the defined categories, or clear evidence that one of the identified gaps has already been closed by work outside the surveyed scope.

If this is right

Each model type produces distinct effects on how well users can follow its reasoning.
Gaps remain in delivering explanations that align with ordinary human understanding.
Addressing the gaps would improve trust when AI assists or replaces human decisions.
Considering fields beyond machine learning uncovers limitations hidden in narrower surveys.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The classification could serve as a template for building new hybrid models that combine interpretability strengths from different fields.
Linking the gaps to findings from cognitive science might sharpen definitions of what counts as human-centered.
Re-applying the same lens to rapidly evolving AI systems could track whether the gaps narrow or widen over time.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

This is a survey that organizes interpretability work across ML, semantics and fuzzy logic but introduces no new methods or results.

The paper surveys semantic interpretability by pulling in models from machine learning, distributional semantics, fuzzy logic and related areas. It sorts them by model nature and by the way they add interpretability features, then comments on how those choices affect end users and lists open gaps for more human-centered solutions. That cross-field scope is the main thing it contributes; most interpretability discussions stay inside one sub-area, so the broader map can help readers see connections they might miss otherwise. The classification itself is straightforward and the user-impact discussion is direct. The soft spot is the lack of any stated search method, coverage numbers or check on the taxonomy, so the gaps it flags rest on whatever papers the authors chose to include. Without that, it is hard to know how complete or biased the picture is. This is the kind of overview that could be handy for a reader who wants a quick entry point into interpretability outside core ML, but it will not change how anyone builds or evaluates a model. A serious editor could send it for review as a survey if the full classification turns out to be coherent and the gaps are stated precisely; otherwise it is the sort of paper that gets desk-rejected for adding little beyond organization.

Referee Report

2 major / 2 minor

Summary. The paper surveys semantic interpretability in AI models across fields including machine learning, distributional semantics, and fuzzy logic. It classifies models by their nature and by the mechanisms used to introduce interpretability features, analyzes effects on end users, and identifies gaps that must be addressed to achieve more human-centered interpretability solutions.

Significance. If the taxonomy is reproducible and the gap analysis is grounded in a defensible selection of literature, the work could help bridge subfields and orient future research on interpretability. The explicit extension beyond ML is a potential contribution, but its value hinges on the rigor of the classification and the justification for the identified gaps.

major comments (2)

[Abstract and classification sections] The central claim—that the classification comprehensively identifies gaps—depends on the survey methodology, yet no section describes a systematic search protocol, inclusion/exclusion criteria, or coverage metrics for the models examined.
[Classification and gap analysis sections] The classification scheme is presented as a key contribution, but no section provides validation (e.g., inter-annotator agreement, sensitivity analysis of groupings, or explicit handling of borderline cases), leaving the taxonomy open to subjective bias that directly affects the gap-identification claim.
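
For readers unfamiliar with the validation mentioned in the second comment, inter-annotator agreement on a taxonomy is commonly summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The following is a minimal sketch, not something taken from the paper: the two annotators, the category names, and the label assignments are all hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same category by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal category rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical category for everything.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical placements of four models into two made-up categories.
annotator_a = ["intrinsic", "intrinsic", "post-hoc", "post-hoc"]
annotator_b = ["intrinsic", "post-hoc", "post-hoc", "post-hoc"]
print(f"kappa = {cohen_kappa(annotator_a, annotator_b):.2f}")  # kappa = 0.50
```

In this toy case the annotators agree on three of four items (0.75), chance agreement is 0.5, and kappa is therefore 0.5. Whether such a measure suits a conceptual taxonomy at all is a separate question, which the simulated rebuttal below takes up.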

minor comments (2)

[Classification sections] Add concrete examples or a table summarizing representative models per category to make the distinctions between classification dimensions clearer to readers.
[User impact analysis] Ensure the discussion of user impact includes references to empirical studies on human-AI interaction rather than remaining at a high level.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address the two major points below and outline revisions to improve methodological transparency while preserving the interdisciplinary scope of the survey.

Referee: [Abstract and classification sections] The central claim—that the classification comprehensively identifies gaps—depends on the survey methodology, yet no section describes a systematic search protocol, inclusion/exclusion criteria, or coverage metrics for the models examined.

Authors: We agree that an explicit account of the literature selection process would strengthen the justification for the identified gaps. The original manuscript drew on representative works across machine learning, distributional semantics, fuzzy logic and related areas to emphasize cross-field connections, but did not include a formal protocol. In revision we will add a short “Survey Methodology” subsection describing the main sources consulted (key venues and databases), the inclusion focus on semantic interpretability rather than purely technical XAI techniques, and a qualitative coverage statement. This addition will directly support the gap-analysis claims without altering the paper’s scope.

revision: yes

Referee: [Classification and gap analysis sections] The classification scheme is presented as a key contribution, but no section provides validation (e.g., inter-annotator agreement, sensitivity analysis of groupings, or explicit handling of borderline cases), leaving the taxonomy open to subjective bias that directly affects the gap-identification claim.

Authors: The taxonomy was constructed by the authors through iterative examination of model properties (nature and interpretability-introduction mechanism). We accept that greater transparency is needed. The revised version will (i) expand the criteria description with concrete examples, (ii) discuss several borderline cases and the rationale for their placement, and (iii) report a sensitivity check by re-grouping a representative subset of models and noting effects on the gap list. Inter-annotator agreement is not applicable, as the taxonomy is a conceptual synthesis rather than an annotation task performed by independent coders; we will state this explicitly. These changes constitute a partial but substantive response to the concern.

revision: partial

Circularity Check

0 steps flagged

No significant circularity

The paper is a descriptive survey that classifies AI interpretability models by nature and mechanism, analyzes user impact, and identifies gaps. It contains no equations, derivations, fitted parameters, predictions, or load-bearing self-citations. The classification is presented as an organizational framework rather than a derived result, so no step reduces to its inputs by construction. This matches the expected finding for non-derivational survey work.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review paper; no free parameters, axioms, or invented entities are introduced or relied upon beyond standard background knowledge in AI.

pith-pipeline@v0.9.0 · 5669 in / 992 out tokens · 20444 ms · 2026-05-25T00:33:23.635174+00:00 · methodology

Original abstract

Artificial Intelligence models are becoming increasingly more powerful and accurate, supporting or even replacing humans' decision making. But with increased power and accuracy also comes higher complexity, making it hard for users to understand how the model works and what the reasons behind its predictions are. Humans must explain and justify their decisions, and so do the AI models supporting them in this process, making semantic interpretability an emerging field of study. In this work, we look at interpretability from a broader point of view, going beyond the machine learning scope and covering different AI fields such as distributional semantics and fuzzy logic, among others. We examine and classify the models according to their nature and also based on how they introduce interpretability features, analyzing how each approach affects the final users and pointing to gaps that still need to be addressed to provide more human-centered interpretability solutions.

Figures

Figures reproduced from arXiv: 1907.04105 by André Freitas, Siegfried Handschuh, Vivian S. Silva.

**Figure 1.** A decision set (left) and a decision list (right) learned from a diagnosis dataset, as provided by Lakkaraju et

**Figure 2.** Examples of post-hoc explanations: (a) ABCD report (Lloyd et al., 2014); (b) LIME prediction explanation

**Figure 3.** A generic AI system architecture and the points where interpretability features can be inserted.

**Figure 4.** The four quadrants of AI models’ interpretability.
