Unexplainability and Incomprehensibility of Artificial Intelligence
Pith reviewed 2026-05-25 18:59 UTC · model grok-4.3
The pith
Advanced AIs cannot accurately explain some of their decisions, and humans will not understand some of the explanations they can provide.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that advanced AIs would not be able to accurately explain some of their decisions and that, for the decisions they could explain, people would not understand some of those explanations. These two results, labeled Unexplainability and Incomprehensibility, are presented as impossibility results that together rule out complete transparency between advanced AI and human users.
What carries the argument
The pair of complementary impossibility results, Unexplainability and Incomprehensibility, which establish limits on an AI's capacity to explain its decisions and on humans' capacity to comprehend those explanations.
If this is right
- Requirements for explainable AI in safety-critical domains cannot be satisfied in full.
- Security and safety analysis of advanced AI systems will contain unavoidable gaps from unexplained decisions.
- User requests to understand decisions that affect them cannot always be met.
- Regulatory standards demanding complete explainability for advanced AI will encounter fundamental barriers.
Where Pith is reading between the lines
- Design priorities for future AI may shift away from explanation toward other verification techniques such as empirical testing.
- Similar limits on explanation could apply to other complex decision systems, including expert human judgment.
- Focus on post-hoc auditing methods rather than built-in explanations may become necessary.
Load-bearing premise
Advanced AI possesses decision processes whose full explanation exceeds both the AI's explanatory capacity and human comprehension limits.
What would settle it
Construction of an advanced AI that supplies an accurate explanation for every decision it makes, with humans fully comprehending every explanation supplied.
Original abstract
Explainability and comprehensibility of AI are important requirements for intelligent systems deployed in real-world domains. Users want and frequently need to understand how decisions impacting them are made. Similarly it is important to understand how an intelligent system functions for safety and security reasons. In this paper, we describe two complementary impossibility results (Unexplainability and Incomprehensibility), essentially showing that advanced AIs would not be able to accurately explain some of their decisions and for the decisions they could explain people would not understand some of those explanations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents two complementary impossibility results for advanced AI: unexplainability, asserting that such systems cannot accurately explain some of their decisions because decision-process complexity exceeds the AI's explanatory capacity, and incomprehensibility, asserting that humans cannot understand some explanations even when the AI can provide them. The claims rest on informal arguments linking complexity to these limits without formal models.
Significance. If the results were established via rigorous formalization, they would highlight conceptual barriers to explainable AI in high-stakes domains and inform discussions on transparency requirements. The paper usefully flags that complexity can outstrip both self-explanation and human comprehension, but the absence of derivations or models means the contribution remains at the level of known interpretability challenges rather than new impossibility theorems.
major comments (2)
- [Abstract] The unexplainability claim that 'advanced AIs would not be able to accurately explain some of their decisions' is asserted without a formal model of explanation (e.g., via logical entailment, information-theoretic fidelity, or counterfactuals) or a complexity threshold, so the inference from 'high complexity' to 'impossible to explain accurately' is not derived and is load-bearing for the central result.
- [Abstract] The incomprehensibility claim similarly lacks a precise definition of 'understand' or a model showing why AI-provided explanations must exceed human limits for some decisions; without this, the result reduces to the observation that some systems are hard to interpret, which does not establish impossibility and is load-bearing for the complementary claim.
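To make concrete what an explicit "complexity threshold" could look like, here is a minimal sketch of a counting (pigeonhole) bound. It is an illustration only and not a model taken from the paper: it assumes decisions are Boolean functions over n binary inputs, that an explanation is a bit string no longer than a fixed budget, and that an accurate explanation must identify the function uniquely. All function names are hypothetical.

```python
from fractions import Fraction


def num_decision_functions(n_inputs: int) -> int:
    """Count Boolean decision functions on n_inputs binary features."""
    return 2 ** (2 ** n_inputs)


def num_explanations(budget_bits: int) -> int:
    """Count distinct bit strings of length 0..budget_bits (inclusive)."""
    return 2 ** (budget_bits + 1) - 1


def max_exact_coverage(n_inputs: int, budget_bits: int) -> Fraction:
    """Upper bound on the fraction of decision functions that can each
    receive a unique explanation within the budget (pigeonhole bound)."""
    total = num_decision_functions(n_inputs)
    return min(Fraction(1), Fraction(num_explanations(budget_bits), total))


def min_budget_bits(n_inputs: int) -> int:
    """Smallest budget at which every function could get its own explanation."""
    total = num_decision_functions(n_inputs)
    budget = 0
    while num_explanations(budget) < total:
        budget += 1
    return budget


if __name__ == "__main__":
    # Assumed toy setting: a fixed 8-bit explanation budget.
    for n in (2, 3, 4):
        print(n, min_budget_bits(n), float(max_exact_coverage(n, 8)))
```

Under these assumptions the required budget doubles with each added input, so any fixed budget, whether set by the explainer or by the reader, eventually leaves some functions without a unique explanation. Whether real explanations must identify a decision process uniquely is exactly the kind of definitional choice the referee asks the authors to state.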
minor comments (1)
- The abstract could more explicitly separate the two results and their distinct premises to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the need for greater precision regarding the formal status of our arguments. We respond to each major comment below and indicate planned revisions.
Point-by-point responses
- Referee: [Abstract] The unexplainability claim that 'advanced AIs would not be able to accurately explain some of their decisions' is asserted without a formal model of explanation (e.g., via logical entailment, information-theoretic fidelity, or counterfactuals) or a complexity threshold, so the inference from 'high complexity' to 'impossible to explain accurately' is not derived and is load-bearing for the central result.
  Authors: The manuscript frames unexplainability as a conceptual impossibility result arising from the mismatch between the complexity of advanced AI decision processes and the capacity of any self-generated explanation. We acknowledge that the link is informal rather than derived from a specific formal model of explanation or an explicit complexity threshold. The contribution is intended as a high-level argument connecting complexity considerations to XAI requirements rather than a mathematical theorem. We will revise the abstract and introduction to explicitly characterize the argument as conceptual and informal.
  Revision: partial
- Referee: [Abstract] The incomprehensibility claim similarly lacks a precise definition of 'understand' or a model showing why AI-provided explanations must exceed human limits for some decisions; without this, the result reduces to the observation that some systems are hard to interpret, which does not establish impossibility and is load-bearing for the complementary claim.
  Authors: We agree that the incomprehensibility argument similarly rests on an informal connection between explanation complexity and human cognitive limits without a formal model of understanding. The paper presents this as a complementary conceptual limit rather than a formally derived impossibility. We will revise the abstract to clarify the informal and conceptual character of both results so that readers do not interpret them as formal theorems.
  Revision: partial
Circularity Check
No significant circularity detected
full rationale
The paper presents two complementary impossibility results as philosophical assertions about advanced AI decision processes exceeding explanatory capacity and human comprehension. No equations, parameter fitting, self-citation load-bearing premises, uniqueness theorems, or ansatzes are described in the provided abstract or structure. The claims rest on informal reasoning about complexity thresholds rather than any derivation chain that reduces outputs to inputs by construction. This matches the default expectation for non-circular papers; the absence of formal models or derivations means no load-bearing steps can be exhibited as self-referential per the enumerated patterns.
Forward citations
Cited by 1 Pith paper
- Interpretable and Explainable Surrogate Modeling for Simulations: A State-of-the-Art Survey and Perspectives on Explainable AI for Decision-Making
  This survey synthesizes XAI methods with surrogate modeling workflows for simulations and outlines a research agenda to embed explainability into simulation-driven design and decision-making.