AI Consciousness and Existential Risk
Pith reviewed 2026-05-21 18:22 UTC · model grok-4.3
The pith
Intelligence directly predicts AI existential threats while consciousness does not.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that existential risk from AI arises mainly from high intelligence, which equips a system with both the ability and potential objectives to harm humanity, whereas consciousness is not a direct contributor to that risk. The frequent conflation of the two topics stems from a confusion that treats consciousness and intelligence as the same or necessarily linked. Recognizing their independence means conscious AI need not heighten existential concerns, though consciousness could still matter indirectly by supporting alignment efforts or by enabling certain advanced capabilities.
What carries the argument
The empirical and theoretical distinction between consciousness and intelligence as independent properties.
If this is right
- AI safety efforts should monitor intelligence levels, rather than the presence of consciousness, as the main indicator of potential harm.
- Consciousness might be pursued deliberately as a route to better AI alignment that reduces overall risk.
- Capabilities that depend on consciousness could raise risk only indirectly, and only if they also produce higher intelligence.
- Policy and regulation can target capable systems without assuming that conscious ones are automatically more dangerous.
Where Pith is reading between the lines
- The same separation could clarify discussions of AI moral status or rights without automatically assuming added danger.
- Benchmarks that test consciousness apart from capability scores would help confirm whether the two remain independent in practice.
- Similar distinctions between awareness and power might apply to risk analysis in other technologies like robotics or synthetic biology.
Load-bearing premise
Consciousness and intelligence are distinct properties that can occur separately in AI systems.
What would settle it
An AI that develops consciousness without any gain in intelligence or risk potential would support the claim; conversely, an increase in existential threat tied only to consciousness while intelligence stays fixed would undermine it.
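The test described here can be sketched as a toy comparison: hold intelligence fixed, toggle consciousness, and check whether risk changes. The sketch below is illustrative only and is not taken from the paper. The `System` class, both risk functions, the squared-intelligence form, and the 0.1 increment are all hypothetical assumptions chosen to make the two outcomes visible.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class System:
    """Hypothetical AI system described by two toy attributes."""
    intelligence: float  # assumed capability score in [0, 1]
    conscious: bool      # assumed binary marker, purely illustrative


def risk_intelligence_only(s: System) -> float:
    """Toy model of the paper's claim: risk depends on intelligence alone."""
    return s.intelligence ** 2


def risk_with_direct_link(s: System) -> float:
    """Toy rival model: consciousness adds risk even at fixed intelligence."""
    return s.intelligence ** 2 + (0.1 if s.conscious else 0.0)


def direct_effect_of_consciousness(
    risk_fn: Callable[[System], float], intelligence: float
) -> float:
    """Risk difference when consciousness is toggled at fixed intelligence."""
    with_c = risk_fn(System(intelligence, True))
    without_c = risk_fn(System(intelligence, False))
    return with_c - without_c


if __name__ == "__main__":
    for fn in (risk_intelligence_only, risk_with_direct_link):
        print(fn.__name__, direct_effect_of_consciousness(fn, 0.5))
```

Under the first model the direct effect is zero at every intelligence level, which is the pattern the core claim predicts. A nonzero effect, as in the second model, is the pattern that would count against it. Neither function says anything about how consciousness or risk would be measured in a real system.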
Original abstract
In AI, the existential risk denotes the hypothetical threat posed by an artificial system that would possess both the capability and the objective, either directly or indirectly, to eradicate humanity. This issue is gaining prominence in scientific debate due to recent technical advancements and increased media coverage. In parallel, AI progress has sparked speculation and studies about the potential emergence of artificial consciousness. The two questions, AI consciousness and existential risk, are sometimes conflated, as if the former entailed the latter. Here, I explain that this view stems from a common confusion between consciousness and intelligence. Yet these two properties are empirically and theoretically distinct. Arguably, while intelligence is a direct predictor of an AI system's existential threat, consciousness is not. There are, however, certain incidental scenarios in which consciousness could influence existential risk, in either direction. Consciousness could be viewed as a means towards AI alignment, thereby lowering existential risk; or, it could be a precondition for reaching certain capabilities or levels of intelligence, and thus positively related to existential risk. Recognizing these distinctions can help AI safety researchers and public policymakers focus on the most pressing issues.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that AI existential risk and artificial consciousness are often conflated due to a confusion between consciousness and intelligence. It asserts that these properties are empirically and theoretically distinct, with intelligence serving as a direct predictor of an AI system's potential to pose an existential threat while consciousness does not, although incidental pathways exist in which consciousness could either reduce risk (e.g., via improved alignment) or increase it (e.g., as a precondition for advanced capabilities).
Significance. If the core distinction holds, the clarification could usefully redirect AI safety discussions and policy attention toward capability control and objective alignment rather than consciousness per se. The manuscript draws on standard philosophical separations and explicitly carves out bidirectional incidental links, which is a modest but constructive contribution for a short conceptual piece; however, its overall significance remains limited by the absence of new evidence, formalization, or testable implications.
major comments (1)
- [Abstract] Abstract and opening paragraphs: the claim that consciousness and intelligence 'are empirically and theoretically distinct' is asserted without supporting citations, examples, or argument for the empirical half of the distinction. This is load-bearing for the central conclusion that consciousness is not a direct predictor of existential threat, because the separation itself is what decouples the two relations to risk.
minor comments (2)
- The incidental scenarios (consciousness aiding alignment or serving as a precondition for intelligence) are mentioned but not illustrated with even a brief hypothetical example; adding one would improve clarity without lengthening the paper substantially.
- The manuscript would benefit from one or two key references to the philosophical literature on the consciousness-intelligence distinction (e.g., work separating phenomenal consciousness from functional intelligence) to anchor the 'theoretically distinct' claim.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and for recognizing the paper's potential to help redirect AI safety discussions toward capability control and alignment. We address the single major comment below.
- Referee: [Abstract] Abstract and opening paragraphs: the claim that consciousness and intelligence 'are empirically and theoretically distinct' is asserted without supporting citations, examples, or argument for the empirical half of the distinction. This is load-bearing for the central conclusion that consciousness is not a direct predictor of existential threat, because the separation itself is what decouples the two relations to risk.
  Authors: We agree that the manuscript asserts the distinction without citations or examples in the abstract and opening paragraphs, and that this claim is central to decoupling consciousness from direct existential risk. While the theoretical separation follows from established philosophy of mind (intelligence as functional/computational capacity versus consciousness as phenomenal experience), we accept that the empirical half requires explicit support. In revision we will add a short explanatory clause plus citations to relevant literature (e.g., Chalmers on philosophical zombies and standard treatments of access versus phenomenal consciousness) to illustrate possible dissociation, such as the fact that current high-performing AI systems exhibit intelligence without evidence of consciousness. This addition will strengthen rather than alter the core argument.
  Revision: yes
Circularity Check
No significant circularity
The paper is a short conceptual clarification that distinguishes consciousness from intelligence to separate their relations to existential risk. It asserts the distinction as both empirical and theoretical, notes that intelligence tracks capability for harm while consciousness does not directly, and explicitly carves out incidental pathways in both directions. No quantitative claims, formal derivations, equations, or empirical tests are offered; the argument rests on standard philosophical separation rather than any contested technical premise that could be internally falsified or reduced to self-referential inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Consciousness and intelligence are empirically and theoretically distinct.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "while intelligence is a direct predictor of an AI system's existential threat, consciousness is not"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.