AI Consciousness and Existential Risk

Rufin VanRullen

arxiv: 2511.19115 · v2 · pith:FZ4HBSP6new · submitted 2025-11-24 · 💻 cs.AI · cs.CY

Pith reviewed 2026-05-21 18:22 UTC · model grok-4.3

keywords AI consciousness · existential risk · AI intelligence · AI safety · AI alignment · consciousness distinction

The pith

Intelligence directly predicts AI existential threats while consciousness does not.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to untangle why people link AI consciousness with existential risk. It shows that this connection rests on mixing up consciousness with intelligence, two properties that are distinct in both theory and evidence. Intelligence increases an AI's capability to cause widespread harm, but consciousness on its own adds no such direct threat. This separation matters because it lets safety work focus on actual capability risks instead of sentience fears. Some side connections remain possible, such as consciousness aiding alignment or serving as a step toward greater intelligence.

Core claim

The central claim is that existential risk from AI arises mainly from high intelligence, which equips a system with both the ability and potential objectives to harm humanity, whereas consciousness is not a direct contributor to that risk. The frequent conflation of the two topics stems from a confusion that treats consciousness and intelligence as the same or necessarily linked. Recognizing their independence means conscious AI need not heighten existential concerns, though consciousness could still matter indirectly by supporting alignment efforts or by enabling certain advanced capabilities.

What carries the argument

The empirical and theoretical distinction between consciousness and intelligence as independent properties.

If this is right

AI safety efforts should monitor intelligence levels as the main indicator of potential harm rather than presence of consciousness.
Consciousness might be pursued deliberately as a route to better AI alignment that reduces overall risk.
Any capabilities that depend on consciousness could raise risk indirectly only if they also produce higher intelligence.
Policy and regulation can target capable systems without assuming that conscious ones are automatically more dangerous.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same separation could clarify discussions of AI moral status or rights without automatically assuming added danger.
Benchmarks that test consciousness apart from capability scores would help confirm whether the two remain independent in practice.
Similar distinctions between awareness and power might apply to risk analysis in other technologies like robotics or synthetic biology.

Load-bearing premise

Consciousness and intelligence are distinct properties that can occur separately in AI systems.

What would settle it

An AI that develops consciousness without any gain in intelligence or risk potential, or conversely an increase in existential threat tied only to consciousness while intelligence stays fixed.

Figures

Figures reproduced from arXiv: 2511.19115 by Rufin VanRullen.

**Figure 2.** Secondhand x-risk from AI consciousness. Two specific scenarios [PITH_FULL_IMAGE:figures/full_fig_p007_2.png]

Original abstract

In AI, the existential risk denotes the hypothetical threat posed by an artificial system that would possess both the capability and the objective, either directly or indirectly, to eradicate humanity. This issue is gaining prominence in scientific debate due to recent technical advancements and increased media coverage. In parallel, AI progress has sparked speculation and studies about the potential emergence of artificial consciousness. The two questions, AI consciousness and existential risk, are sometimes conflated, as if the former entailed the latter. Here, I explain that this view stems from a common confusion between consciousness and intelligence. Yet these two properties are empirically and theoretically distinct. Arguably, while intelligence is a direct predictor of an AI system's existential threat, consciousness is not. There are, however, certain incidental scenarios in which consciousness could influence existential risk, in either direction. Consciousness could be viewed as a means towards AI alignment, thereby lowering existential risk; or, it could be a precondition for reaching certain capabilities or levels of intelligence, and thus positively related to existential risk. Recognizing these distinctions can help AI safety researchers and public policymakers focus on the most pressing issues.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Short conceptual note separating consciousness from intelligence in AI risk talks, but mostly restates established distinctions without new substance.

The main thing to know is that this paper argues existential risk from AI tracks intelligence and capabilities, not consciousness, and that conflating the two is a mistake. It lays out the distinction cleanly and notes a couple of incidental pathways where consciousness could still matter, either by supporting alignment or by serving as a stepping stone to advanced capabilities. That part is fair and avoids the usual oversimplification. The writing stays direct and keeps the focus on practical implications for safety research and policy.

What the paper does well is acknowledge those indirect links instead of claiming total independence. On the soft spots, the core separation between consciousness and intelligence is already standard in philosophy of mind and cognitive science, so the application here does not introduce new results or frameworks. The argument stays purely conceptual with no data, derivations, or testable claims, which limits how much weight it can carry. The incidental scenarios are mentioned but not developed.

This piece is mainly for readers who are new to the topic or who encounter media coverage that mixes sentience with danger. Someone already working in AI safety or familiar with the relevant literature will not find much that is fresh. It shows clear enough thinking on its own terms, but the low novelty and lack of technical content mean it does not need a full referee process. A quick editorial check for clarity would be enough; I would not send it out for serious peer review.

Referee Report

1 major / 2 minor

Summary. The paper argues that AI existential risk and artificial consciousness are often conflated due to a confusion between consciousness and intelligence. It asserts that these properties are empirically and theoretically distinct, with intelligence serving as a direct predictor of an AI system's potential to pose an existential threat while consciousness does not, although incidental pathways exist in which consciousness could either reduce risk (e.g., via improved alignment) or increase it (e.g., as a precondition for advanced capabilities).

Significance. If the core distinction holds, the clarification could usefully redirect AI safety discussions and policy attention toward capability control and objective alignment rather than consciousness per se. The manuscript draws on standard philosophical separations and explicitly carves out bidirectional incidental links, which is a modest but constructive contribution for a short conceptual piece; however, its overall significance remains limited by the absence of new evidence, formalization, or testable implications.

major comments (1)

[Abstract] Abstract and opening paragraphs: the claim that consciousness and intelligence 'are empirically and theoretically distinct' is asserted without supporting citations, examples, or argument for the empirical half of the distinction. This is load-bearing for the central conclusion that consciousness is not a direct predictor of existential threat, because the separation itself is what decouples the two relations to risk.

minor comments (2)

The incidental scenarios (consciousness aiding alignment or serving as a precondition for intelligence) are mentioned but not illustrated with even a brief hypothetical example; adding one would improve clarity without lengthening the paper substantially.
The manuscript would benefit from one or two key references to the philosophical literature on the consciousness-intelligence distinction (e.g., work separating phenomenal consciousness from functional intelligence) to anchor the 'theoretically distinct' claim.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review and for recognizing the paper's potential to help redirect AI safety discussions toward capability control and alignment. We address the single major comment below.

Referee: [Abstract] Abstract and opening paragraphs: the claim that consciousness and intelligence 'are empirically and theoretically distinct' is asserted without supporting citations, examples, or argument for the empirical half of the distinction. This is load-bearing for the central conclusion that consciousness is not a direct predictor of existential threat, because the separation itself is what decouples the two relations to risk.

Authors: We agree that the manuscript asserts the distinction without citations or examples in the abstract and opening paragraphs, and that this claim is central to decoupling consciousness from direct existential risk. While the theoretical separation follows from established philosophy of mind (intelligence as functional/computational capacity versus consciousness as phenomenal experience), we accept that the empirical half requires explicit support. In revision we will add a short explanatory clause plus citations to relevant literature (e.g., Chalmers on philosophical zombies and standard treatments of access versus phenomenal consciousness) to illustrate possible dissociation, such as the fact that current high-performing AI systems exhibit intelligence without evidence of consciousness. This addition will strengthen rather than alter the core argument.

revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper is a short conceptual clarification that distinguishes consciousness from intelligence to separate their relations to existential risk. It asserts the distinction as both empirical and theoretical, notes that intelligence tracks capability for harm while consciousness does not directly, and explicitly carves out incidental pathways in both directions. No quantitative claims, formal derivations, equations, or empirical tests are offered; the argument rests on standard philosophical separation rather than any contested technical premise that could be internally falsified or reduced to self-referential inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the unproven premise that consciousness and intelligence are distinct, with no independent evidence or formal support provided in the abstract.

axioms (1)

Domain assumption: Consciousness and intelligence are empirically and theoretically distinct.
Invoked directly in the abstract as the basis for separating the two issues.

pith-pipeline@v0.9.0 · 5709 in / 1013 out tokens · 27660 ms · 2026-05-21T18:22:12.302999+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "while intelligence is a direct predictor of an AI system's existential threat, consciousness is not"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

102 extracted references · 102 canonical work pages · 3 internal anchors

[1] Goertzel, B. Artificial general intelligence: Concept, state of the art, and future prospects. Journal of Artificial General Intelligence 5, 1 (2014).
[2] Bostrom, N. Superintelligence: Paths, dangers, strategies (Oxford University Press, 2014).
[3] Amodei, D. Machines of loving grace. 2024. https://www.darioamodei.com/essay/machines-of-loving-grace
[4] Hendrycks, D., Mazeika, M. & Woodside, T. An overview of catastrophic AI risks. arXiv preprint arXiv:2306.12001 (2023).
[5] Bengio, Y. & signatories. Pause Giant AI Experiments: An Open Letter. 2023. https://futureoflife.org/open-letter/pause-giant-ai-experiments/
[6] Kokotajlo, D., Alexander, S., Larsen, T., Lifland, E. & Dean, R. AI 2027. Tech. rep. (AI Futures Project, 2025). https://ai-2027.com/
[7] Russell, S. Human compatible: AI and the problem of control (Penguin UK, 2019).
[8] Bengio, Y. et al. Managing extreme AI risks amid rapid progress. Science 384, 842–845 (2024).
[9] Häggström, O. Advanced AI and the ethics of risking everything (2025).
[10] Yudkowsky, E. & Soares, N. If Anyone Builds It, Everyone Dies (Little, Brown and Company, 2025).
[11] Critch, A. & Russell, S. TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI. 2023. arXiv:2306.06924 [cs.AI]. https://arxiv.org/abs/2306.06924
[12] Russell, S. J. & Norvig, P. Artificial intelligence: A modern approach (4th ed.) (Pearson, 2021).
[13] Nagel, T. What is it like to be a bat? The Philosophical Review 83, 435–50 (1974).
[14] Block, N. On a confusion about a function of consciousness. Behavioral and brain sciences 18, 227–247 (1995).
[15] Chalmers, D. J. Facing up to the problem of consciousness. Journal of consciousness studies 2, 200–219 (1995).
[16] Dehaene, S., Lau, H. & Kouider, S. What is consciousness, and could machines have it? Science 358, 486–492 (2017).
[17] Lindsey, J. Emergent Introspective Awareness in Large Language Models. Transformer Circuits Thread. https://transformer-circuits.pub/2025/introspection/index.html (2025).
[18] Berg, C., de Lucena, D. & Rosenblatt, J. Large Language Models Report Subjective Experience Under Self-Referential Processing. arXiv preprint arXiv:2510.24797 (2025).
[19] Metzinger, T. Being no one: The self-model theory of subjectivity (MIT Press, 2004).
[20] Legrand, D. Pre-reflective self-as-subject from experiential and empirical perspectives. Consciousness and cognition 16, 583–599 (2007).
[21] Seth, A. K. & Bayne, T. Theories of consciousness. Nature reviews neuroscience 23, 439–452 (2022).
[22] Koch, C. The feeling of life itself: Why consciousness is widespread but can’t be computed (MIT Press, 2019).
[23] Tononi, G. & Raison, C. Artificial intelligence, consciousness and psychiatry. World Psychiatry 23, 309 (2024).
[24] Findlay, G. et al. Dissociating artificial intelligence from artificial consciousness. arXiv preprint arXiv:2412.04571 (2024).
[25] Seth, A. K. Conscious artificial intelligence and biological naturalism. Behavioral and Brain Sciences, 1–42 (2024).
[26] Baars, B. J. A cognitive theory of consciousness (Cambridge University Press, 1993).
[27] Rosenthal, D. M. Higher-order thoughts and the appendage theory of consciousness. Philosophical Psychology 6, 155–166 (1993).
[28] Pennartz, C. M. Consciousness, representation, action: the importance of being goal-directed. Trends in cognitive sciences 22, 137–153 (2018).
[29] Graziano, M. S. & Webb, T. W. The attention schema theory: a mechanistic account of subjective awareness. Frontiers in psychology 6, 500 (2015).
[30] Pennartz, C. M., Farisco, M. & Evers, K. Indicators and criteria of consciousness in animals and intelligent machines: an inside-out approach. Frontiers in systems neuroscience 13, 25 (2019).
[31] Blum, M. & Blum, L. A theoretical computer science perspective on consciousness. Journal of Artificial Intelligence and Consciousness 8, 1–42 (2021).
[32] Blum, L. & Blum, M. A theory of consciousness from a theoretical computer science perspective: Insights from the Conscious Turing Machine. Proceedings of the National Academy of Sciences 119, e2115934119 (2022).
[33] Blum, L. & Blum, M. A theoretical computer science perspective on consciousness and artificial general intelligence. Engineering 25, 12–16 (2023).
[34] Butlin, P. et al. Consciousness in Artificial Intelligence: Insights from the Science of Consciousness. 2023. arXiv:2308.08708 [cs.AI]. https://arxiv.org/abs/2308.08708
[35] Aru, J., Larkum, M. E. & Shine, J. M. The feasibility of artificial consciousness through the lens of neuroscience. Trends in neurosciences 46, 1008–1017 (2023).
[36] Farisco, M., Evers, K. & Changeux, J.-P. Is artificial consciousness achievable? Lessons from the human brain. Neural Networks 180, 106714 (2024).
[37] Evers, K. et al. Preliminaries to artificial consciousness: a multidimensional heuristic approach. Physics of Life Reviews (2025).
[38] Shanahan, M. Palatable conceptions of disembodied being: Terra incognita in the space of possible minds. arXiv preprint arXiv:2503.16348 (2025).
[39] Birch, J. AI consciousness: A centrist manifesto. Phil. papers preprint. https://philpapers.org/rec/BIRACA-4 (2025).
[40] Schneider, S., Sahner, D., Kuhn, R. L., Schwitzgebel, E. & Bailey, M. Is AI Conscious? A Primer on the Myths and Confusions Driving the Debate. Philosophy and Mind Sciences (forthcoming).
[41] Schwitzgebel, E. AI and Consciousness. arXiv preprint arXiv:2510.09858 (2025).
[42] Butlin, P. et al. Identifying indicators of consciousness in AI systems. Trends in Cognitive Sciences. ISSN: 1364-6613. https://www.sciencedirect.com/science/article/pii/S1364661325002864 (2025).
[43] McClelland, T. Agnosticism About Artificial Consciousness. arXiv preprint arXiv:2412.13145 (2024).
[44] Hinton, G. Digital Intelligence vs. Biological Intelligence. Royal Institution Discourse, May 30, 2025. https://www.youtube.com/watch?v=IkdziSLYzHw
[45] Legg, S., Hutter, M., et al. A collection of definitions of intelligence. Frontiers in Artificial Intelligence and applications 157, 17 (2007).
[46] Dreksler, N. et al. Subjective experience in AI systems: what do AI researchers and the public believe? arXiv preprint arXiv:2506.11945 (2025).
[47] Seth, A. Why conscious AI is a bad, bad idea. Nautilus. https://nautil.us/why-conscious-ai-is-a-bad-bad-idea-302937/ (2023).
[48] Aschenbrenner, L. Situational awareness: the decade ahead. 2024. https://situational-awareness.ai/
[49] Ilievski, F. et al. Aligning generalization between humans and machines. Nature Machine Intelligence, 1–12 (2025).
[50] Shojaee*, P. et al. The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity in NeurIPS (2025). https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf
[51] Eriksson, M. et al. Can we trust AI benchmarks? An interdisciplinary review of current issues in AI evaluation. arXiv preprint arXiv:2502.06559 (2025).
[52] Cheng, Z. et al. Benchmarking is Broken-Don’t Let AI be its Own Judge. arXiv preprint arXiv:2510.07575 (2025).
[53] Harnad, S. The symbol grounding problem. Physica D: Nonlinear Phenomena 42, 335–346 (1990).
[54] Gibson, J. The ecological approach to visual perception. 1979.
[55] Jacob, P. Intentionality. Stanford Encyclopedia of Philosophy (2003).
[56] O’Regan, J. K. & Noë, A. A sensorimotor account of vision and visual consciousness. Behavioral and brain sciences 24, 939–973 (2001).
[57] Chalmers, D. J. Does thought require sensory grounding? From pure thinkers to large language models. arXiv preprint arXiv:2408.09605 (2024).
[58] Harris, Z. S. Distributional structure. Word 10, 146–162 (1954).
[59] Manning, C. D. Human language understanding & reasoning. Daedalus 151, 127–138 (2022).
[60] Huh, M., Cheung, B., Wang, T. & Isola, P. The platonic representation hypothesis. arXiv preprint arXiv:2405.07987 (2024).
[61] Häggström, O. Are Large Language Models Intelligent? Are Humans? in Computer Sciences & Mathematics Forum 8 (2023), 68.
[62] Pavlick, E. Symbols and grounding in large language models. Philosophical Transactions of the Royal Society A 381, 20220041 (2023).
[63] Bender, E. M. & Koller, A. Climbing towards NLU: On meaning, form, and understanding in the age of data in Proceedings of the 58th annual meeting of the association for computational linguistics (2020), 5185–5198.
[64] Haikonen, P. O. On artificial intelligence and consciousness. Journal of Artificial Intelligence and Consciousness 7, 73–82 (2020).
[65] Bender, E. M., Gebru, T., McMillan-Major, A. & Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? in Proceedings of the 2021 ACM conference on fairness, accountability, and transparency (2021), 610–623.
[66] Lake, B. M. & Murphy, G. L. Word meaning in minds and machines. Psychological review 130, 401 (2023).
[67] Baroni, M. Grounding distributional semantics in the visual world. Language and Linguistics Compass 10, 3–13 (2016).
[68] Zador, A. et al. Catalyzing next-generation artificial intelligence through NeuroAI. Nature communications 14, 1597 (2023).
[69] Bengio, Y. Reasoning through arguments against taking AI safety seriously. 2024. https://yoshuabengio.org/2024/07/09/reasoning-through-arguments-against-taking-ai-safety-seriously/
[70] Metzinger, T. Benevolent artificial anti-natalism (BAAN). EDGE Essay (2017).
[71] Christov-Moore, L. et al. Preventing antisocial robots: A pathway to artificial empathy. Science Robotics 8, eabq3658 (2023).
[72] Wallach, W., Allen, C. & Franklin, S. Consciousness and Ethics: Artificially Conscious Moral Agents. International Journal of Machine Consciousness 3, 177–192 (2011).
[73] Chella, A. Artificial consciousness: the missing ingredient for ethical AI? Frontiers in Robotics and AI 10, 1270460 (2023).
[74] Crick, F. & Koch, C. A framework for consciousness. Nature neuroscience 6, 119–126 (2003).
[75] Rosenthal, D. M. Consciousness and its function. Neuropsychologia 46, 829–840 (2008).
[76] Rosenthal, D. Consciousness and metacognition in Metarepresentation: Proceedings of the tenth Vancouver cognitive science conference (2000), 265–295.
[77] Cleeremans, A. Consciousness: the radical plasticity thesis. Progress in brain research 168, 19–33 (2007).
[78] Bengio, Y. The Consciousness Prior. 2019. arXiv:1709.08568 [cs.LG]. https://arxiv.org/abs/1709.08568
[79] Kahneman, D. A perspective on judgment and choice: mapping bounded rationality. American Psychologist 58 (9), 697 (2003).
[80] DeWall, C. N., Baumeister, R. F. & Masicampo, E. Evidence that logical reasoning depends on conscious processing. Consciousness and Cognition 17, 628–645 (2008).

Showing first 80 references.

[1] [1]

Artificial general intelligence: Concept, state of the art, and future prospects.Journal of Artificial General Intelligence5,1 (2014)

Goertzel, B. Artificial general intelligence: Concept, state of the art, and future prospects.Journal of Artificial General Intelligence5,1 (2014)

work page 2014

[2] [2]

Bostrom, N.Superintelligence: Paths, dangers, strategies(Oxford University Press, 2014)

work page 2014

[3] [3]

Amodei, D.Machines of loving grace.2024.https://www.darioamodei.com/ essay/machines-of-loving-grace

work page 2024

[4] [4]

An overview of catastrophic ai risks

Hendrycks, D., Mazeika, M. & Woodside, T. An overview of catastrophic AI risks.arXiv preprint arXiv:2306.12001(2023). 11

work page arXiv 2023

[5] [5]

& signatories.Pause Giant AI Experiments: An Open Letter2023

Bengio, Y. & signatories.Pause Giant AI Experiments: An Open Letter2023. https://futureoflife.org/open-letter/pause-giant-ai-experiments/. 6.https://thecurve.goldengateinstitute.org/

work page

[6] [6]

& Dean, R.AI 2027tech

Kokotajlo, D., Alexander, S., Larsen, T., Lifland, E. & Dean, R.AI 2027tech. rep. (AI Futures Project, 2025).https://ai-2027.com/

work page 2025

[7] [7]

Russell, S.Human compatible: AI and the problem of control(Penguin Uk, 2019)

work page 2019

[8] [8]

10.https://moratorium.ai/

Bengio, Y.et al.Managing extreme AI risks amid rapid progress.Science384, 842–845 (2024). 10.https://moratorium.ai/. 11.https://superintelligence-statement.org/

work page 2024

[9] [9]

Advanced AI and the ethics of risking everything (2025)

H¨ aggstr¨ om, O. Advanced AI and the ethics of risking everything (2025)

work page 2025

[10] [10]

& Soares, N.If Anyone Builds It, Everyone Dies(Little, Brown and Company, 2025)

Yudkowsky, E. & Soares, N.If Anyone Builds It, Everyone Dies(Little, Brown and Company, 2025)

work page 2025

[11] [11]

& Russell, S.TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI2023

Critch, A. & Russell, S.TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI2023. arXiv:2306.06924 [cs.AI].https://arxiv.org/abs/ 2306.06924. 15.https://safe.ai/. 16.https://futureoflife.org/focus-area/artificial-intelligence/

work page arXiv

[12] [12]

Russell, S. J. & Norvig, P.Artificial intelligence: A modern approach (4th ed.) (Pearson, 2021). 18.https://amcs-community.org/open-letters/

work page 2021

[13] [13]

What is it like to be a bat?The Philosophical Review83,435–50 (1974)

Nagel, T. What is it like to be a bat?The Philosophical Review83,435–50 (1974)

work page 1974

[14] [14]

On a confusion about a function of consciousness.Behavioral and brain sciences18,227–247 (1995)

Block, N. On a confusion about a function of consciousness.Behavioral and brain sciences18,227–247 (1995)

work page 1995

[15] [15]

Chalmers, D. J. Facing up to the problem of consciousness.Journal of con- sciousness studies2,200–219 (1995)

work page 1995

[16] [16]

& Kouider, S

Dehaene, S., Lau, H. & Kouider, S. What is consciousness, and could machines have it?Science358,486–492 (2017)

work page 2017

[17] [17]

Emergent Introspective Awareness in Large Language Models.Trans- former Circuits Thread.https://transformer-circuits.pub/2025/introspection/ index.html(2025)

Lindsey, J. Emergent Introspective Awareness in Large Language Models.Trans- former Circuits Thread.https://transformer-circuits.pub/2025/introspection/ index.html(2025)

work page 2025

[18] [18]

& Rosenblatt, J

Berg, C., de Lucena, D. & Rosenblatt, J. Large Language Models Report Subjec- tive Experience Under Self-Referential Processing.arXiv preprint arXiv:2510.24797 (2025)

work page arXiv 2025

[19] Metzinger, T. Being no one: The self-model theory of subjectivity (MIT Press, 2004)

[20] Legrand, D. Pre-reflective self-as-subject from experiential and empirical perspectives. Consciousness and Cognition 16, 583–599 (2007)

[21] Seth, A. K. & Bayne, T. Theories of consciousness. Nature Reviews Neuroscience 23, 439–452 (2022)

[22] Koch, C. The feeling of life itself: Why consciousness is widespread but can’t be computed (MIT Press, 2019)

[23] Tononi, G. & Raison, C. Artificial intelligence, consciousness and psychiatry. World Psychiatry 23, 309 (2024)

[24] Findlay, G. et al. Dissociating artificial intelligence from artificial consciousness. arXiv preprint arXiv:2412.04571 (2024)

[25] Seth, A. K. Conscious artificial intelligence and biological naturalism. Behavioral and Brain Sciences, 1–42 (2024)

[26] Baars, B. J. A cognitive theory of consciousness (Cambridge University Press, 1993)

[27] Rosenthal, D. M. Higher-order thoughts and the appendage theory of consciousness. Philosophical Psychology 6, 155–166 (1993)

[28] Pennartz, C. M. Consciousness, representation, action: the importance of being goal-directed. Trends in Cognitive Sciences 22, 137–153 (2018)

[29] Graziano, M. S. & Webb, T. W. The attention schema theory: a mechanistic account of subjective awareness. Frontiers in Psychology 6, 500 (2015)

[30] Pennartz, C. M., Farisco, M. & Evers, K. Indicators and criteria of consciousness in animals and intelligent machines: an inside-out approach. Frontiers in Systems Neuroscience 13, 25 (2019)

[31] Blum, M. & Blum, L. A theoretical computer science perspective on consciousness. Journal of Artificial Intelligence and Consciousness 8, 1–42 (2021)

[32] Blum, L. & Blum, M. A theory of consciousness from a theoretical computer science perspective: Insights from the Conscious Turing Machine. Proceedings of the National Academy of Sciences 119, e2115934119 (2022)

[33] Blum, L. & Blum, M. A theoretical computer science perspective on consciousness and artificial general intelligence. Engineering 25, 12–16 (2023)

[34] Butlin, P. et al. Consciousness in Artificial Intelligence: Insights from the Science of Consciousness (2023). arXiv:2308.08708 [cs.AI]. https://arxiv.org/abs/2308.08708

[35] Aru, J., Larkum, M. E. & Shine, J. M. The feasibility of artificial consciousness through the lens of neuroscience. Trends in Neurosciences 46, 1008–1017 (2023)

[36] Farisco, M., Evers, K. & Changeux, J.-P. Is artificial consciousness achievable? Lessons from the human brain. Neural Networks 180, 106714 (2024)

[37] Evers, K. et al. Preliminaries to artificial consciousness: a multidimensional heuristic approach. Physics of Life Reviews (2025)

[38] Shanahan, M. Palatable conceptions of disembodied being: Terra incognita in the space of possible minds. arXiv preprint arXiv:2503.16348 (2025)

[39] Birch, J. AI consciousness: A centrist manifesto. PhilPapers preprint. https://philpapers.org/rec/BIRACA-4 (2025)

[40] Schneider, S., Sahner, D., Kuhn, R. L., Schwitzgebel, E. & Bailey, M. Is AI Conscious? A Primer on the Myths and Confusions Driving the Debate. Philosophy and Mind Sciences (forthcoming)

[41] Schwitzgebel, E. AI and Consciousness. arXiv preprint arXiv:2510.09858 (2025)

[42] Butlin, P. et al. Identifying indicators of consciousness in AI systems. Trends in Cognitive Sciences. ISSN: 1364-6613. https://www.sciencedirect.com/science/article/pii/S1364661325002864 (2025)

[43] McClelland, T. Agnosticism About Artificial Consciousness. arXiv preprint arXiv:2412.13145 (2024)

[44] Hinton, G. Digital Intelligence vs. Biological Intelligence. Royal Institution Discourse, May 30, 2025. https://www.youtube.com/watch?v=IkdziSLYzHw

[45] Legg, S., Hutter, M., et al. A collection of definitions of intelligence. Frontiers in Artificial Intelligence and Applications 157, 17 (2007)

[46] Dreksler, N. et al. Subjective experience in AI systems: what do AI researchers and the public believe? arXiv preprint arXiv:2506.11945 (2025)

[47] Seth, A. Why conscious AI is a bad, bad idea. Nautilus. https://nautil.us/why-conscious-ai-is-a-bad-bad-idea-302937/ (2023)

[48] Aschenbrenner, L. Situational awareness: the decade ahead (2024). https://situational-awareness.ai/

[49] Ilievski, F. et al. Aligning generalization between humans and machines. Nature Machine Intelligence, 1–12 (2025)

[50] Shojaee, P. et al. The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity in NeurIPS (2025). https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf

[51] Eriksson, M. et al. Can we trust AI benchmarks? An interdisciplinary review of current issues in AI evaluation. arXiv preprint arXiv:2502.06559 (2025)

[52] Cheng, Z. et al. Benchmarking is Broken-Don’t Let AI be its Own Judge. arXiv preprint arXiv:2510.07575 (2025)

[53] Harnad, S. The symbol grounding problem. Physica D: Nonlinear Phenomena 42, 335–346 (1990)

[54] Gibson, J. The ecological approach to visual perception (1979)

[55] Jacob, P. Intentionality. Stanford Encyclopedia of Philosophy (2003)

[56] O’Regan, J. K. & Noë, A. A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences 24, 939–973 (2001)

[57] Chalmers, D. J. Does thought require sensory grounding? From pure thinkers to large language models. arXiv preprint arXiv:2408.09605 (2024)

[58] Harris, Z. S. Distributional structure. Word 10, 146–162 (1954)

[59] Manning, C. D. Human language understanding & reasoning. Daedalus 151, 127–138 (2022)

[60] Huh, M., Cheung, B., Wang, T. & Isola, P. The platonic representation hypothesis. arXiv preprint arXiv:2405.07987 (2024)

[61] Häggström, O. Are Large Language Models Intelligent? Are Humans? in Computer Sciences & Mathematics Forum 8 (2023), 68

[62] Pavlick, E. Symbols and grounding in large language models. Philosophical Transactions of the Royal Society A 381, 20220041 (2023)

[63] Bender, E. M. & Koller, A. Climbing towards NLU: On meaning, form, and understanding in the age of data in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020), 5185–5198

[64] Haikonen, P. O. On artificial intelligence and consciousness. Journal of Artificial Intelligence and Consciousness 7, 73–82 (2020)

[65] Bender, E. M., Gebru, T., McMillan-Major, A. & Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (2021), 610–623

[66] Lake, B. M. & Murphy, G. L. Word meaning in minds and machines. Psychological Review 130, 401 (2023)

[67] Baroni, M. Grounding distributional semantics in the visual world. Language and Linguistics Compass 10, 3–13 (2016)

[68] Zador, A. et al. Catalyzing next-generation artificial intelligence through NeuroAI. Nature Communications 14, 1597 (2023)

[69] Bengio, Y. Reasoning through arguments against taking AI safety seriously (2024). https://yoshuabengio.org/2024/07/09/reasoning-through-arguments-against-taking-ai-safety-seriously/

[70] Metzinger, T. Benevolent artificial anti-natalism (BAAN). EDGE Essay (2017)

[71] Christov-Moore, L. et al. Preventing antisocial robots: A pathway to artificial empathy. Science Robotics 8, eabq3658 (2023)

[72] Wallach, W., Allen, C. & Franklin, S. Consciousness and Ethics: Artificially Conscious Moral Agents. International Journal of Machine Consciousness 3, 177–192 (2011)

[73] Chella, A. Artificial consciousness: the missing ingredient for ethical AI? Frontiers in Robotics and AI 10, 1270460 (2023)

80. https://conscium.com/

81. https://www.robometricsagi.com/blog/ai-policy/artificial-consciousness-as-a-way-to-mitigate-ai-existential-risk

[74] Crick, F. & Koch, C. A framework for consciousness. Nature Neuroscience 6, 119–126 (2003)

[75] Rosenthal, D. M. Consciousness and its function. Neuropsychologia 46, 829–840 (2008)

[76] Rosenthal, D. Consciousness and metacognition in Metarepresentation: Proceedings of the Tenth Vancouver Cognitive Science Conference (2000), 265–295

[77] Cleeremans, A. Consciousness: the radical plasticity thesis. Progress in Brain Research 168, 19–33 (2007)

[78] Bengio, Y. The Consciousness Prior (2019). arXiv:1709.08568 [cs.LG]. https://arxiv.org/abs/1709.08568

[79] Kahneman, D. A perspective on judgment and choice: mapping bounded rationality. American Psychologist, 58 (9), 697 (2003)

[80] DeWall, C. N., Baumeister, R. F. & Masicampo, E. Evidence that logical reasoning depends on conscious processing. Consciousness and Cognition 17, 628–645 (2008)