Governing Reflective Human-AI Collaboration: A Framework for Epistemic Scaffolding and Traceable Reasoning

Carl Rosenbacke; Martin McKee; Rikard Rosenbacke; Victor Rosenbacke

arXiv: 2604.14898 · v1 · submitted 2026-04-16 · 💻 cs.AI · cs.CY · cs.HC

Pith reviewed 2026-05-10 11:17 UTC · model grok-4.3

classification 💻 cs.AI · cs.CY · cs.HC

keywords human-AI collaboration · epistemic scaffolding · traceable reasoning · reflective interaction · governance framework · System-2 protocols · EU AI Act alignment

The pith

Structured phases turn human-AI dialogue into auditable reasoning traces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that large language models simulate reflection but lack grounding, so reasoning should be relocated to the interaction layer between human and model. It introduces a method that organizes collaboration into phases of human abstraction, model articulation, and human reflection, creating a loop that produces traceable outputs. This matters because it offers a way to govern AI use through existing systems rather than waiting for internal model improvements. A sympathetic reader would see it as a practical protocol that combines human judgment with machine capabilities for more accountable results.

Core claim

The central claim is that reasoning is a relational process best handled by structuring human-AI interaction into phases of articulation, critique, and revision. Called The Architect's Pen, this method treats the AI as an external medium for reflection, turning the dialogue itself into a measurable reasoning loop that generates auditable traces aligned with governance needs.

What carries the argument

The Architect's Pen method, which structures human-AI dialogue into iterative phases of human abstraction to model articulation to human reflection, forming a cognitive protocol for distributed reasoning.
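The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `Step`, `Trace`, and `run_loop`, the phase labels, and the convention that the human's reflection returns an `(accepted, note)` pair are all assumptions introduced here, and the model and human are stand-in callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    phase: str    # hypothetical labels: human_abstraction, model_articulation, human_reflection
    author: str   # "human" or "model"
    content: str


@dataclass
class Trace:
    steps: List[Step] = field(default_factory=list)

    def record(self, phase: str, author: str, content: str) -> None:
        self.steps.append(Step(phase, author, content))


def run_loop(
    abstraction: str,
    articulate: Callable[[str], str],
    reflect: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 3,
) -> Trace:
    """Run abstraction -> articulation -> reflection until the human accepts."""
    trace = Trace()
    prompt = abstraction
    for _ in range(max_iterations):
        trace.record("human_abstraction", "human", prompt)
        draft = articulate(prompt)            # model turns the abstraction into text
        trace.record("model_articulation", "model", draft)
        accepted, note = reflect(draft)       # human critiques and either accepts or revises
        trace.record("human_reflection", "human", note)
        if accepted:
            break
        prompt = note                         # the revision seeds the next iteration
    return trace
```

Every phase leaves a record, so the returned `Trace` is the kind of step-by-step artefact a reviewer could inspect afterwards. Whether such a record actually improves grounding is the open question the sections below raise.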

Load-bearing premise

That structuring human-AI dialogue into phases of articulation, critique, and revision will reliably produce grounded, traceable reasoning without new errors from either party.

What would settle it

Testing whether dialogues run through the phases produce reasoning traces that independent reviewers can verify and revise more effectively than unstructured conversations.
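One way such a test could be scored, sketched under stated assumptions: each reasoning trace receives a binary reviewer outcome (1 if independent reviewers could verify it, 0 otherwise), and the two conditions are compared with a one-sided permutation test. The function names and the binary coding are hypothetical choices made here, not something the paper specifies.

```python
import random


def verification_rate(outcomes):
    """Share of traces that reviewers marked as verified (1) rather than not (0)."""
    return sum(outcomes) / len(outcomes)


def permutation_p_value(structured, unstructured, n_permutations=10_000, seed=0):
    """One-sided p-value for 'structured rate exceeds unstructured rate'."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = verification_rate(structured) - verification_rate(unstructured)
    pooled = list(structured) + list(unstructured)
    n = len(structured)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign condition labels at random
        diff = verification_rate(pooled[:n]) - verification_rate(pooled[n:])
        if diff >= observed:
            hits += 1
    return hits / n_permutations
```

A small p-value would count as evidence for the phased structure. A value near 1 would suggest the structure adds documentation without adding verifiability.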

Figures

Figures reproduced from arXiv: 2604.14898 by Carl Rosenbacke, Martin McKee, Rikard Rosenbacke, Victor Rosenbacke.

**Figure 1.** The Reflective Loop of The Architect’s Pen. Reasoning emerges through an iterative cycle between human abstraction, model articulation, and human reflection. Each iteration refines assumptions, improves confidence calibration, and strengthens epistemic grounding without requiring model retraining. How Human System-2 Governance Is Designed to Work (a) Shifting Reasoning from Model to Interaction: The Archite…

Original abstract

Large language models have advanced rapidly, from pattern recognition to emerging forms of reasoning, yet they remain confined to linguistic simulation rather than grounded understanding. They can produce fluent outputs that resemble reflection, but lack temporal continuity, causal feedback, and anchoring in real-world interaction. This paper proposes a complementary approach in which reasoning is treated as a relational process distributed between human and model rather than an internal capability of either. Building on recent work on "System-2" learning, we relocate reflective reasoning to the interaction layer. Instead of engineering reasoning solely within models, we frame it as a cognitive protocol that can be structured, measured, and governed using existing systems. This perspective emphasizes collaborative intelligence, combining human judgment and contextual understanding with machine speed, memory, and associative capacity. We introduce "The Architect's Pen" as a practical method. Like an architect who thinks through drawing, the human uses the model as an external medium for structured reflection. By embedding phases of articulation, critique, and revision into human-AI interaction, the dialogue itself becomes a reasoning loop: human abstraction -> model articulation -> human reflection. This reframes the question from whether the model can think to whether the human-AI system can reason. The framework enables auditable reasoning traces and supports alignment with emerging governance standards, including the EU AI Act and ISO/IEC 42001. It provides a practical path toward more transparent, controllable, and accountable AI use without requiring new model architectures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper offers a clean three-phase protocol for turning human-AI chats into auditable reasoning traces, but it never checks whether the structure actually reduces errors or improves grounding.

The core contribution is the Architect's Pen framing: human abstraction feeds the model for articulation, then the human reflects and revises, creating a loop that the authors say relocates System-2 reasoning to the interaction layer. It combines reflective prompting and collaborative intelligence into one simple sequence and ties the output directly to audit needs under the EU AI Act and ISO/IEC 42001. That linkage is useful for practitioners who must demonstrate traceable decision processes.

Referee Report

3 major / 1 minor

Summary. The paper proposes relocating reflective reasoning from internal model capabilities to the human-AI interaction layer via a framework called 'The Architect's Pen.' This structures dialogue into phases of human abstraction, model articulation, human critique, and revision to generate auditable reasoning traces and epistemic scaffolding. It claims this approach supports governance alignment with standards such as the EU AI Act and ISO/IEC 42001 by treating reasoning as a distributed, relational process rather than an isolated model property.

Significance. If the phases can be shown to produce measurable improvements in traceability and error correction, the reframing offers a practical, architecture-agnostic path for collaborative intelligence and regulatory compliance. The conceptual shift from model-centric to system-centric reasoning aligns with emerging work on System-2 processes and could inform responsible AI deployment without requiring new technical primitives.

major comments (3)

[The Architect's Pen protocol description] The section introducing 'The Architect's Pen' protocol describes the cycle (human abstraction → model articulation → human reflection) at a high level but provides no formalization of the phases, no error model for hallucinations or human bias injection, and no traceability metric. This leaves the central claim that the structure 'enables auditable reasoning traces' unsupported by any mechanism to verify or quantify the outcome.
[Governance and standards discussion] The governance alignment claim (EU AI Act and ISO/IEC 42001) is asserted without any explicit mapping of the articulation-critique-revision phases to specific regulatory requirements, such as documentation of decision processes or risk management. This makes the practical path to compliance difficult to evaluate.
[Framework overview and claims] No comparison to unstructured dialogue, chain-of-thought prompting, or other scaffolding methods is included, nor is there any pilot protocol, outcome measure, or falsifiable prediction for whether the phased structure improves reasoning quality or auditability over baselines.

minor comments (1)

[Abstract and introduction] The abstract and introduction could more explicitly reference related literature on cognitive scaffolding and human-AI collaboration to clarify novelty.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive and detailed feedback, which highlights important areas for strengthening the conceptual framework presented in the manuscript. We address each major comment below, indicating the revisions we will undertake.

Referee: [The Architect's Pen protocol description] The section introducing 'The Architect's Pen' protocol describes the cycle (human abstraction → model articulation → human reflection) at a high level but provides no formalization of the phases, no error model for hallucinations or human bias injection, and no traceability metric. This leaves the central claim that the structure 'enables auditable reasoning traces' unsupported by any mechanism to verify or quantify the outcome.

Authors: We agree that the protocol is described at a conceptual level in the current manuscript. While the full text includes illustrative examples of the phases, we acknowledge the lack of formalization and explicit mechanisms. In the revised version, we will add a formalized representation of the phases (e.g., via pseudocode and a process diagram), a qualitative error model addressing hallucination detection during human critique and bias injection risks, and explicit traceability mechanisms such as mandatory logging of inputs/outputs at each step to generate auditable traces. This will provide the requested support for the central claim through structural mechanisms, though quantitative metrics remain a direction for future work.
Revision: partial
Referee: [Governance and standards discussion] The governance alignment claim (EU AI Act and ISO/IEC 42001) is asserted without any explicit mapping of the articulation-critique-revision phases to specific regulatory requirements, such as documentation of decision processes or risk management. This makes the practical path to compliance difficult to evaluate.

Authors: We accept this observation. The governance discussion in the manuscript is asserted at a high level without detailed mappings. We will revise the paper to include an explicit mapping subsection (or table) that connects each phase of the protocol to concrete requirements, such as transparency and documentation obligations under the EU AI Act and risk management and record-keeping under ISO/IEC 42001. This will clarify the practical compliance path.
Revision: yes
Referee: [Framework overview and claims] No comparison to unstructured dialogue, chain-of-thought prompting, or other scaffolding methods is included, nor is there any pilot protocol, outcome measure, or falsifiable prediction for whether the phased structure improves reasoning quality or auditability over baselines.

Authors: We recognize that direct comparisons and testability elements would better position the framework. The manuscript discusses related scaffolding approaches in the related work section but does not contrast them explicitly. In revision, we will add a comparative analysis section outlining differences from unstructured dialogue and chain-of-thought prompting with respect to traceability and governance. We will also include falsifiable predictions (e.g., improved error detection via the structured critique phase) and a high-level outline of a potential pilot protocol. As this is a conceptual framework paper, we cannot include actual empirical outcome measures or results from a new pilot study.
Revision: partial
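The "mandatory logging of inputs/outputs at each step" promised in the first response could take many forms. The sketch below shows one hypothetical option: an append-only log in which each entry stores the SHA-256 hash of its predecessor, so later edits to any step become detectable. Hash chaining, the field names, and the functions `append_entry` and `verify` are assumptions made for illustration and do not come from the paper. Timestamps are omitted to keep the example deterministic.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(body: dict) -> str:
    """Hash a log entry body using a stable JSON serialisation."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, phase: str, author: str, content: str) -> dict:
    """Append one interaction step, chaining it to the previous entry's hash."""
    body = {
        "index": len(log),
        "phase": phase,
        "author": author,
        "content": content,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Return True only if no entry was altered, reordered, or removed mid-chain."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["index"] != i or body["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

A log built this way shows that the recorded steps were not changed after the fact. It says nothing about whether the human's critique was careful or the model's articulation correct, which is the gap the referee's first comment points at.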

standing simulated objections not resolved

Empirical outcome measures or results from a completed pilot protocol, as these would require new experimental work outside the scope of the current conceptual and theoretical manuscript.

Circularity Check

0 steps flagged

No circularity: self-contained conceptual proposal without derivations or self-referential reductions

The paper advances a purely conceptual framework for structuring human-AI dialogue into articulation-critique-revision phases, relocating reasoning to the interaction layer without any equations, fitted parameters, predictions, or mathematical derivations. The 'Architect's Pen' protocol is introduced definitionally as a practical method rather than derived from prior results by construction. No self-citations appear as load-bearing justifications for uniqueness or core claims, and the argument does not reduce to its own inputs or rename known patterns. It remains a self-contained proposal for epistemic scaffolding and governance alignment.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on domain assumptions about LLM limitations and the value of structured interaction, with the 'Architect's Pen' introduced as the primary new construct without independent falsifiable evidence.

axioms (1)

Domain assumption: Large language models remain confined to linguistic simulation rather than grounded understanding and lack temporal continuity, causal feedback, and anchoring in real-world interaction.
Explicitly stated in the abstract as the motivation for relocating reasoning to the interaction layer.

invented entities (1)

The Architect's Pen (no independent evidence)
Purpose: A practical method that embeds phases of articulation, critique, and revision into human-AI dialogue to create a reasoning loop.
Introduced as the core practical contribution; no external validation or falsifiable predictions are provided beyond the framework description.

pith-pipeline@v0.9.0 · 5582 in / 1375 out tokens · 54337 ms · 2026-05-10T11:17:25.330561+00:00 · methodology

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages

[1] System-2 deep learning,
    Governing Reflective Human–AI Collaboration: A Framework for Epistemic Scaffolding and Traceable Reasoning. Rikard Rosenbacke*, Carl Rosenbacke¹, Victor Rosenbacke¹,², Martin McKee³. ¹Faculty of Medicine, Lund University, Sweden; ²Department of Economics, Lund University School of Economics and Management, Sweden; ³Department of Health Services Research and Po... (2026)

[2] reasons” or “thinks
    Regulatory alignment: The framework provides the potential missing operational layer for global AI governance, translating emerging policy requirements (e.g., EU AI Act, OECD Principles, NIST AI RMF, ISO/IEC 42001) [20–23] into concrete, auditable reasoning processes that make accountability technically feasible. At the core of this paper lies a simple but o... (2024)

[3] bias blind spot
    The Reflective Loop of The Architect’s Pen. Reasoning emerges through an iterative cycle between human abstraction, model articulation, and human reflection. Each iteration refines assumptions, improves confidence calibration, and strengthens epistemic grounding without requiring model retraining. How Human System-2 Governance Is Designed to Work (a) Shift... (2023)

[4] Rosenbacke, R., Rosenbacke, C., Rosenbacke, V. & McKee, M. The Missing Knowledge Layer in AI: A Framework for Stable Human–AI Reasoning. arXiv (2026)

[5] Rosenbacke, R., Rosenbacke, C., Rosenbacke, V. & McKee, M. Beyond “Hallucinations”: A Framework for Stable Human–AI Reasoning. arXiv (2026)

[6] Rosenbacke, R., Rosenbacke, C., Rosenbacke, V. & McKee, M. Governing Reflective Human–AI Collaboration: A Framework for Epistemic Scaffolding and Traceable Reasoning. arXiv (2026)

[7] Rosenbacke, R., Rosenbacke, C., Rosenbacke, V. & McKee, M. From Consumption to Reflection: Designing Human–AI Relations for Stable Reasoning. arXiv (2026)

[8] Rosenbacke, R., Rosenbacke, C., Rosenbacke, V. & McKee, M. Epistemic Control Loops in Large Language Models: An Architectural Proposal for Machine-Side Regulation. arXiv (2026)

[9] Georgieva, K. AI Will Transform the Global Economy. Let’s Make Sure It Benefits Humanity. Int. Monet. Fund 1–6 (2024)

[10] Chui, M. et al. Economic potential of generative AI. McKinsey & Company (2023)

[11] Rosenbacke, R., Rosenbacke, C., Rosenbacke, V. & McKee, M. Beyond Hallucinations: The Illusion of Understanding in Large Language Models. arXiv (2025)

[12] Bommasani, R. et al. Advancing science- and evidence-based AI policy. Science 389, 459–461 (2025)

[13] Tang, X. et al. Risks of AI scientists: prioritising safeguarding over autonomy. Nat. Commun. 16, 1–11 (2025)

[14] Bengio, Y., LeCun, Y. & Hinton, G. Deep learning for AI. Commun. ACM 64, 58–65 (2021)

[15] Bengio, Y. From System 1 Deep Learning to System 2 Deep Learning. in NeurIPS 2019: Thirty-third Conference on Neural Information Processing Systems (Vancouver, Canada, 2019)

[16] Goyal, A. & Bengio, Y. Inductive biases for deep learning of higher-level cognition. Proc. R. Soc. A Math. Phys. Eng. Sci. 478, (2022)

[17] Bengio, Y. & Malkin, N. Machine learning and information theory concepts towards an AI Mathematician. Bull. Am. Math. Soc. 61, 457–469 (2024)

[18] Hinton, G. Geoffrey Hinton’s speech at the Nobel Prize banquet. Nobel Prize banquet (2024)

[19] Kahneman, D. Thinking fast, thinking slow. Interpretation, Tavistock, London (2011)

[20] Amodei, D. The Adolescence of Technology: Confronting and Overcoming the Risks of Powerful AI. Anthropic (2026)

[21] EU. A European approach to artificial intelligence | Shaping Europe’s digital future. Europa (2023). Available at: https://digital-strategy.ec.europa.eu/en/policies/european-approach-artificial-intelligence. (Accessed: 28th October

[22] Oecd.ai. OECD AI Principles overview. Oecd.ai 12 (2025). Available at: https://oecd.ai/en/ai-principles. (Accessed: 28th October

[23] National Institute of Standards and Technology (NIST). Artificial Intelligence Risk Management Framework (AI RMF). (2025). doi:10.6028/NIST.AI.600-1

[24] ISO. ISO/IEC 42001:2023 - AI management systems. International Organization for Standardization (2023). Available at: https://www.iso.org/standard/42001. (Accessed: 28th October

[25] Bolt, N. & Lange-Ionatamishvili, E. Next Generation Information Environment. NATO STRATCOM COE (2026)

[26] Tversky, B. Mind in Motion: How Action Shapes Thought. (Basic Books, 2019)

[27] Butlin, P. et al. Consciousness in Artificial Intelligence: Insights from the Science of Consciousness. arXiv (2023)

[28] Scholkopf, B. et al. Towards Causal Representation Learning. Proc. IEEE 109, 612–634 (2021)

[29] Li, Z.-Z. et al. From system 1 to system 2: A survey of reasoning large language models. arxiv.org (2025)

[30] Sui, Y. et al. Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models. Trans. Mach. Learn. Res. 2025-August, (2025)

[31] Chen, Q. et al. Towards reasoning era: A survey of long chain-of-thought for reasoning large language models. arxiv.org (2025)

[32] Zhu, X., Zhang, C., Stafford, T., Collier, N. & Vlachos, A. Conformity in Large Language Models. 3854–3872 (2025). doi:10.18653/v1/2025.acl-long.195

[33] Hollan, J., Hutchins, E. & Kirsh, D. Distributed Cognition: Toward a New Foundation for Human-Computer Interaction Research. ACM Trans. Comput. Interact. 7, (2000)

[34] Chi, M. T. H., De Leeuw, N., Chiu, M. & Lavancher, C. Eliciting Self‐Explanations Improves Understanding. Cogn. Sci. 18, (1994)

[35] Tsui, K. Self-Correction Bench: Uncovering and Addressing the Self-Correction Blind Spot in Large Language Models. (2025)

[36] Rosenbacke, R., Melhus, Å. & Stuckler, D. False conflict and false confirmation errors are crucial components of AI accuracy in medical decision making. Nat. Commun. 15, 1–2 (2024)

[37] Lewicki, R. J. & Brinsfield, C. T. Framing trust: Trust as a heuristic. in Framing matters: Perspectives on negotiation research and practice in communication (eds. Donohue, W. A., Rogan, R. R. & Kaufman, S.) 110–135 (Peter Lang Publishing, 2011)

[38] Rosenbacke, R. Cognitive Challenges in Human-AI Collaboration: A Study on Trust, Errors, and Heuristics in Clinical Decision-making. (2025)

[39] Stuckler, D., McKee, M., Ebrahim, S. & Basu, S. Manufacturing epidemics: the role of global producers in increased consumption of unhealthy commodities including processed foods, alcohol, and tobacco. PLoS Med. 9, 10 (2012)

[40] McKee, M. Opium, tobacco and alcohol: The evolving legitimacy of international action. Clin. Med. J. R. Coll. Physicians London 9, 338–341 (2009)

[41] Biagioli, M. Watch out for cheats in citation game. Nature 535, 201 (2016)
