Limitations on Accurate, Trusted, Human-level Reasoning

2025 · cs.LG · arXiv 2509.21654

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

We identify a fundamental incompatibility between the goals of accuracy, trust, and human-level reasoning in artificial intelligence (AI) systems, for strict mathematical definitions of these notions. We define accuracy of a system as the property that it never makes any false claims when it has the ability to abstain from making a prediction on any input, and trust as the assumption that the system is accurate. We define human-level reasoning as the property of an AI system always matching or exceeding human capability. Our core finding is that -- for our formal definitions of these notions -- an accurate and trusted AI system cannot be a human-level reasoning system: for such an accurate, trusted system there are task instances which are easily and provably solvable by a human but not by the system. Our proofs draw parallels to Gödel's incompleteness theorems and Turing's proof of the undecidability of the halting problem, and can be regarded as interpretations of Gödel's and Turing's results. Key to our proof is the formalization of the notion of trust, which allows us to separate the intrinsic property of a system (being accurate) from its epistemic status (being trusted).
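
The abstract's appeal to Turing's halting argument can be illustrated with a toy diagonal construction. This is a minimal sketch and not the paper's formal construction: the names `make_diagonal` and `is_accurate_on_diagonal`, the three-valued verdict, and the use of a returned string in place of a real infinite loop are all assumptions made for illustration. It suggests why a system that never makes false claims has to abstain on an instance built from the system itself, while an observer who assumes the system is accurate can work out the answer.

```python
from typing import Callable, Optional

# Hypothetical verdict type: "halts", "loops", or None (the system abstains).
Verdict = Optional[str]
System = Callable[[str], Verdict]


def make_diagonal(system: System) -> Callable[[], str]:
    """Build a program D that asks `system` about D and contradicts any 'halts' verdict."""

    def diagonal() -> str:
        verdict = system("diagonal")
        if verdict == "halts":
            # Stand-in for `while True: pass`, so the sketch stays runnable.
            return "would loop forever"
        # Verdict is "loops" or None (abstain): D returns at once, so it halts.
        return "halted"

    return diagonal


def is_accurate_on_diagonal(system: System) -> bool:
    """Return True if `system` makes no false claim about its own diagonal program."""
    verdict = system("diagonal")
    actually_halts = make_diagonal(system)() == "halted"
    if verdict is None:
        return True  # abstaining makes no claim, so it cannot be false
    return (verdict == "halts") == actually_halts


# Either definite verdict is refuted by D; only abstention stays accurate.
for name, system in [
    ("says halts", lambda _: "halts"),
    ("says loops", lambda _: "loops"),
    ("abstains", lambda _: None),
]:
    print(name, "-> accurate:", is_accurate_on_diagonal(system))
```

In this toy setting, an observer who takes the system's accuracy as given can reason that the system must abstain on D, and therefore that D halts. That is the flavor of instance the abstract describes as solvable by a human but not by the system.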

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

The Accountability Horizon: An Impossibility Theorem for Governing Human-Agent Collectives

cs.AI · 2026-04-09 · unverdicted · novelty 7.0

The Accountability Incompleteness Theorem demonstrates that human-AI collectives above the Accountability Horizon with feedback cycles cannot simultaneously meet attributability, foreseeability, non-vacuity, and completeness requirements for legitimate accountability.

citing papers explorer

Showing 1 of 1 citing paper.

The Accountability Horizon: An Impossibility Theorem for Governing Human-Agent Collectives
cs.AI · 2026-04-09 · unverdicted · none · ref 25 · internal anchor
The Accountability Incompleteness Theorem demonstrates that human-AI collectives above the Accountability Horizon with feedback cycles cannot simultaneously meet attributability, foreseeability, non-vacuity, and completeness requirements for legitimate accountability.
