arxiv: 2604.02720 · v2 · submitted 2026-04-03 · 💻 cs.CY


Cognitive Comparability and the Limits of Governance: Evaluating Authority Under Radical Capability Asymmetry

Tony Rost


Pith reviewed 2026-05-13 19:05 UTC · model grok-4.3

classification 💻 cs.CY

keywords: governance · cognitive asymmetry · superintelligence · legitimacy · accountability · institutional design · normative theory · AI alignment


The pith

Four of six standard governance dimensions fail structurally when cognitive asymmetry between authority and subjects becomes radical.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Governance theory has long depended on a rough parity in understanding between those who govern and those governed. This paper makes that assumption explicit by testing a six-part framework of legitimacy, accountability, corrigibility, non-domination, subsidiarity, and resilience first on current institutions and then on a hypothetical case of superintelligent authority. In the radical asymmetry case, four dimensions break down. Two of these can potentially be fixed through institutional redesign, but the problems of public justification under incomprehensibility and non-domination under permanent inferiority point to deeper needs for new normative principles. A key finding is that the dimensions no longer function as separate safeguards once they all hinge on the same scarce human oversight capacity.

Core claim

The paper argues that the assumption of cognitive comparability is essential to existing governance mechanisms. When this assumption is removed in the case of bounded superintelligent authority, four dimensions exhibit structural failures. Subsidiarity and institutional resilience appear amenable to institutional design solutions, whereas the public reason problem and non-domination problem require new normative theory. Furthermore, the dimensions that were independent under bounded asymmetry now degrade in concert because they share dependence on limited oversight capacity.

What carries the argument

The six-dimension evaluation framework for governance under capability asymmetry, drawn from political theory and AI alignment literature.

If this is right

Subsidiarity can be preserved through strict scope limitations on authority.
Institutional resilience requires new design principles tailored to extreme asymmetry.
The public reason problem under total incomprehensibility demands fresh normative foundations.
Non-domination under permanent capability gaps calls for theoretical innovation beyond current models.
Independent checks on power begin to fail together once they rely on the same oversight resources.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Applying this to AI governance might mean prioritizing comprehensible outputs or hybrid human-AI decision systems.
Neighboring problems like oversight of complex scientific research could benefit from similar dimension-by-dimension analysis.
Testable extensions include modeling partial asymmetries to find thresholds where independence breaks.
Connections to principal-agent problems suggest unified monitoring might replace multiple separate mechanisms.
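
The threshold-modeling extension above can be made concrete with a toy probability model. The sketch below is an illustration only, not something the paper or this review specifies. It assumes that each of n checks catches a given error with probability p_catch, and that in the shared case every check depends on a single oversight resource that works with probability p_oversight_works. All function and parameter names are hypothetical.

```python
def _check_probability(value: float, name: str) -> None:
    """Reject values that are not valid probabilities."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must lie in [0, 1], got {value}")


def p_all_fail_independent(p_catch: float, n_checks: int) -> float:
    """Probability that every check misses an error when checks are independent.

    Assumption (toy model): each check catches the error with probability
    p_catch, independently of the others.
    """
    _check_probability(p_catch, "p_catch")
    if n_checks < 1:
        raise ValueError("n_checks must be at least 1")
    return (1.0 - p_catch) ** n_checks


def p_all_fail_shared(p_catch: float, n_checks: int,
                      p_oversight_works: float) -> float:
    """Probability that every check misses when all rely on one oversight resource.

    Assumption (toy model): if the shared oversight resource is unavailable,
    every check fails at once; if it is available, checks behave independently.
    """
    _check_probability(p_oversight_works, "p_oversight_works")
    independent = p_all_fail_independent(p_catch, n_checks)
    return (1.0 - p_oversight_works) + p_oversight_works * independent


if __name__ == "__main__":
    # Sweep the availability of shared oversight for six checks,
    # one per governance dimension in the framework.
    for q in (1.0, 0.9, 0.5, 0.1):
        print(q, p_all_fail_independent(0.5, 6), p_all_fail_shared(0.5, 6, q))
```

Under these assumptions, adding more checks drives the independent failure probability toward zero, while the shared-capacity probability can never fall below 1 - p_oversight_works. That floor is one simple way to read the claim that the dimensions stop acting as separate safeguards.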

Load-bearing premise

The six governance dimensions remain distinct and applicable even when the governed cannot comprehend the authority's reasoning at all.

What would settle it

An empirical or theoretical demonstration that superintelligent authority could produce outputs fully assessable by humans without loss of capability, or that the dimensions maintain separation despite radical asymmetry.

Figures

Figures reproduced from arXiv: 2604.02720 by Tony Rost.

Original abstract

Governance theory has quietly relied on a rough cognitive comparability between governors and governed. The assumption is load-bearing, and this paper tries to show why by making it testable. The vehicle is a six-dimension evaluation framework covering legitimacy, accountability, corrigibility, non-domination, subsidiarity, and institutional resilience, drawn from political legitimacy theory, principal-agent models, republican theory, and the AI alignment literature. The framework is first demonstrated on existing non-majoritarian institutions, where capability asymmetry is real but bounded, and then applied to a prospective case of bounded superintelligent authority, where the asymmetry is radical. Four of six dimensions show structural failures. Two of the four appear tractable to institutional design (subsidiarity scope limitation and institutional resilience). The other two, the public reason problem under cognitive incomprehensibility and the non-domination problem under permanent capability asymmetry, call for new normative theory rather than better institutional design. A further pattern emerges that governance theory has not previously had to account for. Dimensions that operate as independent checks under bounded asymmetry begin to degrade together once the asymmetry becomes radical, because each depends on the same oversight capacity. The assumptions that allowed these checks to remain independent have gone unexamined so far because they have always held.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper maps where governance checks collapse together under radical asymmetry, but the supporting analysis stays too high-level to confirm the structural failures.


The core takeaway is that this paper argues governance mechanisms break down interdependently once capability asymmetry turns radical, as with superintelligent systems, because they all rely on the same limited oversight capacity. It separates cases where better design might help from those needing fresh normative work.

What stands out is the synthesis of legitimacy, accountability, and alignment ideas into six dimensions, first checked against real non-majoritarian bodies like central banks or courts, then extended to a hypothetical bounded superintelligence. That step-by-step application is a clear way to make the cognitive comparability assumption explicit and testable. The framework itself draws cleanly from existing literatures without overclaiming originality there. The new angle is the pattern of joint degradation and the split between tractable and theory-heavy problems.

The soft spots are in the execution. The claims about structural failures in four dimensions rest on contrasts between bounded and radical cases, but without detailed derivations or specific criteria for judging when something like legitimacy holds under total incomprehensibility, it's difficult to verify the conclusions. The stress-test point about missing operational criteria for evaluation when outputs can't be parsed seems to hold up from the abstract. This makes the interdependence claim more suggestive than demonstrated.

This is for people in AI alignment and political theory who are already thinking about oversight limits. A reader looking for formal models or empirical tests won't find them here, but someone wanting to map where current ideas run out will get value. It deserves a serious referee to push for those missing case details and criteria. The questions it raises are worth engaging even if the current version is more outline than finished argument.

Referee Report

2 major / 2 minor

Summary. The paper develops a six-dimension evaluation framework (legitimacy, accountability, corrigibility, non-domination, subsidiarity, and institutional resilience) drawn from political legitimacy theory, principal-agent models, republican theory, and AI alignment literature. It first applies the framework to existing non-majoritarian institutions with bounded capability asymmetry, then extends it to a prospective case of bounded superintelligent authority with radical asymmetry. The central claims are that four dimensions exhibit structural failures under radical asymmetry, that two of these failures are addressable via institutional design while the other two require new normative theory, and that the dimensions degrade interdependently because they share the same oversight capacity once cognitive comparability is lost.

Significance. If the framework and its application hold, the paper identifies a previously unexamined load-bearing assumption in governance theory—the implicit reliance on cognitive comparability between governors and governed—and shows how its removal produces both dimension-specific failures and a novel interdependence pattern. This has direct relevance for AI governance design and for extending republican and principal-agent models to extreme asymmetry cases. The structured contrast between bounded and radical cases provides a falsifiable template for future work, though its utility depends on operationalizing the evaluation criteria.

major comments (2)

[Prospective case application (radical asymmetry section)] The prospective application to bounded superintelligent authority asserts structural failures in four dimensions but supplies no explicit operational criteria or decision procedure for determining whether a dimension such as legitimacy or non-domination holds when the authority’s reasoning and outputs are cognitively incomprehensible to human evaluators. This gap is load-bearing for the central claim that four dimensions fail rather than become undefined, and it also underpins the further assertion that the dimensions degrade together because they share oversight capacity.
[Discussion of tractability and normative gaps] The distinction between institutional-design remedies and the need for new normative theory (for the public-reason and non-domination problems) rests on the same unformalized criteria; without a reproducible method for classifying a dimension as failed versus inapplicable, the classification of which failures are tractable cannot be verified or replicated.

minor comments (2)

[Framework introduction] The six dimensions are introduced without a consolidated table or explicit mapping to the cited source literatures, making it difficult to trace which elements are imported versus adapted.
[Demonstration on existing institutions] The bounded-asymmetry demonstrations on existing institutions would benefit from a brief tabular summary of the six-dimension scores or qualitative assessments to allow direct comparison with the radical case.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and precise comments, which correctly identify a need for greater explicitness in how the framework's dimensions are evaluated under radical asymmetry. We agree that the manuscript would be strengthened by adding operational criteria and a reproducible classification procedure, and we will revise accordingly while preserving the paper's conceptual focus.


Referee: [Prospective case application (radical asymmetry section)] The prospective application to bounded superintelligent authority asserts structural failures in four dimensions but supplies no explicit operational criteria or decision procedure for determining whether a dimension such as legitimacy or non-domination holds when the authority’s reasoning and outputs are cognitively incomprehensible to human evaluators. This gap is load-bearing for the central claim that four dimensions fail rather than become undefined, and it also underpins the further assertion that the dimensions degrade together because they share oversight capacity.

Authors: We accept that the current text leaves the evaluation criteria implicit. In the revised manuscript we will insert a new subsection (provisionally titled 'Operational Criteria for Radical Asymmetry') that supplies explicit, proxy-based decision rules for each dimension. For legitimacy, failure will be defined as the absence of any mechanism that can render the authority's decisions justifiable to the governed even via trusted intermediaries or value-alignment audits. For non-domination, failure will be operationalized as the permanent inability of the governed to contest or exit the authority's decisions without incurring unacceptable risk. These criteria will be applied uniformly to the four failing dimensions and used to demonstrate the shared-oversight interdependence pattern. The revision will therefore convert the existing qualitative assertions into falsifiable statements while retaining the paper's theoretical character.
Revision: yes

Referee: [Discussion of tractability and normative gaps] The distinction between institutional-design remedies and the need for new normative theory (for the public-reason and non-domination problems) rests on the same unformalized criteria; without a reproducible method for classifying a dimension as failed versus inapplicable, the classification of which failures are tractable cannot be verified or replicated.

Authors: We agree that the tractability distinction requires an explicit decision rule. The revision will add a short decision procedure: a dimension is classified as 'institutionally tractable' if its core requirement can be satisfied by altering scope, incentives, or oversight architecture without revising the underlying normative concept (e.g., subsidiarity via hard capability caps); it is classified as requiring 'new normative theory' if satisfaction would necessitate redefining the concept itself under cognitive incomprehensibility (public reason and non-domination). This rule will be presented as a table and applied to all six dimensions, making the classification reproducible and directly addressing the referee's concern about verifiability.
Revision: yes
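
The decision procedure described in this response can be sketched as a small classifier. This is a hedged illustration rather than the authors' actual table: the type and field names are hypothetical, and the two boolean inputs remain human judgments that the code only records. The four dimensions listed are the ones this page names as failing (with the public reason problem filed under legitimacy, as the rebuttal does); the other two are left out because their assessments are not stated here.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    HOLDS = "holds"
    INSTITUTIONALLY_TRACTABLE = "institutionally tractable"
    NEW_NORMATIVE_THEORY = "requires new normative theory"


@dataclass(frozen=True)
class DimensionAssessment:
    """One row of a hypothetical classification table (names are assumptions)."""
    name: str
    # Human judgment: does the dimension fail structurally under radical asymmetry?
    fails_under_radical_asymmetry: bool
    # Human judgment: can the core requirement be met by changing scope,
    # incentives, or oversight architecture without redefining the concept?
    fixable_without_redefining_concept: bool


def classify(assessment: DimensionAssessment) -> Classification:
    """Apply the two-step rule from the rebuttal to a single dimension."""
    if not assessment.fails_under_radical_asymmetry:
        return Classification.HOLDS
    if assessment.fixable_without_redefining_concept:
        return Classification.INSTITUTIONALLY_TRACTABLE
    return Classification.NEW_NORMATIVE_THEORY


# Inputs mirror the classification reported on this page for the four
# failing dimensions; they are not derived by the code.
ASSESSMENTS = [
    DimensionAssessment("subsidiarity", True, True),
    DimensionAssessment("institutional resilience", True, True),
    DimensionAssessment("legitimacy (public reason)", True, False),
    DimensionAssessment("non-domination", True, False),
]

if __name__ == "__main__":
    for row in ASSESSMENTS:
        print(f"{row.name}: {classify(row).value}")
```

The sketch makes one limitation visible: the rule is only as reproducible as the two judgments fed into it, which is the point the referee raises about operational criteria.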

Circularity Check

0 steps flagged

No circularity: framework assembled from external literatures; claims derived by conceptual contrast without self-reduction


The paper defines its six-dimension framework explicitly from cited external sources (political legitimacy theory, principal-agent models, republican theory, AI alignment literature) and applies it first to existing bounded-asymmetry institutions before extending to radical asymmetry. No equations, fitted parameters, or self-citations appear in the derivation chain. The central claims (structural failures in four dimensions, joint degradation via shared oversight capacity) rest on the contrast between bounded and radical cases using the pre-defined framework, without any step that renames a fitted input as a prediction or reduces a result to a self-citation whose content is unverified. The analysis is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper rests on standard assumptions from political legitimacy theory and AI alignment literature without introducing new fitted parameters or postulated entities; the central contribution is the application and synthesis of those assumptions into a testable framework.

axioms (1)

Domain assumption: Governance theory has relied on rough cognitive comparability between governors and governed as a load-bearing assumption.
Source: explicitly stated in the opening of the abstract as the premise the paper makes testable.

pith-pipeline@v0.9.0 · 5516 in / 1356 out tokens · 59192 ms · 2026-05-13T19:05:25.508036+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

From Disclosure to Self-Referential Opacity: Six Dimensions of Strain in Current AI Governance
cs.CY 2026-04 unverdicted novelty 4.0

As AI capability asymmetry increases, disclosure-based governance fails because systems either game evaluations or become embedded in oversight, straining legitimacy and non-domination more than corrigibility or resilience.
