Open Problems in Frontier AI Risk Management
Pith reviewed 2026-05-07 16:43 UTC · model grok-4.3
The pith
Frontier AI risk management faces open problems at every stage from planning to mitigation; these can be classified as consensus gaps, framework misalignments, or implementation shortfalls.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a problem-oriented approach, examining each stage of the risk management process through a structured literature review, can surface open problems in frontier AI risk management, classify them by their nature (lack of consensus, misalignment with frameworks, or implementation shortcomings), and identify the actors best positioned to address them. This, the authors argue, clarifies where progress is needed for meaningful consensus.
What carries the argument
The classification framework that divides open problems into three types—(a) lack of scientific or technical consensus, (b) misalignment with or challenges to established risk management frameworks, and (c) shortcomings in implementation—applied across the five stages of risk management.
If this is right
- Progress on consensus-lacking problems requires input from researchers and standards bodies.
- Misalignment issues call for regulators and standards bodies to adapt frameworks.
- Implementation shortcomings are best addressed by developers and deployers.
- Identifying responsible actors reduces duplication of effort.
- The living repository supports ongoing coordination in the field.
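The taxonomy and actor mapping above can be sketched as a small data structure. This is an illustrative sketch, not code from the paper: the stage and problem-type names follow the paper's abstract, but the class names, the actor sets, and the example problem are invented here for demonstration.

```python
from dataclasses import dataclass
from enum import Enum

class Stage(Enum):
    """The five stages of the risk management process named in the paper."""
    PLANNING = "risk planning"
    IDENTIFICATION = "risk identification"
    ANALYSIS = "risk analysis"
    EVALUATION = "risk evaluation"
    MITIGATION = "risk mitigation"

class ProblemType(Enum):
    """The paper's three-way classification of open problems."""
    NO_CONSENSUS = "lack of scientific or technical consensus"
    MISALIGNMENT = "misalignment with established risk management frameworks"
    IMPLEMENTATION = "implementation shortcomings despite apparent consensus"

# Hypothetical mapping from problem type to best-positioned actors,
# following the bullets above (the paper's full mapping is richer).
ACTORS = {
    ProblemType.NO_CONSENSUS: {"researchers", "standards bodies"},
    ProblemType.MISALIGNMENT: {"regulators", "standards bodies"},
    ProblemType.IMPLEMENTATION: {"developers", "deployers"},
}

@dataclass
class OpenProblem:
    description: str
    stage: Stage
    kind: ProblemType

    def responsible_actors(self) -> set[str]:
        return ACTORS[self.kind]

# Invented example problem for illustration.
problem = OpenProblem(
    description="No agreed thresholds for dangerous-capability evaluations",
    stage=Stage.EVALUATION,
    kind=ProblemType.NO_CONSENSUS,
)
print(sorted(problem.responsible_actors()))  # ['researchers', 'standards bodies']
```

A structure like this is what the paper's living repository would need to track progress per problem over time: each entry carries its stage, its classification, and the actors expected to move it forward.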
Where Pith is reading between the lines
- If these problems are addressed, it could lead to more effective integration of AI safety into broader risk management practices.
- The approach might be extended to other emerging technologies facing similar rapid change.
- Third-party evaluators could play a larger role in verification if consensus grows.
- This mapping could serve as a baseline to track progress in closing these gaps over time.
Load-bearing premise
That a structured review of existing literature is sufficient to identify all significant open problems without overlooking critical gaps or biasing toward particular perspectives.
What would settle it
Discovery of a major unresolved challenge in frontier AI risk management, such as in risk evaluation for novel capabilities, that is not listed or classified in the paper's analysis.
Figures
original abstract
Frontier AI both amplifies existing risks and introduces qualitatively novel challenges. Not only is there a notable lack of stable scientific consensus resulting from the rapid pace of technological change, but emerging frontier AI safety practices are often misaligned with, or may undermine, established risk management frameworks. To address these challenges, we systematically surface open problems in frontier AI risk management. Adopting a problem-oriented approach, we examine each stage of the risk management process - risk planning, identification, analysis, evaluation, and mitigation - through a structured review of the literature, identifying unresolved challenges and the actors best positioned to address them. Recognising that different types of open problems call for different responses, we classify open problems according to whether they reflect (a) a lack of scientific or technical consensus, (b) misalignment with, or challenges to, established risk management frameworks, or (c) shortcomings in implementation despite apparent consensus and alignment. By mapping these open problems and identifying the actors best positioned to address them - including developers, deployers, regulators, standards bodies, researchers, and third-party evaluators - this work aims to clarify where progress is needed to enable robust and meaningful consensus on frontier AI risk management. The paper does not propose specific solutions; instead, it provides a problem-oriented, agenda-setting reference document, complemented by a living online repository, intended to support coordination, reduce duplication, and guide future research and governance efforts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to systematically surface open problems in frontier AI risk management via a structured literature review organized around the five stages of the risk management process (planning, identification, analysis, evaluation, and mitigation). It classifies the identified problems into three categories—lack of scientific/technical consensus, misalignment with established risk management frameworks, and implementation gaps despite consensus—and maps each to the actors best positioned to address them (developers, deployers, regulators, standards bodies, researchers, third-party evaluators). The work positions itself as an agenda-setting reference document rather than a source of solutions, supported by a living online repository to aid coordination and reduce duplication.
Significance. If the underlying review is comprehensive and unbiased, the paper could provide a useful coordination tool for the AI safety and governance community by clarifying where consensus is lacking and who should act. The problem-oriented framing and provision of a living repository are genuine strengths that respond to the field's rapid evolution. The classification scheme offers a potentially reusable lens for future work, though its value hinges on transparent methodology.
major comments (2)
- [Literature review methodology] The section describing the structured literature review does not specify search strategy, databases, inclusion/exclusion criteria, date ranges, or handling of preprints and non-peer-reviewed sources. This is load-bearing for the central claim of systematically surfacing open problems: the absence of these details prevents assessment of completeness or bias (e.g., under-representation of deployment-specific or non-Western perspectives).
- [Classification framework] The criteria used to assign problems to the three categories (lack of consensus, misalignment with frameworks, implementation gaps) are not operationalized with explicit decision rules or inter-rater reliability checks. Without this, the classification of specific examples in later sections cannot be independently verified and risks subjective weighting.
minor comments (3)
- [Abstract] The abstract is a single dense paragraph; splitting it into two or three sentences would improve readability while preserving all claims.
- [References] Several citations to arXiv preprints lack version dates or notes on their non-peer-reviewed status; adding these would help readers assess currency.
- [Repository description] The living repository is mentioned but no URL, update policy, or maintenance plan is provided in the text; including these details would strengthen the reproducibility claim.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and for recognizing the paper's potential as a coordination tool for the AI safety and governance community. We address each major comment below and commit to revisions that enhance methodological transparency while preserving the problem-oriented, agenda-setting nature of the work.
point-by-point responses
Referee: [Literature review methodology] The section describing the structured literature review does not specify search strategy, databases, inclusion/exclusion criteria, date ranges, or handling of preprints and non-peer-reviewed sources. This is load-bearing for the central claim of systematically surfacing open problems: the absence of these details prevents assessment of completeness or bias (e.g., under-representation of deployment-specific or non-Western perspectives).
Authors: We agree that additional details on the literature review process would improve transparency and allow better evaluation of scope and potential biases. In the revised manuscript, we will add a dedicated subsection outlining our approach: primary sources included arXiv, Google Scholar, and established AI governance repositories; we prioritized recent preprints and non-peer-reviewed sources given the field's pace; and problems were identified iteratively by mapping literature to the five risk management stages. We clarify that the review was structured but not exhaustive or protocol-driven in a PRISMA sense, as the objective was to surface representative open problems rather than claim completeness. Limitations regarding potential under-representation of certain perspectives will be explicitly noted. revision: yes
Referee: [Classification framework] The criteria used to assign problems to the three categories (lack of consensus, misalignment with frameworks, implementation gaps) are not operationalized with explicit decision rules or inter-rater reliability checks. Without this, the classification of specific examples in later sections cannot be independently verified and risks subjective weighting.
Authors: We accept that explicit decision rules would strengthen verifiability. The revised version will include a methods subsection with operational criteria: a problem is categorized as (a) lack of consensus when literature exhibits unresolved substantive disagreements; (b) misalignment when it challenges core tenets of frameworks such as ISO 31000; and (c) implementation gap when consensus on the issue exists but practical adoption remains limited. Examples of classification decisions, including borderline cases, will be provided for illustration. Formal inter-rater reliability was not applicable given the review's scope and author-team process, but internal consistency checks were used; we will document this to address subjectivity concerns while retaining the framework's utility as a heuristic lens. revision: yes
Circularity Check
No circularity: descriptive literature review with no derivations or self-referential reductions
full rationale
The paper is a problem-oriented structured literature review that classifies open problems in frontier AI risk management into three categories (lack of consensus, misalignment with frameworks, implementation gaps) and maps actors to address them. It contains no equations, fitted parameters, predictions, ansatzes, or derivations of any kind. All claims rest on external literature citations and the authors' synthesis of publicly available sources; the central agenda-setting contribution does not reduce to any self-definition, fitted input, or self-citation chain. The review process itself is presented as a methodological choice rather than a derived result, satisfying the criteria for a self-contained, non-circular descriptive work.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Risk management proceeds through the stages of planning, identification, analysis, evaluation, and mitigation.