Strategic Preemption Under Shared Catastrophic Risk: The Suicide Region and the Race to Artificial General Intelligence

David Tan

arXiv: 2512.07526 · v3 · pith:6NBCTLCRnew · submitted 2025-12-08 · q-fin.RM · econ.GN · q-fin.EC · q-fin.GN

Pith reviewed 2026-05-21 17:58 UTC · model grok-4.3

keywords: AGI race, preemption game, real options, existential risk, suicide region, option games, risk cancellation, systemic ruin

The pith

In the AGI race, shared existential risks cancel from players' decisions and create a suicide region of forced early deployment despite negative value.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models the AGI race between sovereign actors as a continuous-time preemption game that includes endogenous existential risk. A systemic ruin parameter D, correlated with development speed and borne globally, enters both players' payoffs. Because this shared disutility appears symmetrically, the risk term drops out of the equilibrium indifference condition that sets the investment threshold. The resulting suicide region is the portion of investment space where competitive preemption compels rational agents to deploy AGI systems ahead of the point where risk-adjusted net present value turns positive. The model further shows that sub-existential warning shots leave the winner-takes-all structure intact and therefore do not restore delay.

Core claim

When the cost of global ruin is embedded in both players' payoffs, the risk term mathematically cancels from the equilibrium indifference condition of the continuous-time preemption game. This cancellation produces a suicide region in which competitive pressures force early AGI deployment even though the risk-adjusted net present value remains negative.

What carries the argument

The equilibrium indifference condition of the preemption game, from which the shared systemic ruin term cancels.
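A stylized sketch of the cancellation may help. The symbols L(x) and F(x) (leader and follower payoffs net of ruin), p_L and p_F (role-specific ruin probabilities), and x (the state variable) are hypothetical placeholders introduced for illustration, not the paper's notation, and the algebra is a static simplification of the continuous-time condition.

```latex
% Illustrative sketch only; L(x), F(x), p_L, p_F are hypothetical placeholders,
% not the paper's notation.
\begin{align}
  L(x) - p_L D &= F(x) - p_F D
    && \text{(indifference between leading and following)} \\
  L(x) - F(x) &= (p_L - p_F)\, D \\
  p_L = p_F \;&\Longrightarrow\; L(x) = F(x)
    && \text{(threshold independent of } D\text{)}
\end{align}
```

Under these assumptions the threshold is free of D only when the ruin probability is the same in both roles; any gap between p_L and p_F leaves a residual term proportional to D in the condition.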

If this is right

Warning shots or sub-existential disasters leave the winner-takes-all structure unchanged and therefore fail to slow acceleration.
The race stops only when the cost of ruin is internalized to each player, making safety research economically necessary before deployment.
A critical private liability threshold exists that restores the option value of waiting.
Targeted liability or insurance mechanisms can shift the equilibrium back toward safer research sequences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the ruin cost is only partially shared or imperfectly correlated with speed, the cancellation weakens and some value of waiting may reappear.
Treaties that assign ex-post liability for global harm could replicate the internalization effect without requiring perfect symmetry in D.
The same cancellation logic may apply to other winner-takes-all technological races that carry correlated global downside.

Load-bearing premise

The model assumes a systemic ruin parameter D that is correlated with development velocity and shared globally across players.

What would settle it

Empirical observation that competing actors delay AGI deployment when they acknowledge correlated global catastrophe risks would contradict the cancellation result.

Original abstract

We analyze a continuous-time preemption game with shared catastrophic externalities. When the cost of catastrophe is embedded in both players' payoffs, the risk term cancels out in the equilibrium indifference condition. This creates a "suicide region" where competitive pressures force rational agents to deploy despite negative risk-adjusted net present values. We apply this framework to the race for artificial general intelligence (AGI). We show that this suicide region widens as the cost of systemic ruin grows: higher catastrophic risk does not deter the race but instead enlarges the set of conditions under which rational actors deploy despite negative social value. We characterize the resulting welfare distortion against a social planner's benchmark and demonstrate how two complementary mechanisms - private liability and prize-sharing - can close the suicide region. Private liability raises the cost of unsafe deployment while prize-sharing reduces the strategic imperative to deploy first. "Warning shots" (sub-existential disasters) will fail to deter AGI acceleration, as the winner-takes-all nature of the race remains intact.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper claims shared ruin risk D cancels in the preemption equilibrium to create a suicide region of early AGI deployment, but the leader-follower timing difference likely keeps D in the threshold equation.

The model tries to explain why the US and China keep accelerating AGI work even though both know that misalignment could cause a global catastrophe. It sets up a continuous-time preemption game in which a systemic ruin parameter D is shared and tied to development speed. Because D enters both players' payoffs, the paper claims that it drops out of the indifference condition between investing now and waiting, leaving a region where rational agents deploy despite negative risk-adjusted value. Warning shots are said to change nothing, and only forcing private liability for safety work stops the race.

Referee Report

2 major / 2 minor

Summary. The paper models the race to AGI between two sovereign actors as a continuous-time preemption game with endogenous existential risk. A systemic ruin parameter D, correlated with development velocity and shared globally, is embedded in both players' payoffs. The central claim is that this risk term cancels from the equilibrium indifference condition between investing immediately (leader value) and waiting (follower continuation value), producing a 'suicide region' in which rational preemption forces early deployment despite negative risk-adjusted NPV. The manuscript further argues that sub-existential warning shots fail to deter acceleration and derives a private liability threshold plus mechanism-design interventions to restore the option value of waiting.

Significance. If the cancellation result is rigorously established, the work supplies a game-theoretic account of why observed AGI acceleration can be consistent with rational behavior under shared catastrophic risk, extending real-options analysis to races with global externalities. The proposed liability threshold and safety-research prerequisites offer concrete policy levers. The model is falsifiable via its predicted dependence of the suicide region on the correlation structure of D and velocity.

major comments (2)

[§3.2, Eq. (15)] (indifference condition): the derivation asserts that D cancels because it enters symmetrically in the leader and follower continuation values. However, the leader's stopping time τ_L precedes the follower's τ_F, so the hazard rates integrated up to each stopping time differ unless the post-deployment ruin probability is explicitly independent of role and velocity. The manuscript must display the explicit integral expressions for both continuation values and show that the D terms are identical after substitution; without this step the cancellation is not guaranteed by global sharing alone.
[§4.1, Proposition 2] (suicide region): the existence of the region where NPV < 0 yet investment occurs rests entirely on the cancellation result. If the D integrals do not cancel, the threshold reverts to the standard real-options form and the suicide region disappears. A direct comparison of the derived threshold with and without the symmetry assumption on D would clarify the scope of the result.
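The first objection can be illustrated numerically. The following is a minimal sketch, assuming a constant hazard rate, a fixed horizon, and a ruin cost of D paid once if ruin occurs; all of these, along with the function names, are hypothetical simplifications and not the paper's specification. It computes the D term left over in (leader value minus follower value) when the two roles are exposed to ruin over the same or over different intervals.

```python
import math

def expected_ruin_cost(D, hazard, start, horizon, steps=10_000):
    """D times the probability that ruin occurs on [start, horizon].

    hazard(s) is an instantaneous ruin rate (hypothetical toy input);
    the integral is approximated with the midpoint rule.
    """
    dt = (horizon - start) / steps
    integrated = sum(hazard(start + (i + 0.5) * dt) for i in range(steps)) * dt
    return D * (1.0 - math.exp(-integrated))

def residual_ruin_term(D, hazard, leader_start, follower_start, horizon):
    """Ruin-cost difference that survives in (leader value - follower value)."""
    return (expected_ruin_cost(D, hazard, leader_start, horizon)
            - expected_ruin_cost(D, hazard, follower_start, horizon))

flat_hazard = lambda s: 0.05  # assumed constant ruin rate, illustration only

# Role-independent exposure: both players face ruin from the same date onward.
same_exposure = residual_ruin_term(100.0, flat_hazard, 1.0, 1.0, 10.0)

# Role-dependent exposure: the follower's exposure starts later than the leader's.
different_exposure = residual_ruin_term(100.0, flat_hazard, 1.0, 4.0, 10.0)

print(same_exposure, different_exposure)
```

In this toy setting the residual vanishes only when both roles integrate the hazard over the same interval, which is the additional restriction the referee asks the authors to state explicitly.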

minor comments (2)

[§2] Notation for the hazard rate λ(v) and its dependence on velocity v is introduced in §2 but used without re-statement in the continuation-value integrals of §3; a brief reminder equation would improve readability.
[Figure 2] The investment-space diagram labels the suicide region but does not indicate the numerical values of D and the correlation parameter used to generate the boundaries; adding these parameters would allow readers to reproduce the plotted region.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments, which help clarify the scope of our cancellation result. We address each major comment below, indicating planned revisions to strengthen the exposition without altering the core model.

Referee: [§3.2, Eq. (15)] (indifference condition): the derivation asserts that D cancels because it enters symmetrically in the leader and follower continuation values. However, the leader's stopping time τ_L precedes the follower's τ_F, so the hazard rates integrated up to each stopping time differ unless the post-deployment ruin probability is explicitly independent of role and velocity. The manuscript must display the explicit integral expressions for both continuation values and show that the D terms are identical after substitution; without this step the cancellation is not guaranteed by global sharing alone.

Authors: We agree that explicit verification is needed. In the model, D represents a global existential cost realized upon AGI deployment by either player, with the hazard rate λ(t) driven by cumulative velocity up to the first stopping time. Because post-deployment ruin is triggered globally and independently of which actor leads (the deployed AGI affects the shared world state), the integrated term -∫_0^{τ_L} D · λ(s) ds in the leader value equals the corresponding term in the follower continuation value after the leader's deployment (the follower then faces the same global D from τ_L onward). We will insert the full integral expressions for V_L and V_F immediately before Eq. (15), substitute the common D factor, and demonstrate algebraic cancellation under the global-sharing assumption. This addition will also note the modeling choice that post-deployment risk does not depend on role. Revision: yes.
Referee: [§4.1, Proposition 2] (suicide region): the existence of the region where NPV < 0 yet investment occurs rests entirely on the cancellation result. If the D integrals do not cancel, the threshold reverts to the standard real-options form and the suicide region disappears. A direct comparison of the derived threshold with and without the symmetry assumption on D would clarify the scope of the result.

Authors: We accept that the suicide region is conditional on the cancellation. In the revised manuscript we will add a short subsection after Proposition 2 that compares the equilibrium investment threshold under (i) the baseline global D with role-independent post-deployment ruin (yielding the suicide region where investment occurs for NPV < 0) and (ii) a counterfactual where D is either private or role-dependent (in which case the threshold reverts to the standard real-options form with no suicide region). This comparison will be presented both analytically and via a numerical illustration to delineate the precise conditions under which the result holds. Revision: yes.
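The comparison promised in the second response can be previewed with a deliberately simple static toy, which is not the paper's continuous-time model. The sketch assumes linear payoffs (a leader earns pi_L * x, a follower pi_F * x), an investment cost I paid by the leader, and role-specific ruin probabilities p_L and p_F applied to D; every name and number here is a hypothetical choice for illustration.

```python
def preemption_threshold(I, pi_L, pi_F, D, p_L, p_F):
    """State x at which leading and following pay the same in the toy model.

    Solves pi_L*x - I - p_L*D == pi_F*x - p_F*D for x.
    """
    return (I + (p_L - p_F) * D) / (pi_L - pi_F)

def leader_npv(x, I, pi_L, D, p_L):
    """Risk-adjusted payoff of the leader at state x in the toy model."""
    return pi_L * x - I - p_L * D

I, pi_L, pi_F = 10.0, 1.0, 0.2  # assumed illustrative values

# (i) Shared, role-independent ruin: p_L == p_F, so D drops out of the threshold.
x_shared_low = preemption_threshold(I, pi_L, pi_F, D=100.0, p_L=0.1, p_F=0.1)
x_shared_high = preemption_threshold(I, pi_L, pi_F, D=200.0, p_L=0.1, p_F=0.1)
npv_shared_low = leader_npv(x_shared_low, I, pi_L, D=100.0, p_L=0.1)
npv_shared_high = leader_npv(x_shared_high, I, pi_L, D=200.0, p_L=0.1)

# (ii) Private ruin cost borne only by the deployer: p_F == 0, so D re-enters.
x_private = preemption_threshold(I, pi_L, pi_F, D=100.0, p_L=0.1, p_F=0.0)
npv_private = leader_npv(x_private, I, pi_L, D=100.0, p_L=0.1)

print(x_shared_low, x_shared_high, npv_shared_low, npv_shared_high)
print(x_private, npv_private)
```

With these assumed numbers the shared-risk threshold does not move when D doubles while the leader's payoff at that threshold becomes more negative, and the private-cost variant pushes the threshold up until the payoff at the threshold is positive. Whether the same pattern holds in the paper's dynamic setting depends on the integral expressions the authors have committed to adding.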

Circularity Check

1 step flagged

Cancellation of systemic ruin parameter D from indifference condition is imposed by symmetric global-sharing assumption rather than derived from differential stopping times

specific steps

self-definitional [Abstract]
"As the disutility of catastrophe is embedded in both players' payoffs, the risk term mathematically cancels out of the equilibrium indifference condition. This creates a 'suicide region' in the investment space where competitive pressures force rational agents to deploy AGI systems early, despite a negative risk-adjusted net present value."

The paper claims the risk term cancels because D is embedded in both payoffs, directly yielding the suicide region. In a preemption game, however, leader and follower continuation values integrate the hazard over different intervals (earlier deployment for leader). Global sharing alone does not equate these integrals unless the model additionally assumes the ruin probability is role-independent and identical regardless of who moves first. The cancellation is therefore imposed by the symmetric embedding assumption rather than derived, making the suicide region result reduce to that modeling choice.

full rationale

The paper's central result—the suicide region where agents deploy despite negative NPV—rests on the claim that D cancels from the equilibrium indifference condition because it is embedded in both players' payoffs. This cancellation is asserted to follow from D being systemic, correlated with velocity, and shared globally. However, in a continuous-time preemption game the leader's stopping time precedes the follower's, so the continuation values contain distinct integrals over the hazard rate. Cancellation therefore requires an additional modeling restriction that the post-deployment ruin probability is independent of role and identical for both players. The abstract presents this cancellation as a mathematical consequence of embedding, but the skeptic analysis shows it is not automatic from global sharing alone. The result is therefore partially forced by the choice of how D enters the leader and follower values, producing a circularity score of 6.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The model rests on introducing D as a global shared ruin cost correlated with velocity; this parameter drives the cancellation and is not derived from external benchmarks.

free parameters (1)

systemic ruin parameter D
Correlated with development velocity and shared globally; specific functional dependence not derived from first principles.

axioms (1)

domain assumption: AGI race modeled as continuous-time preemption game between sovereign actors
Assumes standard game-theoretic payoffs and timing apply directly to US-China AI competition.

pith-pipeline@v0.9.0 · 5786 in / 1347 out tokens · 80720 ms · 2026-05-21T17:58:01.372959+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

As the disutility of catastrophe is embedded in both players' payoffs, the risk term mathematically cancels out of the equilibrium indifference condition... V_P^* = I / ((1-2S) π(τ))
IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · tag: unclear

The race is modelled as a symmetric, continuous-time stochastic game... payoff structures (3) and (4)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.