pith. sign in
Pith Number

pith:OXUCBDMX

pith:2025:OXUCBDMX5AF4NE7IMWMG3FXJXV
not attested not anchored not stored refs resolved

DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition

Chong Ruan, Daya Guo, Dejian Yang, Haocheng Wang, Hongxuan Tang, Huajian Xin, Junxiao Song, Liyue Zhang, Qihao Zhu, Shirong Ma, Wanjia Zhao, Wenjun Gao, Yuxuan Liu, Z.F. Wu, Zhe Fu, Zhibin Gou, Zhihong Shao, Z.Z. Ren

DeepSeek-Prover-V2 trains a 671B model on subgoal-decomposed proofs to reach 88.9% success on formal theorem proving benchmarks.

arxiv:2504.21801 v2 · 2025-04-30 · cs.CL · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{OXUCBDMX5AF4NE7IMWMG3FXJXV}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv. Learn more · Embed verified badge

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open · sign in to claim
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1strongest claim

The resulting model, DeepSeek-Prover-V2-671B, achieves state-of-the-art performance in neural theorem proving, reaching 88.9% pass ratio on the MiniF2F-test and solving 49 out of 658 problems from PutnamBench. Further evaluation on these 15 AIME problems shows that the model successfully solves 6 of them.

C2weakest assumption

The cold-start data generated by prompting DeepSeek-V3 to decompose problems into subgoals and synthesize proofs produces high-quality, error-free formal reasoning traces that reinforcement learning can reliably improve without systematic biases or invalid steps.

C3one line summary

DeepSeek-Prover-V2-671B reaches 88.9% on MiniF2F-test and solves 49 PutnamBench problems plus 6 of 15 recent AIME problems by training on subgoal-decomposed proofs collected via DeepSeek-V3.

References

36 extracted · 36 resolved · 1 Pith anchors

[1] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning 2025 · arXiv:2501.12948
[2] − Note that 𝑘 is a positive integer since all terms are positive
[3] − Alternatively, note that for 𝑝, 𝑞, 𝑟 in the given range, 𝑘 ⩽ 3 is natural, as larger 𝑘 would make the right side too large
[4] − This has no solutions since 𝑝, 𝑞, 𝑟 ⩾ 2, making the left side much larger than the right
[5] − For 𝑝 = 2, no solution

Cited by

39 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:52.892126Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

75e8208d97e80bc693e865986d96e9bd647ab816316f887d843c2e1aa1b764c9

Aliases

arxiv: 2504.21801 · arxiv_version: 2504.21801v2 · doi: 10.48550/arxiv.2504.21801 · pith_short_12: OXUCBDMX5AF4 · pith_short_16: OXUCBDMX5AF4NE7I · pith_short_8: OXUCBDMX
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/OXUCBDMX5AF4NE7IMWMG3FXJXV \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 75e8208d97e80bc693e865986d96e9bd647ab816316f887d843c2e1aa1b764c9
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "314813eb5810c27c4a0fd5ea06f6793a726b2401782d3db55c566eedb7593c9c",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-04-30T16:57:48Z",
    "title_canon_sha256": "bf5d910ecaf0d586c06fa829f87927eb46db812956b8276ad6a26c0214b828b9"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2504.21801",
    "kind": "arxiv",
    "version": 2
  }
}