pith:7DMISRS7
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
DeepMath-103K supplies 103K hard, clean math problems that let reinforcement learning reach state-of-the-art reasoning performance.
arxiv:2504.11456 v2 · 2025-04-15 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{7DMISRS7T2TMSVT23ZQWMYQBL7}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Models trained on DeepMath-103K achieve state-of-the-art results on challenging mathematical benchmarks and generalize beyond math to biology, physics, and chemistry.
The decontamination process fully removes overlap with numerous benchmarks and the selected problems remain sufficiently challenging and verifiable to produce genuine gains in reasoning capability.
DeepMath-103K is a new 103K-problem mathematical dataset with high difficulty, rigorous decontamination, and verifiable answers to support RL training of language-model reasoning.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:48.188069Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
f8d889465f9ea6c9567ade616662015fcf8899325d78cadaaa64e4f713039c33
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/7DMISRS7T2TMSVT23ZQWMYQBL7 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: f8d889465f9ea6c9567ade616662015fcf8899325d78cadaaa64e4f713039c33
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "07c276a3630efe36049efdcd5a7b0393561f3daaeb3ad81157d140ec6a7b5b34",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-04-15T17:59:51Z",
    "title_canon_sha256": "53e33e5e58a1bb68b4dc91d0f24d06d1a094d9a28713b715d476eeb3c32a2cf6"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2504.11456",
    "kind": "arxiv",
    "version": 2
  }
}
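Offline verification sketch
The shell one-liner above can also be reproduced without network access by hashing the record shown on this page. The following is a minimal sketch, assuming the canonical form is exactly what that one-liner produces (keys sorted, compact separators, no ASCII escaping, UTF-8 bytes). The function name `canonical_sha256` and the `EXPECTED` constant are illustrative names, not part of any Pith API.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes.

    Assumption: this mirrors the canonicalization used in the page's one-liner.
    """
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Record copied from the "Canonical record JSON" section of this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "07c276a3630efe36049efdcd5a7b0393561f3daaeb3ad81157d140ec6a7b5b34",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-04-15T17:59:51Z",
        "title_canon_sha256": "53e33e5e58a1bb68b4dc91d0f24d06d1a094d9a28713b715d476eeb3c32a2cf6",
    },
    "schema_version": "1.0",
    "source": {"id": "2504.11456", "kind": "arxiv", "version": 2},
}

# Canonical hash as listed on this page.
EXPECTED = "f8d889465f9ea6c9567ade616662015fcf8899325d78cadaaa64e4f713039c33"

digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the JSON text. Only the values and the nesting matter.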