pith:PBT2I4KO
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Reinforcement learning on reasoning trajectories combined with test-time token scaling points toward Large Reasoning Models.
arxiv:2501.09686 v3 · 2025-01-16 · cs.AI · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{PBT2I4KORUUAHC2OC2BVYEPYJJ}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Train-time and test-time scaling combine to open a new research frontier: a path toward Large Reasoning Models. The introduction of OpenAI's o1 series marks a significant milestone in this research direction.
The survey rests on the premise that reinforcement learning applied to reasoning trajectories will reliably expand LLMs' reasoning capacity without introducing systematic biases or hallucinations that are harder to detect than in standard generation.
The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:50.132380Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
7867a4714e8d28038b4e16835c11f84a74534e244ec7b575293df3293f5be1cf
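The identifier at the top of this page appears to be the first 26 characters of the Base32 (RFC 4648) encoding of this canonical hash, with the short form `pith:PBT2I4KO` being the first 8. That relationship is inferred from the values shown here, not from a documented rule of the Pith schema, and the helper name `pith_number_from_hash` is hypothetical. A minimal sketch, under those assumptions:

```python
import base64

# Canonical hash as printed on this page.
CANONICAL_HASH = "7867a4714e8d28038b4e16835c11f84a74534e244ec7b575293df3293f5be1cf"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: Base32-encode the raw digest bytes and keep a prefix.

    The 26-character length is an assumption read off this page's identifier.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full_id = pith_number_from_hash(CANONICAL_HASH)
short_id = full_id[:8]  # assumed short form used in the "pith:" prefix

print(full_id)   # PBT2I4KORUUAHC2OC2BVYEPYJJ
print(short_id)  # PBT2I4KO
```

If the derivation holds for other records, the identifier can be recomputed from the hash alone, so a mismatch between the two would signal a tampered or stale record.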
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/PBT2I4KORUUAHC2OC2BVYEPYJJ \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7867a4714e8d28038b4e16835c11f84a74534e244ec7b575293df3293f5be1cf
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "282c5a48b28b73fee08160a2e957058b7f8c773d182bcfbe789042d75bb24b76",
    "cross_cats_sorted": [
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-01-16T17:37:58Z",
    "title_canon_sha256": "27a29be91192a11f36ffa1b46e5ee199fa483d41b5aac49cfac0e14c1b975c54"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2501.09686",
    "kind": "arxiv",
    "version": 3
  }
}
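The hashing step of the shell one-liner can also be run offline against a saved copy of the record. The sketch below mirrors the same canonicalization (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes, SHA-256). The function name `canonical_sha256` is illustrative, and the record literal is copied from this page; whether it reproduces the expected hash depends on it being identical to what the API returns, which is not asserted here.

```python
import hashlib
import json

# Expected value as printed under "Canonical hash" on this page.
EXPECTED = "7867a4714e8d28038b4e16835c11f84a74534e244ec7b575293df3293f5be1cf"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "282c5a48b28b73fee08160a2e957058b7f8c773d182bcfbe789042d75bb24b76",
        "cross_cats_sorted": ["cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2025-01-16T17:37:58Z",
        "title_canon_sha256": "27a29be91192a11f36ffa1b46e5ee199fa483d41b5aac49cfac0e14c1b975c54",
    },
    "schema_version": "1.0",
    "source": {"id": "2501.09686", "kind": "arxiv", "version": 3},
}

digest = canonical_sha256(record)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which fields appear in the JSON file, only on their names and values.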