Pith Number

pith:KFNEXE42

pith:2025:KFNEXE427NDOHBEOVUOAZ3CDD4

not attested · not anchored · not stored · refs resolved

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Jun Bai, Shuyi Zhang, Song-Chun Zhu, Tong Wu, Yang Liu, Yanting Wang, Zilong Zheng, Zixia Jia, Ziyong Lin

Large language models can learn genuine parallel reasoning on their own through self-distilled reinforcement learning.

arxiv:2512.07461 v3 · 2025-12-08 · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{KFNEXE427NDOHBEOVUOAZ3CDD4}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

NPR trained on Qwen3-4B achieves performance gains of up to 24.5% and inference speedups up to 4.6x. Unlike prior baselines that often fall back to autoregressive decoding, NPR demonstrates 100% genuine parallel execution.

C2 · weakest assumption

The self-distilled progressive training paradigm successfully transitions the model to native parallel cognition with strict topological constraints without external supervision or falling back to sequential behavior.

C3 · one-line summary

NPR trains LLMs to reason in parallel via self-distilled RL, delivering up to 24.5% performance gains and 4.6x speedups with 100% genuine parallel execution on reasoning benchmarks.

References

25 extracted · 25 resolved · 7 Pith anchors

[1] Doing: Agents that Reason by Scaling Test-Time Interaction

[2] Multiverse: Your language models secretly decide how to parallelize and merge generation

[3] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models · arXiv:2402.03300

[4] Parallel-r1: Towards parallel thinking via reinforcement learning

[5] Parallelsearch: Train your llms to decompose query and search sub-queries in parallel with reinforcement learning

Cited by

5 papers in Pith

On the Overscaling Curse of Parallel Thinking: System Efficacy Contradicts Sample Efficiency

Visual Para-Thinker: Divide-and-Conquer Reasoning for Visual Comprehension

LACE: Lattice Attention for Cross-thread Exploration

Receipt and verification

First computed	2026-05-17T23:39:00.571222Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

515a4b939afb46e3848ead1c0cec431f171c8961ac396fc7888cc25693eca8b2

Aliases

arxiv: 2512.07461 · arxiv_version: 2512.07461v3 · doi: 10.48550/arxiv.2512.07461 · pith_short_12: KFNEXE427NDO · pith_short_16: KFNEXE427NDOHBEO · pith_short_8: KFNEXE42
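The aliases above are prefixes of the full Pith Number, and for this record the full number happens to equal the unpadded Base32 encoding of the first 16 bytes of the canonical hash. The sketch below reproduces that relationship. It is an observation about this one record, not a documented derivation rule; the helper names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Canonical hash as listed in this record.
CANONICAL_HASH = "515a4b939afb46e3848ead1c0cec431f171c8961ac396fc7888cc25693eca8b2"


def pith_number_from_hash(hex_digest: str) -> str:
    """Hypothetical helper: Base32-encode the first 16 bytes of the digest.

    Assumption: the identifier is the RFC 4648 Base32 encoding of the
    leading 128 bits with the '=' padding removed. This matches the record
    shown here but is not confirmed as the official rule.
    """
    head = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(head).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Hypothetical helper: build the 8-, 12- and 16-character prefix aliases."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


pith = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(pith)
print(pith)
print(aliases)
```

If the assumption holds, the printed identifier and prefixes should agree with the `pith_short_8`, `pith_short_12` and `pith_short_16` aliases listed above.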

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/KFNEXE427NDOHBEOVUOAZ3CDD4 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 515a4b939afb46e3848ead1c0cec431f171c8961ac396fc7888cc25693eca8b2

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "ff6f48d22cf417a155f6f9c74452f2ba17ea6a4cedea10ed547be974b9db1781",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-12-08T11:39:43Z",
    "title_canon_sha256": "65f95b024d0449dc5a4d9152aa44ed7c9baf0699060d8b575a2e2585e4b444f2"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2512.07461",
    "kind": "arxiv",
    "version": 3
  }
}
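The canonicalization step of the shell check can also be run offline against the record above. The following is a minimal sketch that mirrors the one-liner's serialization settings (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes); the function name `canonical_sha256` is hypothetical, and the sketch assumes the JSON shown here is exactly what the resolver returns under `canonical_record`.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same JSON settings as the shell check."""
    canonical = json.dumps(
        record,
        sort_keys=True,           # key order must not affect the hash
        separators=(",", ":"),    # no whitespace between tokens
        ensure_ascii=False,       # keep non-ASCII characters as-is
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Copy of the canonical record shown in this document.
record = {
    "metadata": {
        "abstract_canon_sha256": "ff6f48d22cf417a155f6f9c74452f2ba17ea6a4cedea10ed547be974b9db1781",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-12-08T11:39:43Z",
        "title_canon_sha256": "65f95b024d0449dc5a4d9152aa44ed7c9baf0699060d8b575a2e2585e4b444f2",
    },
    "schema_version": "1.0",
    "source": {"id": "2512.07461", "kind": "arxiv", "version": 3},
}

digest = canonical_sha256(record)
print(digest)
```

Compare the printed digest with the canonical hash listed in this record. Because keys are sorted before hashing, the order in which the fields are written does not change the result.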