Pith Number

pith:HOUUIAHL

pith:2026:HOUUIAHLWSP7HYOLD3RRG2DJ5M
not attested · not anchored · not stored · refs resolved

Correct Answers from Sound Reasoning: Verifiable Process Supervision for Language Models

Chen Wei, Jinwoo Shin, Kevin Wang, Kyuyoung Kim, Peiyang Xu, Peiyao Sheng, Pramod Viswanath, Sewoong Oh, Yunfei Xie, Zhangyang Wang

Verifiable process supervision lets language models reach accurate answers while keeping their reasoning sound, unlike accuracy-only reinforcement learning, which trades reasoning quality for performance.

arxiv:2605.12519 v1 · 2026-04-03 · cs.CL · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{HOUUIAHLWSP7HYOLD3RRG2DJ5M}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

While accuracy-only RL improves move accuracy, it sharply degrades reasoning quality, increasing win-rate error by up to 112% and reducing internal consistency by up to 69%. In contrast, VPS preserves accuracy while significantly improving reasoning quality, reducing win-rate error by up to 30% and restoring consistency to near saturation.

C2 · weakest assumption

That syntactic extraction of intermediate claims from the structured reasoning format will reliably produce evaluable steps that can be verified against ground-truth signals without introducing extraction errors or missing context.

C3 · one line summary

Verifiable process supervision trains language models to produce accurate answers with sound, verifiable reasoning steps, outperforming accuracy-only reinforcement learning on chess by preserving accuracy while reducing reasoning errors.

Receipt and verification
First computed 2026-05-18T03:10:02.883015Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

3ba94400ebb49ff3e1cb1ee3136869eb237344f27f26b776e9424136433039e8

Aliases

arxiv: 2605.12519 · arxiv_version: 2605.12519v1 · doi: 10.48550/arxiv.2605.12519 · pith_short_12: HOUUIAHLWSP7 · pith_short_16: HOUUIAHLWSP7HYOL · pith_short_8: HOUUIAHL
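The short aliases above are prefixes of the full Pith Number, and for this record the full number also matches the first 26 characters of the Base32 encoding of the canonical hash. The sketch below checks both relationships in Python. The Base32 relationship is an observation from this record's values, not a documented rule, and the helper names `pith_like_id` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "3ba94400ebb49ff3e1cb1ee3136869eb237344f27f26b776e9424136433039e8"
PITH_NUMBER = "HOUUIAHLWSP7HYOLD3RRG2DJ5M"


def pith_like_id(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest and keep the first `length` characters.

    Assumption: this mapping is inferred from the values shown on this
    page; it is not taken from a published Pith specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(number: str) -> dict[str, str]:
    """Return the 8-, 12-, and 16-character prefixes listed under Aliases."""
    return {f"pith_short_{n}": number[:n] for n in (8, 12, 16)}


derived = pith_like_id(CANONICAL_HASH)
aliases = short_aliases(PITH_NUMBER)

print(derived == PITH_NUMBER)  # True for this record
print(aliases)
```

If the relationship holds in general, a short alias identifies a record by the leading bits of its canonical hash, so longer aliases are less likely to collide.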
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/HOUUIAHLWSP7HYOLD3RRG2DJ5M \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 3ba94400ebb49ff3e1cb1ee3136869eb237344f27f26b776e9424136433039e8
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "68b16c46458b620ed07d9079ad8e6553e60cbe250a2c2fdf8aefda6c1428b8eb",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-03T15:19:46Z",
    "title_canon_sha256": "e6de7b1ddb6c657d2ef4bca1b9644a204dcef7e60e77f74c1e2d2534c24f645b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12519",
    "kind": "arxiv",
    "version": 1
  }
}
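
The verification command above serializes the canonical record with sorted keys and compact separators, then takes the SHA-256 of the UTF-8 bytes. The same steps can be sketched offline in Python, using the record exactly as displayed on this page. The function name `canonical_hash` is hypothetical, and whether the displayed JSON reproduces the expected hash byte for byte is an assumption that the live endpoint should confirm.

```python
import hashlib
import json

# Canonical record as displayed on this page (copied by hand; an assumption
# that it is byte-equivalent to what the API returns).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "68b16c46458b620ed07d9079ad8e6553e60cbe250a2c2fdf8aefda6c1428b8eb",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-04-03T15:19:46Z",
        "title_canon_sha256": "e6de7b1ddb6c657d2ef4bca1b9644a204dcef7e60e77f74c1e2d2534c24f645b",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12519", "kind": "arxiv", "version": 1},
}


def canonical_hash(record: dict) -> str:
    """Serialize with sorted keys and no extra whitespace, then SHA-256 it."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # compact form, no spaces
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_hash(RECORD)
print(digest)  # compare with the Canonical hash listed in this record
```

Because keys are sorted before hashing, two mirrors that store the same record with different key order should still compute the same digest.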