Pith Number

pith:PAK2NUIA

pith:2026:PAK2NUIABZI2AWNA2VWHUGYKUL
not attested · not anchored · not stored · refs resolved

Reasoning Models Don't Just Think Longer, They Move Differently

Anders Gjølbye, Lars Kai Hansen, Sanmi Koyejo

After length correction, reasoning models show distinct hidden-state trajectories on harder problems.

arxiv:2605.15454 v1 · 2026-05-14 · cs.CL · cs.LG · stat.ML

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{PAK2NUIABZI2AWNA2VWHUGYKUL}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

After residualizing trajectory statistics on length, difficulty remains systematically coupled to corrected trajectory geometry across all domains studied. The clearest reasoning-specific separation appears in the code domain, where harder problems show more direct corrected trajectories and less heterogeneous local curvature in reasoning-trained models than in matched instruction-tuned baselines.

C2 · weakest assumption

That residualizing raw trajectory statistics on generation length successfully removes all mechanical length-induced effects and leaves only the component attributable to reasoning strategy or internal computation.

C3 · one-line summary

After length-correcting hidden-state trajectories during chain-of-thought, reasoning models show systematically different geometry on harder problems than baselines, strongest in competitive programming.
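The length correction named in C1 and C2 can be illustrated with a minimal sketch. This record does not state the regression form the authors use, so everything here is an assumption: an ordinary least-squares fit of a trajectory statistic on log generation length, with the residual taken as the "corrected" statistic. The function name `residualize_on_length` and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np


def residualize_on_length(stat, length):
    """Return the part of `stat` not linearly explained by log(length).

    Assumption: a linear fit on log-length with an intercept. The paper's
    actual correction may use a different functional form.
    """
    stat = np.asarray(stat, dtype=float)
    x = np.log(np.asarray(length, dtype=float))
    # Design matrix: intercept column plus log-length column.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, stat, rcond=None)
    # Residual = observed statistic minus the length-predicted value.
    return stat - X @ coef


# Synthetic illustration only: a statistic driven by length and by difficulty.
rng = np.random.default_rng(0)
length = rng.integers(50, 2000, size=200)
difficulty = rng.uniform(0.0, 1.0, size=200)
stat = 2.0 * np.log(length) + 0.5 * difficulty + rng.normal(0.0, 0.1, size=200)

corrected = residualize_on_length(stat, length)
print("corr(corrected, log length):", np.corrcoef(corrected, np.log(length))[0, 1])
print("corr(corrected, difficulty):", np.corrcoef(corrected, difficulty)[0, 1])
```

By construction, the residual is uncorrelated with the regressor it was fit on. The weakest assumption in C2 is exactly the gap this sketch leaves open: a linear fit removes only the linear length effect, so any nonlinear dependence on length would remain in the corrected statistic.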

References

14 extracted · 14 resolved · 2 Pith anchors

[1] Mitigating overthinking in large reasoning models via manifold steering
[2] Jolliffe and Jorge Cadima 2065
[3] Probing the difficulty perception mechanism of large language models
[4] LLMs encode how difficult problems are · arXiv:2510.18147
[5] Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters · arXiv:2408.03314

Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-20T00:00:59.437691Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

7815a6d1000e51a059a0d56c7a1b0aa2ea64bc9151b67e3e64bd2f69f0deedcc

Aliases

arxiv: 2605.15454 · arxiv_version: 2605.15454v1 · doi: 10.48550/arxiv.2605.15454 · pith_short_12: PAK2NUIABZI2 · pith_short_16: PAK2NUIABZI2AWNA · pith_short_8: PAK2NUIA
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/PAK2NUIABZI2AWNA2VWHUGYKUL \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7815a6d1000e51a059a0d56c7a1b0aa2ea64bc9151b67e3e64bd2f69f0deedcc
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "6987374dc451519ff851919f375b21ea8a49bf936b0529d23c427e96cd1aa356",
    "cross_cats_sorted": [
      "cs.LG",
      "stat.ML"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-14T22:37:33Z",
    "title_canon_sha256": "2ee461029fdeeeebb789b1d76858956f16d334d95a202ce589669959549b5ff3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15454",
    "kind": "arxiv",
    "version": 1
  }
}
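The hashing step of the shell one-liner in the Agent API section can also be run offline against the record printed above. The sketch below mirrors that one-liner's serialization (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes). The helper name `canonical_sha256` is hypothetical, and whether the displayed JSON is byte-for-byte what the API returns is an assumption, so the script reports match or mismatch rather than presuming the result.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way the page's shell one-liner does."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(canonical).hexdigest()


# The canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "6987374dc451519ff851919f375b21ea8a49bf936b0529d23c427e96cd1aa356",
        "cross_cats_sorted": ["cs.LG", "stat.ML"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-14T22:37:33Z",
        "title_canon_sha256": "2ee461029fdeeeebb789b1d76858956f16d334d95a202ce589669959549b5ff3",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15454", "kind": "arxiv", "version": 1},
}

# Hash published on this page under "Canonical hash".
EXPECTED = "7815a6d1000e51a059a0d56c7a1b0aa2ea64bc9151b67e3e64bd2f69f0deedcc"

digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores the record's fields, only on their names and values.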