Pith Number

pith:XWKVJIXS

pith:2025:XWKVJIXSLVYNKHWXKRVTXQIOU2

not attested · not anchored · not stored · refs resolved

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Mehrdad Farajtabar, Parshin Shojaee, Samy Bengio

Large Reasoning Models exhibit complete accuracy collapse beyond certain complexities and reduce reasoning effort despite available compute.

arxiv:2506.06941 v3 · 2025-06-07 · cs.AI · cs.CL · cs.LG


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{XWKVJIXSLVYNKHWXKRVTXQIOU2}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp

2. Internet Archive

3. Author claim: open

4. Citations: open

5. Replications: open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

LRMs face a complete accuracy collapse beyond certain complexities. Moreover, they exhibit a counterintuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having remaining token budget.

C2 · weakest assumption

That the chosen controllable puzzle environments provide an unbiased and generalizable measure of reasoning complexity without introducing artifacts that do not appear in other domains such as math or coding.

C3 · one-line summary

LRMs exhibit complete accuracy collapse beyond certain puzzle complexities; their reasoning effort rises and then declines, and they outperform standard LLMs only on medium-complexity tasks.

References

55 extracted · 55 resolved · 12 Pith anchors

[1] OpenAI o1 System Card · 2024 · arXiv:2412.16720

[2] Introducing OpenAI o1 · 2024

[3] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning · 2025 · arXiv:2501.12948

[4] Claude 3.7 Sonnet · 2025

[5] Gemini Flash Thinking · Google AI Blog · Jan 2025

Formal links

2 machine-checked theorem links

Cited by

34 papers in Pith

MadEvolve: Evolutionary Optimization of Trading Systems with Large Language Models

Robust Reasoning Benchmark

Artificial Phantasia: Emergent Mental Imagery in Large Language Models

Deep sequence models tend to memorize geometrically; it is unclear why

A Model Can Help Itself: Reward-Free Self-Training for LLM Reasoning

Receipt and verification

First computed	2026-05-17T23:38:50.945787Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

bd9554a2f25d70d51ed7546b3bc10ea6987dd4cbd948aa53f779b964a512b7c5
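In this record, the 26-character Pith Number coincides with the first 26 characters of the standard Base32 (RFC 4648) encoding of the canonical hash. This is an observation about this one record, not a rule stated on the page; the function name `base32_prefix` below is hypothetical and the sketch should not be read as the builder's actual derivation.

```python
import base64

# Canonical hash as printed on this record.
CANONICAL_HASH = "bd9554a2f25d70d51ed7546b3bc10ea6987dd4cbd948aa53f779b964a512b7c5"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest and keep the first `length` characters.

    Assumption: the identifier is a Base32 prefix of the hash. This matches
    the record shown here but is not confirmed by any specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = base32_prefix(CANONICAL_HASH)
print(derived)
```

If the relationship holds in general, a reader could sanity-check that a Pith Number and a canonical hash belong together without any network access.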

Aliases

arxiv: 2506.06941 · arxiv_version: 2506.06941v3 · doi: 10.48550/arxiv.2506.06941 · pith_short_12: XWKVJIXSLVYN · pith_short_16: XWKVJIXSLVYNKHWX · pith_short_8: XWKVJIXS
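The three `pith_short_*` aliases above are, for this record, plain prefixes of the full identifier at lengths 8, 12, and 16. A minimal sketch of that relationship, assuming the prefix rule inferred from this record applies; `short_aliases` is a hypothetical helper, not part of any Pith tooling.

```python
# Full identifier as printed on this record.
PITH_NUMBER = "XWKVJIXSLVYNKHWXKRVTXQIOU2"


def short_aliases(pith_number: str, lengths=(8, 12, 16)) -> dict:
    """Return prefix aliases keyed like the record's alias names.

    Assumption: each short alias is simply the first n characters of the
    full identifier, as observed in this record.
    """
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


aliases = short_aliases(PITH_NUMBER)
for name, value in aliases.items():
    print(name, value)
```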

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/XWKVJIXSLVYNKHWXKRVTXQIOU2 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: bd9554a2f25d70d51ed7546b3bc10ea6987dd4cbd948aa53f779b964a512b7c5
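The hashing step of the command above serializes the record with sorted keys, compact separators, and unescaped non-ASCII characters, then takes the SHA-256 of the UTF-8 bytes. The same step as a standalone Python function, offered as a sketch: `canonical_sha256` is a hypothetical name, the network fetch is left out, and the two small dictionaries are illustrative fragments rather than the full canonical record.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a JSON-compatible dict using the serialization from the one-liner."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters unescaped
    ).encode()                   # UTF-8 bytes
    return hashlib.sha256(payload).hexdigest()


# Illustrative fragments: same content, different key order.
a = {"schema_version": "1.0", "source": {"id": "2506.06941", "kind": "arxiv", "version": 3}}
b = {"source": {"version": 3, "kind": "arxiv", "id": "2506.06941"}, "schema_version": "1.0"}

digest_a = canonical_sha256(a)
digest_b = canonical_sha256(b)
print(digest_a == digest_b)
```

Sorting keys before hashing is what lets a mirror recompute the same digest regardless of how its JSON library orders object members.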

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "8a45099f14d045accff594ca13ca08c77d46017efad9a353a561b48d2641f330",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-06-07T22:42:29Z",
    "title_canon_sha256": "a0d32bd599754e05eb9948d06ed7aed1b2cdac8f3f64203a8c1b4e2a57a86a6c"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2506.06941",
    "kind": "arxiv",
    "version": 3
  }
}