Pith Number

pith:FSM4SJL7

pith:2023:FSM4SJL7JGJDTZTHH5WRBQZ3YN
not attested · not anchored · not stored · refs resolved

Reasoning with Language Model is Planning with World Model

Daisy Zhe Wang, Haodi Ma, Joshua Jiahua Hong, Shibo Hao, Yi Gu, Zhen Wang, Zhiting Hu

Language models can reason better by using themselves as world models and planning with tree search.

arxiv:2305.14992 v2 · 2023-05-24 · cs.CL · cs.AI · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{FSM4SJL7JGJDTZTHH5WRBQZ3YN}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

RAP on LLaMA-33B surpasses CoT on GPT-4 with a 33% relative improvement in a plan generation setting.

C2 weakest assumption

That the LLM, when prompted to act as a world model, produces sufficiently accurate state predictions and transition simulations to guide search without compounding errors that invalidate the planning process.

C3 one line summary

RAP turns LLMs into dual world-model and planning agents via MCTS to generate better reasoning paths, outperforming CoT baselines and achieving 33% relative gains over GPT-4 CoT using LLaMA-33B on plan generation.
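The summary above describes planning with Monte Carlo Tree Search (MCTS) over states predicted by a world model. The following is a minimal sketch of that general idea, not the paper's implementation: the toy task (reach a target number using "+1" and "*2"), the `world_model` and `reward` functions, and all constants are hypothetical stand-ins for what an LLM would provide in RAP.

```python
import math
import random

# Hypothetical toy task: reach TARGET from a start number within MAX_DEPTH steps.
TARGET, MAX_DEPTH = 10, 5
ACTIONS = ("+1", "*2")

def world_model(state, action):
    """Stand-in for the LLM world model: predicts the next state."""
    return state + 1 if action == "+1" else state * 2

def reward(state):
    """Stand-in reward: 1.0 if the goal state is reached, else 0.0."""
    return 1.0 if state == TARGET else 0.0

class Node:
    def __init__(self, state, depth, parent=None, action=None):
        self.state, self.depth = state, depth
        self.parent, self.action = parent, action
        self.children = []
        self.visits, self.value = 0, 0.0

    def is_terminal(self):
        # Stop at the goal, on overshoot, or when the step budget is used up.
        return self.state >= TARGET or self.depth >= MAX_DEPTH

def uct(child, c=1.4):
    """Upper confidence bound: mean value plus an exploration bonus."""
    if child.visits == 0:
        return math.inf
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return exploit + explore

def rollout(state, depth, rng):
    """Simulate random actions through the world model until termination."""
    while state < TARGET and depth < MAX_DEPTH:
        state = world_model(state, rng.choice(ACTIONS))
        depth += 1
    return reward(state)

def mcts(root_state, iterations, seed=0):
    rng = random.Random(seed)
    root = Node(root_state, 0)
    for _ in range(iterations):
        node = root
        # Selection: descend by UCT until reaching an unexpanded node.
        while node.children:
            node = max(node.children, key=uct)
        # Expansion: add one child per action, then pick one to simulate.
        if not node.is_terminal():
            node.children = [
                Node(world_model(node.state, a), node.depth + 1, node, a)
                for a in ACTIONS
            ]
            node = rng.choice(node.children)
        # Simulation and backpropagation.
        value = rollout(node.state, node.depth, rng)
        while node is not None:
            node.visits += 1
            node.value += value
            node = node.parent
    # Read out a plan by following the most-visited child at each level.
    plan, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
        plan.append(node.action)
    return plan, node.state

plan, final_state = mcts(root_state=1, iterations=2000)
print(plan, final_state)
```

In this sketch the transition and reward functions are exact, so errors cannot compound; the weakest assumption listed above concerns precisely the case where an LLM replaces them with approximate predictions.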

References

134 extracted · 134 resolved · 31 Pith anchors

[1] Alan Baddeley. 1992. Working memory. Science, 255(5044):556–559
[2] Robert Eamon Briscoe. 2011. Mental imagery and the varieties of amodal perception. Pacific Philosophical Quarterly, 92(2):153–173
[3] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot lear
[5] Tom Bylander. 1994. The computational complexity of propositional STRIPS planning. Artificial Intelligence, 69(1-2):165–204
[6] Eduardo F Camacho and Carlos Bordons Alba. 2013. Model predictive control. Springer Science & Business Media

Formal links

2 machine-checked theorem links

Cited by

30 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:45.931891Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

2c99c9257f499239e6673f6d10c33bc37c8b61b0eb06d05048a5b7445fe822e0

Aliases

arxiv: 2305.14992 · arxiv_version: 2305.14992v2 · doi: 10.48550/arxiv.2305.14992 · pith_short_12: FSM4SJL7JGJD · pith_short_16: FSM4SJL7JGJDTZTH · pith_short_8: FSM4SJL7
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/FSM4SJL7JGJDTZTHH5WRBQZ3YN \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 2c99c9257f499239e6673f6d10c33bc37c8b61b0eb06d05048a5b7445fe822e0
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "bcb89edfc5c0269c1257a96a4849fdbac48a9efca4fe11252d16014928c28beb",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-05-24T10:28:28Z",
    "title_canon_sha256": "0f69912f03a536037278f3398eb9a7514173bb9f5528928c09371ee69d22f828"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2305.14992",
    "kind": "arxiv",
    "version": 2
  }
}
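The canonical record above can also be hashed offline. This is a minimal sketch that applies the same canonicalization as the one-liner in the verification command (sorted keys, compact separators, UTF-8 bytes, SHA-256); the function name `canonical_sha256` and the variable `RECORD_JSON` are illustrative, and the assumption is that the record shown on this page is byte-equivalent to what the API returns. If that holds, the printed digest should match the canonical hash listed above.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "bcb89edfc5c0269c1257a96a4849fdbac48a9efca4fe11252d16014928c28beb",
    "cross_cats_sorted": ["cs.AI", "cs.LG"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-05-24T10:28:28Z",
    "title_canon_sha256": "0f69912f03a536037278f3398eb9a7514173bb9f5528928c09371ee69d22f828"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2305.14992",
    "kind": "arxiv",
    "version": 2
  }
}
"""

def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on this page
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the input JSON, which is what lets a mirror recompute the same value from its own copy.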