Pith Number

pith:LCUU7O5A

pith:2023:LCUU7O5ALQ563DCZA5WAUANSDX
not attested · not anchored · not stored · refs resolved

Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory

Bin Li, Chenxin Tao, Chenyu Yang, Gao Huang, Hao Tian, Jifeng Dai, Lewei Lu, Weijie Su, Xiaogang Wang, Xizhou Zhu, Yuntao Chen, Yu Qiao, Zhaoxiang Zhang

Large language models with text memory and knowledge let agents complete Minecraft's full Overworld item tree for the first time.

arxiv:2305.17144 v2 · 2023-05-25 · cs.AI · cs.CL · cs.CV · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{LCUU7O5ALQ563DCZA5WAUANSDX}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

The resulting LLM-based agent markedly surpasses previous methods, achieving a remarkable improvement of +47.5% in success rate on the 'ObtainDiamond' task... Notably, our agent is the first to procure all items in the Minecraft Overworld technology tree.

C2 · weakest assumption

That large language models already contain sufficient logic and common sense to generate reliable long-horizon action plans for sparse-reward open-world tasks when given only text-based state and memory.

C3 · one-line summary

GITM uses LLMs to generate action plans from text knowledge and memory, enabling agents to complete long-horizon Minecraft tasks at much higher success rates than prior RL methods.

References

45 extracted · 45 resolved · 10 Pith anchors


Formal links

3 machine-checked theorem links

Cited by

30 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:50.568128Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

58a94fbba05c3bed8c59076c0a01b21dc69d86607109339740074a541adad37a

Aliases

arxiv: 2305.17144 · arxiv_version: 2305.17144v2 · doi: 10.48550/arxiv.2305.17144 · pith_short_12: LCUU7O5ALQ56 · pith_short_16: LCUU7O5ALQ563DCZ · pith_short_8: LCUU7O5A
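The identifiers in this record appear to be related in a simple way. Each `pith_short_*` alias is a prefix of the full 26-character Pith Number, and the full number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash above. The Python sketch below reproduces that relationship for this record. It is an observation about the values shown on this page, not a documented derivation rule, and the function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Canonical hash copied from this record.
CANONICAL_HASH = "58a94fbba05c3bed8c59076c0a01b21dc69d86607109339740074a541adad37a"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this matches the values on this page; the official rule may differ.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


def short_aliases(pith_number: str) -> dict:
    """Build the short aliases as plain prefixes of the full number."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


full = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(full)
print(full)
print(aliases)
```

Comparing the printed values with the Pith Number and the alias list in this record is a quick consistency check that needs no network access.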
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/LCUU7O5ALQ563DCZA5WAUANSDX \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 58a94fbba05c3bed8c59076c0a01b21dc69d86607109339740074a541adad37a
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ee054fdaaed73557e9b02b1a341700096ab61fa51c5b6f8e623a1a75e2e287fd",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CV",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2023-05-25T17:59:49Z",
    "title_canon_sha256": "c6a6442993f95f7eff12801cee554418aba6e2e9c6bee3b86e6177fc9e258ac1"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2305.17144",
    "kind": "arxiv",
    "version": 2
  }
}
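The shell one-liner under "Verify this Pith Number yourself" re-serializes the record with sorted keys and compact separators, then takes the SHA-256 of the UTF-8 bytes. The same step can be sketched offline in Python using the record printed above. The function name `canonical_sha256` is hypothetical, and the sketch assumes the JSON shown here is the complete `canonical_record`; if the live record differs, the digest will differ too.

```python
import hashlib
import json

# Expected value copied from the "Canonical hash" field of this record.
EXPECTED = "58a94fbba05c3bed8c59076c0a01b21dc69d86607109339740074a541adad37a"

# Record copied from "Canonical record JSON" (assumed to be complete).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "ee054fdaaed73557e9b02b1a341700096ab61fa51c5b6f8e623a1a75e2e287fd",
    "cross_cats_sorted": ["cs.CL", "cs.CV", "cs.LG"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2023-05-25T17:59:49Z",
    "title_canon_sha256": "c6a6442993f95f7eff12801cee554418aba6e2e9c6bee3b86e6177fc9e258ac1"
  },
  "schema_version": "1.0",
  "source": {"id": "2305.17144", "kind": "arxiv", "version": 2}
}
"""


def canonical_sha256(record) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches expected hash:", digest == EXPECTED)
```

Because the serialization sorts keys and fixes the separators, the digest does not depend on key order or whitespace in the input, which is what lets a mirror recompute the same value from its own copy of the record.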