Pith Number

pith:BOXZES2I

pith:2025:BOXZES2IUJO6M2D3ZJ2V7L2TJD
not attested · not anchored · not stored · refs resolved

Training Agents Inside of Scalable World Models

Danijar Hafner, Timothy Lillicrap, Wilson Yan

Dreamer 4 obtains diamonds in Minecraft by training reinforcement learning behaviors inside a world model learned from offline videos.

arxiv:2509.24527 v1 · 2025-09-29 · cs.AI · cs.LG · cs.RO · stat.ML

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{BOXZES2IUJO6M2D3ZJ2V7L2TJD}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
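The property described above, that any mirror recomputes the same state from the same events, can be illustrated with a minimal sketch. The event fields (`id`, `time`, `field`, `value`) and the last-writer-wins rule are hypothetical and chosen only for illustration; the actual Pith merge algorithm and bundle format are not specified here, and signature checking is omitted.

```python
from typing import Iterable


def merge_events(events: Iterable[dict]) -> dict:
    """Fold events into a state that does not depend on arrival order.

    Hypothetical event shape: {"id", "time", "field", "value"}.
    Assumes an id uniquely identifies an event's content.
    """
    # Drop duplicates: a mirror may receive the same event more than once.
    unique = {event["id"]: event for event in events}
    # Apply in a total order (time, then id as a tie-breaker), so every
    # mirror replays the events in exactly the same sequence.
    ordered = sorted(unique.values(), key=lambda e: (e["time"], e["id"]))
    state: dict = {}
    for event in ordered:
        state[event["field"]] = event["value"]  # last writer wins
    return state


# Illustrative events only; these are not taken from a real bundle.
events = [
    {"id": "e1", "time": "2026-05-17T23:38:53Z", "field": "author_claim", "value": "open"},
    {"id": "e2", "time": "2026-05-18T08:00:00Z", "field": "citations", "value": "open"},
    {"id": "e3", "time": "2026-05-19T12:00:00Z", "field": "author_claim", "value": "claimed"},
]
state = merge_events(events)
print(state)
```

Sorting on a key that is total over all events is what makes the result independent of the order in which a mirror happens to receive them.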

Claims

C1 strongest claim

By learning behaviors in imagination, Dreamer 4 is the first agent to obtain diamonds in Minecraft purely from offline data, without environment interaction.

C2 weakest assumption

The world model accurately predicts object interactions and game mechanics over the long action sequences required for the diamond task.

C3 one line summary

Dreamer 4 is the first agent to obtain diamonds in Minecraft from only offline data by reinforcement learning inside a scalable world model that accurately predicts game mechanics.

References

84 extracted · 84 resolved · 26 Pith anchors

[1] Mastering diverse control tasks through world models. Nature, pages 1–7, 2025
[2] DayDreamer: World models for physical robot learning 2023
[3] TD-MPC2: Scalable, Robust World Models for Continuous Control 2023 · arXiv:2310.16828
[4] Diffusion for world modeling: Visual details matter in Atari. Advances in Neural Information Processing Systems, 37:58757–58791, 2024
[5] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model 2019 · arXiv:1911.08265

Formal links

2 machine-checked theorem links

Cited by

38 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:53.862693Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

0baf924b48a25de6687bca755faf5348ee903c9ac35a648ddda7a30a201094cb

Aliases

arxiv: 2509.24527 · arxiv_version: 2509.24527v1 · doi: 10.48550/arxiv.2509.24527 · pith_short_12: BOXZES2IUJO6 · pith_short_16: BOXZES2IUJO6M2D3 · pith_short_8: BOXZES2I
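The short aliases listed above are prefixes of the full 26-character Pith Number. The sketch below checks that, and also checks an observation about this particular record: the number coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. That second relationship is inferred from the values on this page, not from any stated rule, so treat it as an assumption. The helper names are hypothetical.

```python
import base64

PITH_NUMBER = "BOXZES2IUJO6M2D3ZJ2V7L2TJD"
CANONICAL_HASH = "0baf924b48a25de6687bca755faf5348ee903c9ac35a648ddda7a30a201094cb"


def short_alias(pith_number: str, length: int) -> str:
    """Return the leading `length` characters of a Pith Number."""
    return pith_number[:length]


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest (RFC 4648 alphabet) and keep a prefix."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Short aliases as listed on the page: 8, 12 and 16 characters.
aliases = {n: short_alias(PITH_NUMBER, n) for n in (8, 12, 16)}
print(aliases)

# Observation for this record only (assumed, not a documented rule).
print(base32_prefix(CANONICAL_HASH) == PITH_NUMBER)
```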
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/BOXZES2IUJO6M2D3ZJ2V7L2TJD \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0baf924b48a25de6687bca755faf5348ee903c9ac35a648ddda7a30a201094cb
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "4b59d29c5076b42c6c37a85dc96c337c8730bf3be3b0297aa2e051b899671712",
    "cross_cats_sorted": [
      "cs.LG",
      "cs.RO",
      "stat.ML"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-09-29T09:42:27Z",
    "title_canon_sha256": "ca412486a9c960ce665f45febfd51c94c7d589cdfd63db76b8163230d2325929"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2509.24527",
    "kind": "arxiv",
    "version": 1
  }
}
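The same check as the shell one-liner can be run offline against a saved copy of the record. This is a minimal sketch that reuses the serialization settings from that command (sorted keys, compact separators, no ASCII escaping, UTF-8). It assumes the JSON shown above is exactly what the API returns as `canonical_record`; if it is not, the comparison will print `False`. The function name `canonical_sha256` is hypothetical.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the serialization from the shell one-liner."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Copied from the canonical record JSON shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "4b59d29c5076b42c6c37a85dc96c337c8730bf3be3b0297aa2e051b899671712",
        "cross_cats_sorted": ["cs.LG", "cs.RO", "stat.ML"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2025-09-29T09:42:27Z",
        "title_canon_sha256": "ca412486a9c960ce665f45febfd51c94c7d589cdfd63db76b8163230d2325929",
    },
    "schema_version": "1.0",
    "source": {"id": "2509.24527", "kind": "arxiv", "version": 1},
}

EXPECTED = "0baf924b48a25de6687bca755faf5348ee903c9ac35a648ddda7a30a201094cb"

digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON object's keys were written, only on their names and values.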