Pith Number

pith:INA3ZRYW

pith:2023:INA3ZRYWGUTA6PUQHZ6SAOKHQD

not attested · not anchored · not stored · refs resolved

TD-MPC2: Scalable, Robust World Models for Continuous Control

Nicklas Hansen, Hao Su, Xiaolong Wang

TD-MPC2 achieves significantly better performance than baselines on 104 continuous control tasks using one fixed set of hyperparameters.

arxiv:2310.16828 v2 · 2023-10-25 · cs.LG · cs.AI · cs.CV · cs.RO


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{INA3ZRYWGUTA6PUQHZ6SAOKHQD}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

We demonstrate that TD-MPC2 improves significantly over baselines across 104 online RL tasks spanning 4 diverse task domains, achieving consistently strong results with a single set of hyperparameters. We further show that agent capabilities increase with model and data size, and successfully train a single 317M parameter agent to perform 80 tasks across multiple task domains, embodiments, and action spaces.

C2 · weakest assumption

The reported gains rely on the assumption that the chosen 104 tasks and four domains are representative enough that a single hyperparameter set will continue to work when the method is applied to new, unseen continuous-control problems.

C3 · one line summary

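TD-MPC2 is an implicit world-model RL method that outperforms baselines on 104 tasks across four domains with a single hyperparameter configuration, and scales to a single 317M-parameter agent that performs 80 tasks across multiple domains, embodiments, and action spaces.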

References

162 extracted · 162 resolved · 12 Pith anchors

[1] Layer normalization 2016

[2] Video pretraining (vpt): Learning to act by watching unlabeled online videos 2022

[3] A distributional perspective on reinforcement learning 2017

[4] A markovian decision process 1957

[7] Language models are few-shot learners 2020

Formal links

3 machine-checked theorem links

Cited by

37 papers in Pith

GAF: Gaussian Action Field as a 4D Representation for Dynamic World Modeling in Robotic Manipulation

D2 Actor Critic: Diffusion Actor Meets Distributional Critic

Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own

EvolvingAgent: Curriculum Self-evolving Agent with Continual World Model for Long-Horizon Tasks

stable-worldmodel: A Platform for Reproducible World Modeling Research and Evaluation

Receipt and verification

First computed	2026-05-17T23:39:22.321510Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

4341bcc71635260f3e903e7d20394780cfa0e88df256df3aa5350dab84d495b6

Aliases

arxiv: 2310.16828 · arxiv_version: 2310.16828v2 · doi: 10.48550/arxiv.2310.16828 · pith_short_12: INA3ZRYWGUTA · pith_short_16: INA3ZRYWGUTA6PUQ · pith_short_8: INA3ZRYW
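
The identifiers on this page appear to be related. The three `pith_short_*` aliases are prefixes of the full Pith Number, and the full number matches the first 26 characters of the standard (RFC 4648) Base32 encoding of the canonical hash. The page does not state this derivation, so the sketch below is an observation about this one record, not a documented rule. `pith_from_hash` is a hypothetical helper name and is not part of any Pith API.

```python
import base64

# Values copied verbatim from this page.
CANONICAL_HASH = "4341bcc71635260f3e903e7d20394780cfa0e88df256df3aa5350dab84d495b6"
PITH_NUMBER = "INA3ZRYWGUTA6PUQHZ6SAOKHQD"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: Base32-encode the digest bytes and keep a prefix.

    The 26-character length is an assumption taken from the identifier
    shown on this page.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_from_hash(CANONICAL_HASH)

# Short aliases, assuming they are plain prefixes of the full identifier.
aliases = {n: PITH_NUMBER[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)  # True for this record
print(aliases)
```

For this record the derived string equals the published Pith Number, and the computed prefixes equal the `pith_short_8`, `pith_short_12`, and `pith_short_16` values listed above.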

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/INA3ZRYWGUTA6PUQHZ6SAOKHQD \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4341bcc71635260f3e903e7d20394780cfa0e88df256df3aa5350dab84d495b6

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "a2f737d999efdb1fbcc13897d2bbd2b8b944905f8270acce0ed123d3b92c7024",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CV",
      "cs.RO"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2023-10-25T17:57:07Z",
    "title_canon_sha256": "f2ad3264774b571271338ae467cb30dc4224dc960e43751b19fe7de5d69db4b3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2310.16828",
    "kind": "arxiv",
    "version": 2
  }
}
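
The verification command serializes the canonical record with sorted keys and compact separators before hashing it with SHA-256. The same step can be run offline against a local copy of the record, as in this minimal sketch. `canonical_sha256` is a hypothetical helper name, and the sketch assumes the record above is exactly what the resolver returns under `canonical_record`. Compare the printed digest with the canonical hash listed on this page.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same serialization as the curl one-liner:
    sorted keys, no whitespace between tokens, non-ASCII kept as UTF-8."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode()
    return hashlib.sha256(canonical).hexdigest()


# Local copy of the canonical record shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a2f737d999efdb1fbcc13897d2bbd2b8b944905f8270acce0ed123d3b92c7024",
        "cross_cats_sorted": ["cs.AI", "cs.CV", "cs.RO"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2023-10-25T17:57:07Z",
        "title_canon_sha256": "f2ad3264774b571271338ae467cb30dc4224dc960e43751b19fe7de5d69db4b3",
    },
    "schema_version": "1.0",
    "source": {"id": "2310.16828", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)
```

Because the keys are sorted before hashing, the digest does not depend on the order in which a mirror stores or emits the fields, which is what makes the check reproducible.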