Pith Number

pith:57MOY35R

pith:2024:57MOY35RKSWDA3PUOXJE5ICEAW
not attested · not anchored · not stored · refs resolved

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Gaoyue Zhou, Hengkai Pan, Lerrel Pinto, Yann LeCun

DINO-WM uses pre-trained DINOv2 patch features to build world models that support zero-shot planning from offline data.

arXiv:2411.04983v2 · 2024-11-07 · cs.RO · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{57MOY35RKSWDA3PUOXJE5ICEAW}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1: strongest claim

DINO-WM achieves zero-shot behavioral solutions at test time on six environments without expert demonstrations, reward modeling, or pre-learned inverse models, outperforming prior state-of-the-art work across diverse task families.

C2: weakest assumption

That predicting future DINOv2 patch features alone captures sufficient dynamics information to enable reliable planning without visual reconstruction or task-specific components.

C3: one-line summary

DINO-WM builds world models on pre-trained DINOv2 features to enable zero-shot planning from offline data without rewards or demonstrations.

References

131 extracted · 131 resolved · 27 Pith anchors

[1] Legged locomotion in challenging terrains using egocentric vision, 2022
[2] Self-supervised learning from images with a joint-embedding predictive architecture, 2023
[3] Nonlinear and adaptive control with applications, volume 187, 2008
[4] V-JEPA: Latent video prediction for visual representation learning, 2024
[5] RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control, 2023 · arXiv:2307.15818

Formal links

2 machine-checked theorem links

Cited by

27 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:13.635938Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

efd8ec6fb154ac306df475d24ea044059ae2ad452412b6a444c84a360b52da61

Aliases

arxiv: 2411.04983 · arxiv_version: 2411.04983v2 · doi: 10.48550/arxiv.2411.04983 · pith_short_12: 57MOY35RKSWD · pith_short_16: 57MOY35RKSWDA3PU · pith_short_8: 57MOY35R
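
The short aliases above are prefixes of the full Pith Number, and that number in turn looks like the unpadded Base32 (RFC 4648 alphabet) encoding of the canonical hash, cut to 26 characters. This is an observation from the values on this record, not a documented derivation rule. The function name `base32_prefix` and its `length` parameter are hypothetical. A minimal Python sketch under that assumption:

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "efd8ec6fb154ac306df475d24ea044059ae2ad452412b6a444c84a360b52da61"
PITH_NUMBER = "57MOY35RKSWDA3PUOXJE5ICEAW"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest and keep the first `length` characters.

    Assumption: this mirrors how the identifier relates to the hash on this
    page; it is not taken from the Pith schema documentation.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = base32_prefix(CANONICAL_HASH)
print("matches Pith Number:", full == PITH_NUMBER)

# The short aliases are plain prefixes of the same string.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {full[:n]}")
```

If the relationship holds for other records, a client could sanity-check an identifier against its canonical hash offline before contacting the API.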
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/57MOY35RKSWDA3PUOXJE5ICEAW \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: efd8ec6fb154ac306df475d24ea044059ae2ad452412b6a444c84a360b52da61
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "e58d581b062dd34ba2e4bdac3fef6fec60b35cb830b49b5caf1d525671d436f7",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2024-11-07T18:54:37Z",
    "title_canon_sha256": "40b94add53985efde7ec3e35e5d2747436cc6d0ef9d700f818ce229c4e7b9abb"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2411.04983",
    "kind": "arxiv",
    "version": 2
  }
}
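
The shell recipe under "Verify this Pith Number yourself" can also be mirrored offline. The sketch below applies the same serialization (sorted keys, compact separators, no ASCII escaping, UTF-8) to the record printed above and hashes it with SHA-256. The function name `canonical_sha256` is hypothetical, and the sketch assumes the record shown on this page is identical to the `canonical_record` field returned by the API.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize a record deterministically and return its SHA-256 hex digest."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# The canonical record as printed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "e58d581b062dd34ba2e4bdac3fef6fec60b35cb830b49b5caf1d525671d436f7",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2024-11-07T18:54:37Z",
        "title_canon_sha256": "40b94add53985efde7ec3e35e5d2747436cc6d0ef9d700f818ce229c4e7b9abb",
    },
    "schema_version": "1.0",
    "source": {"id": "2411.04983", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare with the value listed under "Canonical hash"
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores or emits the fields.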