Pith Number

pith:7T3FOS2R

pith:2025:7T3FOS2RQBBK6QUVHWCRVMNFMP
not attested · not anchored · not stored · refs resolved

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

Chelsea Finn, Jianyu Chen, Lucy Xiaoyang Shi, Yanjiang Guo

A controllable world model ranks robot policies and improves their success rate by 44.7% through imagined trajectories alone.

arxiv:2510.10125 v3 · 2025-10-11 · cs.RO · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{7T3FOS2RQBBK6QUVHWCRVMNFMP}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
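The page does not describe the merge algorithm itself, so the following Python sketch only illustrates the general idea of a deterministic merge: deduplicate events by a content hash, replay them in a total order that does not depend on arrival order, and fold them into a state. The event fields (`at`, `field`, `value`), the last-writer-wins fold, and the function names are all hypothetical and are not taken from Pith; signature checking is omitted.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content-derived identifier: SHA-256 of a canonical JSON serialization."""
    data = json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def merge(*event_sets: list) -> dict:
    """Fold events from any number of mirrors into one state.

    Hypothetical rule: order by (timestamp, event id) and let later events
    overwrite earlier ones for the same field.
    """
    # Deduplicate: the same event seen on two mirrors has the same id.
    unique = {event_id(e): e for events in event_sets for e in events}
    # Total order that ignores the order in which events arrived.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state = {}
    for _, event in ordered:
        state[event["field"]] = event["value"]
    return state


mirror_a = [{"at": "2026-01-01T00:00:00Z", "field": "citations", "value": "open"}]
mirror_b = [
    {"at": "2026-02-01T00:00:00Z", "field": "citations", "value": "closed"},
    {"at": "2026-01-01T00:00:00Z", "field": "citations", "value": "open"},
]
print(merge(mirror_a, mirror_b) == merge(mirror_b, mirror_a))
```

Because the replay order is derived only from event content, two mirrors holding the same set of events reach the same state regardless of how the events reached them.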

Claims

C1 · strongest claim

By synthesizing successful trajectories in imagination and using them for supervised fine-tuning, our approach can improve policy success by 44.7%.

C2 · weakest assumption

The generated trajectories are sufficiently accurate proxies for real-world dynamics on novel objects, instructions, and camera placements to enable reliable policy ranking and effective fine-tuning.

C3 · one line summary

A controllable world model trained on the DROID dataset generates consistent multi-view robot trajectories for over 20 seconds and improves generalist policy success rates by 44.7% via imagined trajectory fine-tuning.

References

56 extracted · 56 resolved · 32 Pith anchors

[1] Cosmos World Foundation Model Platform for Physical AI · arXiv:2501.03575
[2] RoboArena: Distributed real-world evaluation of generalist robot policies
[3] Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation · arXiv:2409.16283
[4] Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models · arXiv:2310.10639
[5] $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control · arXiv:2410.24164

Formal links

2 machine-checked theorem links

Cited by

27 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:49.505782Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

fcf6574b518042af42953d851ab1a563c626870a90cde4bf3aee4770022a874b

Aliases

arxiv: 2510.10125 · arxiv_version: 2510.10125v3 · doi: 10.48550/arxiv.2510.10125 · pith_short_12: 7T3FOS2RQBBK · pith_short_16: 7T3FOS2RQBBK6QUV · pith_short_8: 7T3FOS2R
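The identifiers on this page appear to be related: Base32-encoding the raw bytes of the canonical hash (RFC 4648 alphabet) and keeping the first 26 characters reproduces the Pith Number, and the `pith_short_*` aliases are its 8-, 12-, and 16-character prefixes. This relationship is inferred from the values shown here, not from a documented rule, so the Python sketch below is a consistency check only. The function name `pith_number_from_hash` and the default length of 26 are assumptions.

```python
import base64

CANONICAL_HASH = "fcf6574b518042af42953d851ab1a563c626870a90cde4bf3aee4770022a874b"
PITH_NUMBER = "7T3FOS2RQBBK6QUVHWCRVMNFMP"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: Base32 of the digest bytes, truncated to `length`."""
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


derived = pith_number_from_hash(CANONICAL_HASH)
# Short aliases as prefixes of the derived identifier.
aliases = {f"pith_short_{n}": derived[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)  # True
print(aliases["pith_short_8"])  # 7T3FOS2R
```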
Agent API
Verify this Pith Number yourself
# Fetch the record as JSON-LD, extract the canonical record, re-serialize it
# with sorted keys and compact separators, then SHA-256 the UTF-8 bytes.
curl -sH 'Accept: application/ld+json' https://pith.science/pith/7T3FOS2RQBBK6QUVHWCRVMNFMP \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: fcf6574b518042af42953d851ab1a563c626870a90cde4bf3aee4770022a874b
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "16dd01db064d2c62aee7b939ec6d294e3ef075f723ec58306d3e5f200272cc2c",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2025-10-11T09:13:10Z",
    "title_canon_sha256": "3019f14dbb11216663c9ef08481998aa772d3e8d4855973c51901e64ae6d4311"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2510.10125",
    "kind": "arxiv",
    "version": 3
  }
}
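The one-line check in the Agent API section can be spelled out as a small Python module, which may be easier to read and reuse. It applies the same serialization as that one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) before hashing. The function names `canonical_bytes`, `canonical_sha256`, and `verify` are illustrative rather than part of any Pith API, and whether the record as displayed above reproduces the published hash depends on it matching the served record byte for byte after re-serialization.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace, as UTF-8."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def verify(record_json: str, expected_hex: str) -> bool:
    """Compare a JSON record's canonical digest with an expected hex digest."""
    return canonical_sha256(json.loads(record_json)) == expected_hex.lower()


# Key order and whitespace in the input do not affect the canonical form.
print(canonical_bytes({"b": 1, "a": {"d": 2, "c": 3}}))
# b'{"a":{"c":3,"d":2},"b":1}'
```

To check this record, pass the JSON text of `canonical_record` and the canonical hash listed above to `verify`.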