pith. machine review for the scientific record.
Pith Number

pith:AWTLHGQF

pith:2023:AWTLHGQFMS6UY7JHLKNKIPLENX
not attested · not anchored · not stored · refs resolved

Scaling Robot Learning with Semantically Imagined Experience

Anthony Brohan, Austin Stone, Brian Ichter, Clayton Tan, Dee M, Fei Xia, Jaspiar Singh, Jodilyn Peralta, Jonathan Tompson, Karol Hausman, Su Wang, Ted Xiao, Tianhe Yu

Robot policies trained on data augmented by text-to-image inpainting solve unseen tasks with new objects and resist novel distractors.

arxiv:2302.11550 v1 · 2023-02-22 · cs.RO · cs.AI · cs.CL · cs.CV · cs.LG

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Manipulation policies trained on data augmented this way can solve completely unseen tasks with new objects and behave more robustly with respect to novel distractors.

C2 weakest assumption

The inpainted images generated by the text-to-image diffusion model are sufficiently realistic and physically plausible that policies trained on them transfer successfully to real-world robot execution without introducing harmful artifacts or distribution shifts.

C3 one-line summary

Augmenting robot datasets via diffusion-based semantic inpainting enables manipulation policies to solve unseen tasks with new objects and improves robustness to novel distractors.

References

78 extracted · 78 resolved · 21 Pith anchors

[1] VIMA: General robot manipulation with multimodal prompts 2022
[2] RT-1: Robotics Transformer for Real-World Control at Scale 2022 · arXiv:2212.06817
[3] M. Shridhar, L. Manuelli, and D. Fox. CLIPort: What and where pathways for robotic manipulation. In Conference on Robot Learning, 2022
[4] Perceiver-Actor: A multi-task transformer for robotic manipulation 2022
[5] Hierarchical Text-Conditional Image Generation with CLIP Latents 2022 · arXiv:2204.06125

Formal links

2 machine-checked theorem links

Cited by

19 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:13.345311Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

05a6b39a0564bd4c7d275a9aa43d646dc49a7801a6ef9b2c37e9b7a43ae97c66

Aliases

arxiv: 2302.11550 · arxiv_version: 2302.11550v1 · doi: 10.48550/arxiv.2302.11550 · pith_short_12: AWTLHGQFMS6U · pith_short_16: AWTLHGQFMS6UY7JH · pith_short_8: AWTLHGQF
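The short aliases above are nested prefixes of the full Pith Number. For this record, the Pith Number also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The page does not state that derivation, so treat it as an observation about this one record rather than a documented rule. The helper name `pith_number_from_hash` is hypothetical. A minimal Python sketch under that assumption:

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "05a6b39a0564bd4c7d275a9aa43d646dc49a7801a6ef9b2c37e9b7a43ae97c66"
PITH_NUMBER = "AWTLHGQFMS6UY7JHLKNKIPLENX"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation (not confirmed by the source):
    base32-encode the raw digest bytes and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_number_from_hash(CANONICAL_HASH)

# The pith_short_8 / pith_short_12 / pith_short_16 aliases are plain prefixes.
short_aliases = {n: derived[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)
print(short_aliases)
```

If the assumption holds for other records, a mirror could recompute a Pith Number from the canonical hash alone, without contacting the service.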
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/AWTLHGQFMS6UY7JHLKNKIPLENX \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 05a6b39a0564bd4c7d275a9aa43d646dc49a7801a6ef9b2c37e9b7a43ae97c66
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "55904d64b7891cd5f5b1000f866ce83187fa3ee4c80a632c6b1382e7ba4fc268",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.CV",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2023-02-22T18:47:51Z",
    "title_canon_sha256": "416b99d59369f421d2a477ea51d7e169215b0c727e7612844bb23588525725bc"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2302.11550",
    "kind": "arxiv",
    "version": 1
  }
}
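The shell one-liner under "Verify this Pith Number yourself" canonicalizes the record (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) and hashes the result with SHA-256. The same steps can be sketched as a standalone Python function for checking a record that has already been saved locally. The function name `canonical_sha256` is hypothetical, and it is an assumption that the JSON printed above is the complete `canonical_record` the service hashes, so no expected digest is asserted here:

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes.
    Mirrors the json.dumps arguments used in the page's verification command."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Record as printed on this page (assumed complete).
record = {
    "metadata": {
        "abstract_canon_sha256": "55904d64b7891cd5f5b1000f866ce83187fa3ee4c80a632c6b1382e7ba4fc268",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.CV", "cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2023-02-22T18:47:51Z",
        "title_canon_sha256": "416b99d59369f421d2a477ea51d7e169215b0c727e7612844bb23588525725bc",
    },
    "schema_version": "1.0",
    "source": {"id": "2302.11550", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye with the "Canonical hash" value on this page
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror stores the fields, which is what makes the check reproducible across hosts.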