pith:B3MFR4N7
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
Generating human videos from web data lets a single robot policy manipulate unseen objects and novel motions without fine-tuning.
arxiv:2409.16283 v1 · 2024-09-24 · cs.RO · cs.CV · cs.LG · eess.IV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{B3MFR4N7YWZSXPIWNIYG62VWMM}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Our results on diverse real-world scenarios show how Gen2Act enables manipulating unseen object types and performing novel motions for tasks not present in the robot data.
The method assumes that videos generated by a pre-trained model from web data provide sufficiently accurate and transferable motion information for a robot policy to execute novel tasks without any fine-tuning of the video model or additional domain adaptation.
Gen2Act enables generalizable robot manipulation for unseen objects and novel motions by using zero-shot human video generation from web data to condition a policy trained on an order of magnitude less robot interaction data.
Receipt and verification
| First computed | 2026-05-17T23:38:52.572500Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
0ed858f1bfc5b32bbd166a306f6ab6632284e5e44e96a97f8ee3ab25c759d4b4
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/B3MFR4N7YWZSXPIWNIYG62VWMM \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0ed858f1bfc5b32bbd166a306f6ab6632284e5e44e96a97f8ee3ab25c759d4b4
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "34fb22a907aed15928b0a7e293e229a831eafa8c063e11d912c7ca183fdceb80",
    "cross_cats_sorted": [
      "cs.CV",
      "cs.LG",
      "eess.IV"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2024-09-24T17:57:33Z",
    "title_canon_sha256": "3f9ca604e7808f1291056c97bf44e06cc35ffd74c485e0535211086a76b907c9"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2409.16283",
    "kind": "arxiv",
    "version": 1
  }
}
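The verification one-liner above fetches the record and hashes it in a single pipeline. As a rough offline sketch of the hashing step alone, the same canonicalization (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding, SHA-256) can be applied to the record as printed on this page. The function name `canonical_sha256` is a hypothetical helper introduced here for illustration. Whether the record as transcribed here reproduces the listed canonical hash has not been checked, so the printed digest should be compared against that value by hand.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way the shell one-liner does.

    Hypothetical helper: serializes with sorted keys and compact
    separators, keeps non-ASCII characters unescaped, encodes as UTF-8,
    and returns the SHA-256 hex digest.
    """
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "34fb22a907aed15928b0a7e293e229a831eafa8c063e11d912c7ca183fdceb80",
        "cross_cats_sorted": ["cs.CV", "cs.LG", "eess.IV"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2024-09-24T17:57:33Z",
        "title_canon_sha256": "3f9ca604e7808f1291056c97bf44e06cc35ffd74c485e0535211086a76b907c9",
    },
    "schema_version": "1.0",
    "source": {"id": "2409.16283", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
# Compare this against the canonical hash listed on the page.
print(digest)
```

Because keys are sorted before hashing, the order in which the record's fields are written does not affect the digest; only the values and the list order inside `cross_cats_sorted` do.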