pith:AB5OYI3Q
Video models are zero-shot learners and reasoners
Generative video models like Veo 3 perform zero-shot object segmentation, edge detection, physics understanding, affordance recognition, tool simulation, and early visual reasoning such as maze and symmetry solving.
arxiv:2509.20328 v2 · 2025-09-24 · cs.LG · cs.AI · cs.CV · cs.RO
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{AB5OYI3QWC6J5DBHFP374UCGTC}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Veo 3 can solve a broad variety of tasks it wasn't explicitly trained for: segmenting objects, detecting edges, editing images, understanding physical properties, recognizing object affordances, simulating tool use, and more. These abilities enable early forms of visual reasoning like maze and symmetry solving.
These claims assume that the demonstrated capabilities are genuinely zero-shot and not the result of implicit task information in the prompts, data contamination, or post-hoc selection of successful examples.
Generative video models exhibit emergent zero-shot capabilities across perception, manipulation, and basic reasoning tasks.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-18T02:40:55.632229Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
007aec2370b0bc9e8c272bf7fe504698857211fc963f2d5295a18ea4842ad671
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/AB5OYI3QWC6J5DBHFP374UCGTC \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 007aec2370b0bc9e8c272bf7fe504698857211fc963f2d5295a18ea4842ad671
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "21c914ce64dfc70a482581f938246ca5411da0dccdb7629a162004911adf78f4",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CV",
      "cs.RO"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-09-24T17:17:27Z",
    "title_canon_sha256": "71d3cd4322b4002941d6b5b5741e8571148bd602b0cb7ea08d2b9a2ffff8c90e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2509.20328",
    "kind": "arxiv",
    "version": 2
  }
}
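The same check can be run offline against the record shown above, without `curl` or `jq`. The following is a minimal sketch, assuming the JSON printed on this page is exactly what the endpoint returns under `canonical_record`. The helper names `canonical_bytes` and `canonical_hash` are hypothetical and chosen for illustration. The serialization settings (sorted keys, compact separators, no ASCII escaping, UTF-8) are copied from the one-line verifier above.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section of this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "21c914ce64dfc70a482581f938246ca5411da0dccdb7629a162004911adf78f4",
        "cross_cats_sorted": ["cs.AI", "cs.CV", "cs.RO"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2025-09-24T17:17:27Z",
        "title_canon_sha256": "71d3cd4322b4002941d6b5b5741e8571148bd602b0cb7ea08d2b9a2ffff8c90e",
    },
    "schema_version": "1.0",
    "source": {"id": "2509.20328", "kind": "arxiv", "version": 2},
}

# Hash published on this page under "Canonical hash".
EXPECTED = "007aec2370b0bc9e8c272bf7fe504698857211fc963f2d5295a18ea4842ad671"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no whitespace, as in the shell verifier."""
    text = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(RECORD)
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the order in which fields appear in the fetched JSON does not affect the digest. Any change to a value, including whitespace inside a string, does.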