pith:3NSFLYB5
Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks
A single video-based model unifies multiple robot navigation tasks by standardizing their data formats.
arxiv:2412.06224 v2 · 2024-12-09 · cs.RO · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{3NSFLYB5XM5L4DIY4H65GBVNE4}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Uni-NaVid is the first video-based vision-language-action model designed to unify diverse embodied navigation tasks and enable seamless navigation for mixed long-horizon tasks in unseen real-world environments.
Harmonizing input and output data configurations across tasks allows effective integration and positive synergy in learning without loss of performance on individual tasks or introduction of negative interference.
Uni-NaVid unifies diverse embodied navigation tasks into one video-based vision-language-action model trained on 3.6 million samples from four sub-tasks, achieving state-of-the-art performance on benchmarks and real-world tests.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:46.784204Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
db6455e03dbb3abe0d18e1fdd306ad272fa57104b1f13a1816e9da3eaae1b047
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/3NSFLYB5XM5L4DIY4H65GBVNE4 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: db6455e03dbb3abe0d18e1fdd306ad272fa57104b1f13a1816e9da3eaae1b047
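The canonicalization step in the one-liner above can also be written as a small standalone function, which may be easier to read and reuse. The following is a minimal sketch that mirrors the same `json.dumps` settings (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding); the function name `canonical_sha256` and the two sample dictionaries are illustrative assumptions, not part of the Pith tooling.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same settings as the one-liner above."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Illustrative records only (hypothetical, not the full canonical record).
a = {"source": {"kind": "arxiv", "id": "2412.06224", "version": 2}, "schema_version": "1.0"}
b = {"schema_version": "1.0", "source": {"version": 2, "id": "2412.06224", "kind": "arxiv"}}

# The same content in a different key order yields the same digest.
print(canonical_sha256(a) == canonical_sha256(b))  # True
```

To check the real record, pass the parsed `canonical_record` object to the function and compare the result with the canonical hash listed above.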
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "9678bece30c07e9689b0428da3a3ad5864662f0e3e9873458f46b86eb5418661",
    "cross_cats_sorted": [
      "cs.CV"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2024-12-09T05:55:55Z",
    "title_canon_sha256": "6de18cdeccb65d161bb2f2cf81f80abd312ccf44a0191d91de55ac49a7636abb"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2412.06224",
    "kind": "arxiv",
    "version": 2
  }
}
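The fields of the canonical record line up with the header of this page (arXiv identifier, version, and categories). As a minimal sketch of reading them, the snippet below parses a trimmed copy of the record containing only the fields it uses; the variable names are illustrative assumptions, and the `v`-suffixed identifier follows the usual arXiv versioning convention.

```python
import json

# Trimmed copy of the record above, limited to the fields read below.
RECORD_JSON = """
{
  "metadata": {"cross_cats_sorted": ["cs.CV"], "primary_cat": "cs.RO"},
  "schema_version": "1.0",
  "source": {"id": "2412.06224", "kind": "arxiv", "version": 2}
}
"""

record = json.loads(RECORD_JSON)
source = record["source"]
metadata = record["metadata"]

# Versioned arXiv identifier, e.g. for building a link to the exact revision.
versioned_id = f'{source["id"]}v{source["version"]}'

# Primary category first, then the sorted cross-listed categories.
categories = [metadata["primary_cat"], *metadata["cross_cats_sorted"]]

print(versioned_id)  # 2412.06224v2
print(categories)    # ['cs.RO', 'cs.CV']
```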