pith:HHQVCX47
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents
MLLMs excel at high-level embodied tasks, but even the best model scores only 28.9 percent on average on low-level manipulation.
arxiv:2502.09560 v3 · 2025-02-13 · cs.AI · cs.CL · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{HHQVCX474M63G67R7LUOLOWFX2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
MLLMs excel at high-level tasks but struggle with low-level manipulation, with the best model, GPT-4o, scoring only 28.9% on average.
Assumption: performance in the four chosen simulated environments and the six curated capability subsets accurately reflects real-world embodied agent challenges.
EmbodiedBench is a new evaluation framework for MLLM-based embodied agents; across 24 tested models, it finds strong high-level reasoning but weak low-level manipulation performance.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:46.118759Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
39e1515f9fe33db37bf1fae8e5bac5be8bd65ad66c2761748c703d1323a40c9e
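The Pith Number shown on this page appears to be derived from the canonical hash: base32-encoding the SHA-256 digest (RFC 4648 alphabet) and keeping the first 26 characters reproduces `HHQVCX474M63G67R7LUOLOWFX2`, and the first 8 characters give the short form `HHQVCX47`. This relationship is inferred from the values on this record, not from Pith documentation, so treat the sketch below as an observation rather than the registry's specified algorithm. The function name `base32_prefix` is hypothetical.

```python
import base64

# Canonical hash copied from this record.
CANONICAL_HASH = "39e1515f9fe33db37bf1fae8e5bac5be8bd65ad66c2761748c703d1323a40c9e"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Return the first `length` characters of the base32 encoding of a hex digest.

    Assumption: the 26-character truncation is inferred from this record only.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


print(base32_prefix(CANONICAL_HASH))     # HHQVCX474M63G67R7LUOLOWFX2
print(base32_prefix(CANONICAL_HASH, 8))  # HHQVCX47
```

If this holds for other records as well, the identifier is a plain truncation of the digest, so it can be recomputed offline from the canonical hash alone.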
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/HHQVCX474M63G67R7LUOLOWFX2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 39e1515f9fe33db37bf1fae8e5bac5be8bd65ad66c2761748c703d1323a40c9e
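The one-liner above can be unpacked into a small reusable function. The sketch below mirrors the same serialization choices (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) and then hashes with SHA-256. The function names `canonical_sha256` and `matches` are hypothetical, and the demo record is a made-up placeholder, not this page's record.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same serialization as the shell one-liner."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def matches(record: dict, expected_hex: str) -> bool:
    """Compare a recomputed hash with a published one, ignoring case."""
    return canonical_sha256(record) == expected_hex.strip().lower()


# Placeholder records: same content, different key order.
a = {"source": {"kind": "arxiv", "version": 3}, "schema_version": "1.0"}
b = {"schema_version": "1.0", "source": {"version": 3, "kind": "arxiv"}}
print(canonical_sha256(a) == canonical_sha256(b))  # True
```

Sorting keys and fixing the separators is what makes the hash independent of how the JSON happens to be formatted or ordered when it is fetched.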
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "12cbdefdd1c0e9a1766ce06a931d3e50de0b2822ea302800d0d69603632a1381",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CV"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-02-13T18:11:34Z",
    "title_canon_sha256": "f197cab08bcf541567652650012478e17043804ab59a7cb2839f5e4a50c9323a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2502.09560",
    "kind": "arxiv",
    "version": 3
  }
}
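Before hashing a fetched record, it can help to check its shape. The following is a minimal sketch, assuming only what is visible in the record above: two 64-character hex digests, a sorted list of cross-listed categories, and an arXiv source with an id and version. The helper names `check_record` and `arxiv_label` are hypothetical, and these checks are illustrative rather than the schema's official validation rules.

```python
import re

HEX64 = re.compile(r"[0-9a-f]{64}")

# Record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "12cbdefdd1c0e9a1766ce06a931d3e50de0b2822ea302800d0d69603632a1381",
        "cross_cats_sorted": ["cs.CL", "cs.CV"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2025-02-13T18:11:34Z",
        "title_canon_sha256": "f197cab08bcf541567652650012478e17043804ab59a7cb2839f5e4a50c9323a",
    },
    "schema_version": "1.0",
    "source": {"id": "2502.09560", "kind": "arxiv", "version": 3},
}


def check_record(record: dict) -> list:
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    meta = record.get("metadata", {})
    for key in ("abstract_canon_sha256", "title_canon_sha256"):
        if not HEX64.fullmatch(str(meta.get(key, ""))):
            problems.append(f"{key} is not a 64-character lowercase hex digest")
    cats = meta.get("cross_cats_sorted", [])
    if cats != sorted(cats):
        problems.append("cross_cats_sorted is not sorted")
    source = record.get("source", {})
    if not isinstance(source.get("version"), int):
        problems.append("source.version is not an integer")
    return problems


def arxiv_label(record: dict) -> str:
    """Build the usual arXiv citation label from the source block."""
    source = record["source"]
    return f"arXiv:{source['id']}v{source['version']}"


print(check_record(RECORD))  # []
print(arxiv_label(RECORD))   # arXiv:2502.09560v3
```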