pith:BDDQ3SU5
Think Twice, Act Once: Verifier-Guided Action Selection For Embodied Agents
A test-time verifier trained on synthesized failures helps MLLM agents pick reliable actions from multiple candidates.
arxiv:2605.12620 v1 · 2026-05-12 · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{BDDQ3SU5C4FNMMBT2LSZ5F54N4}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Across embodied reasoning benchmarks spanning the Habitat and ALFRED environments, VeGAS consistently improves generalization, achieving up to a 36% relative performance gain over strong CoT baselines on the most challenging multi-object, long-horizon tasks.
Training a verifier on failure cases synthesized automatically by an LLM is claimed to produce a model that reliably identifies good actions in out-of-distribution scenarios where the base MLLM fails.
VeGAS improves MLLM-based embodied agents by sampling action ensembles and using a verifier trained on LLM-synthesized failure cases, yielding up to 36% relative gains on hard multi-object long-horizon tasks in Habitat and ALFRED.
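The selection step described in the claims above (sample several candidate actions, then let a verifier choose among them) can be illustrated with a minimal best-of-N sketch. This record does not specify the paper's sampling strategy, verifier architecture, or scoring rule, so every name here (`select_action`, `make_toy_policy`, `toy_verifier`) is hypothetical and the scores are made-up placeholders, not results.

```python
import itertools
from typing import Callable, Sequence


def select_action(
    observation: str,
    sample_action: Callable[[str], str],
    score_action: Callable[[str, str], float],
    num_candidates: int = 5,
) -> str:
    """Sample candidate actions and return the one the verifier scores highest.

    Assumption: plain best-of-N selection; the paper's actual procedure may differ.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    # Draw an ensemble of candidate actions from the (stochastic) policy.
    candidates = [sample_action(observation) for _ in range(num_candidates)]
    # Drop duplicates while keeping first-seen order, so each action is scored once.
    unique = list(dict.fromkeys(candidates))
    # The verifier assigns each candidate a score; keep the best one.
    return max(unique, key=lambda action: score_action(observation, action))


def make_toy_policy(actions: Sequence[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an MLLM policy: cycles through fixed actions."""
    cycle = itertools.cycle(actions)
    return lambda observation: next(cycle)


def toy_verifier(observation: str, action: str) -> float:
    """Hypothetical stand-in for a trained verifier: a fixed score table."""
    scores = {"pick up mug": 0.9, "open fridge": 0.4, "move forward": 0.1}
    return scores.get(action, 0.0)


if __name__ == "__main__":
    policy = make_toy_policy(["move forward", "pick up mug", "open fridge"])
    print(select_action("kitchen scene", policy, toy_verifier, num_candidates=3))
```

In a real agent, `sample_action` would call the MLLM with a nonzero sampling temperature and `score_action` would call the trained verifier; both are left as plain callables here so the selection logic stays independent of any particular model API.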
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-18T03:10:00.447469Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
08c70dca9d170ad63033d2e59e97bc6f28bb589f8e93b171468152d78bb4ad37
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/BDDQ3SU5C4FNMMBT2LSZ5F54N4 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 08c70dca9d170ad63033d2e59e97bc6f28bb589f8e93b171468152d78bb4ad37
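The hashing step of the one-liner above can be unpacked into a small Python function, which makes the canonicalization rules easier to see: keys sorted, no whitespace between separators, non-ASCII characters left unescaped, UTF-8 encoding, then SHA-256. This is a sketch that mirrors the command as shown; the function name `canonical_sha256` is an assumption, and fetching the record over HTTP is left out.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way the shell one-liner does."""
    # Canonical form: sorted keys, compact separators, raw (unescaped) Unicode.
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    # str.encode() defaults to UTF-8, matching the one-liner.
    return hashlib.sha256(canonical.encode()).hexdigest()


if __name__ == "__main__":
    # Key order and whitespace in the input do not affect the digest.
    first = json.loads('{"b": 1, "a": {"y": 2, "x": 3}}')
    second = json.loads('{"a":{"x":3,"y":2},"b":1}')
    print(canonical_sha256(first) == canonical_sha256(second))
```

To check a real record, pass the parsed `canonical_record` object to `canonical_sha256` and compare the result against the canonical hash listed on this page.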
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "357ae961b7ad0cd33cf52aba688de6cfeb3ccedb2b2a1e58e3e3dc61b353b0ae",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-12T18:08:24Z",
    "title_canon_sha256": "df1ac8fdff8401667f177b8bd831d1159e04ca4b32cb55a1738c26c46f7d3af5"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12620",
    "kind": "arxiv",
    "version": 1
  }
}