Pith Number

pith:FLL3MXXN

pith:2024:FLL3MXXNNADKFGBIVLKXUB5UQC
not attested · not anchored · not stored · refs resolved

Robotic Control via Embodied Chain-of-Thought Reasoning

Chelsea Finn, Karl Pertsch, Michał Zawalski, Oier Mees, Sergey Levine, William Chen

Embodied chain-of-thought (ECoT) reasoning trains vision-language-action models (VLAs) to reason about plans and grounded visual features before predicting actions, raising OpenVLA's absolute success rate by 28%.

arxiv:2407.08693 v3 · 2024-07-11 · cs.RO · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{FLL3MXXNNADKFGBIVLKXUB5UQC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

ECoT increases the absolute success rate of OpenVLA, the current strongest open-source VLA policy, by 28% across challenging generalization tasks, without any additional robot training data.

C2 · weakest assumption

The synthetic data generation pipeline produces reasoning traces that are both accurate enough to supervise the model and sufficiently diverse to improve generalization rather than overfitting to the generation heuristics.

C3 · one-line summary

Training VLAs to perform embodied chain-of-thought reasoning about plans, sub-tasks, motions, and grounded visual features before acting raises OpenVLA success rates by 28% on challenging generalization tasks without new robot data.

References

118 extracted · 118 resolved · 2 Pith anchors

[1] A. Agarwal, A. Kumar, J. Malik, and D. Pathak. Legged locomotion in challenging terrains using egocentric vision, 2022.
[2] T. Z. Zhao, V. Kumar, S. Levine, and C. Finn. Learning fine-grained bimanual manipulation with low-cost hardware, 2023.
[3] Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation 2024 · arXiv:2401.02117
[4] R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, E. Brynjolfsson, S. Buch, D. Card, R. Castellon, N. Chatterji, A. Chen, K. C 2022
[5] A. Brohan, N. Brown, J. Carbajal, Y. Chebotar, X. Chen, K. Choromanski, T. Ding, D. Driess, A. Dubey, C. Finn, P. Florence, C. Fu, M. G. Arenas, K. Gopalakrishnan, K. Han, K. Hausman, A. Herzog, J. H 2023

Formal links

2 machine-checked theorem links

Cited by

52 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:53.051031Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

2ad7b65eed6806a29828aad57a07b48081ced4cee339a187b7367e5d52c967da

Aliases

arxiv: 2407.08693 · arxiv_version: 2407.08693v3 · doi: 10.48550/arxiv.2407.08693 · pith_short_12: FLL3MXXNNADK · pith_short_16: FLL3MXXNNADKFGBI · pith_short_8: FLL3MXXN
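The page does not say how the Pith Number relates to the canonical hash, but the values shown here can be checked against each other. The sketch below rests on an observation, not on documented behavior: the 26-character identifier equals the first 26 characters of the RFC 4648 Base32 encoding of the SHA-256 digest, and the short aliases are prefixes of it. The function name `pith_from_hash` is hypothetical and chosen only for illustration.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "2ad7b65eed6806a29828aad57a07b48081ced4cee339a187b7367e5d52c967da"
PITH_NUMBER = "FLL3MXXNNADKFGBIVLKXUB5UQC"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier is derived. It matches the
    values on this page, but the page itself does not document the rule.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full = pith_from_hash(CANONICAL_HASH)
print(full)                 # FLL3MXXNNADKFGBIVLKXUB5UQC
print(full == PITH_NUMBER)  # True

# The pith_short_* aliases are prefixes of the full identifier.
for n in (8, 12, 16):
    print(n, full[:n])
```

Under this reading, the 26 characters carry the first 130 bits of the digest, so the identifier is a truncation of the hash and not a replacement for it.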
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/FLL3MXXNNADKFGBIVLKXUB5UQC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 2ad7b65eed6806a29828aad57a07b48081ced4cee339a187b7367e5d52c967da
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "5654583cd71b041baba26beae7e5844482c33556f0cc52ddf81e0b4dc4d3ae51",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2024-07-11T17:31:01Z",
    "title_canon_sha256": "ab4d44c4fd049cf8a78071f32caa91ce066d78cd9a1a1a36d75e8876fdf214b1"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2407.08693",
    "kind": "arxiv",
    "version": 3
  }
}
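The shell one-liner under "Verify this Pith Number yourself" can be unpacked into a small offline script. The sketch below copies the record shown above into a dict and applies the same serialization as the one-liner (sorted keys, compact separators, unescaped UTF-8) before hashing. The helper names `canonical_bytes` and `canonical_sha256` are hypothetical, and the script only reports whether the digest matches; it does not assume that the JSON as displayed on this page reproduces the published hash.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section above.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "5654583cd71b041baba26beae7e5844482c33556f0cc52ddf81e0b4dc4d3ae51",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2024-07-11T17:31:01Z",
        "title_canon_sha256": "ab4d44c4fd049cf8a78071f32caa91ce066d78cd9a1a1a36d75e8876fdf214b1",
    },
    "schema_version": "1.0",
    "source": {"id": "2407.08693", "kind": "arxiv", "version": 3},
}

# Hash published on this page.
EXPECTED = "2ad7b65eed6806a29828aad57a07b48081ced4cee339a187b7367e5d52c967da"


def canonical_bytes(record: dict) -> bytes:
    """Serialize exactly as the one-liner does: sorted keys, no whitespace, raw UTF-8."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


digest = canonical_sha256(RECORD)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Sorting keys and fixing the separators makes the digest independent of key order and pretty-printing, which is what lets a mirror recompute the same hash from a reformatted copy of the record.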