
arxiv: 2606.22142 · v1 · pith:U4KXGWENnew · submitted 2026-06-20 · 💻 cs.RO

RoboLineage: Agent-Native Data Lifecycle Governance Across Robot Policy Iterations

Pith reviewed 2026-06-26 11:43 UTC · model grok-4.3

classification 💻 cs.RO
keywords robot learning · data lifecycle management · policy iteration · lineage artifacts · embodied AI · agent systems · robot manipulation

The pith

RoboLineage represents each step in robot policy iteration as a typed lineage artifact so agents can manage the full data lifecycle.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces RoboLineage to make the data lifecycle in developing robot policies explicit rather than scattered across tools and memory. It does this by modeling rollouts, reviews, training runs, evaluations, and related decisions as typed artifacts with clear boundaries. Agents can then interpret robot rollout data, adapt datasets to training setups, and track state across iterations. This approach aims to speed up routine policy iteration in real-robot tasks while improving auditability and preserving performance. A sympathetic reader would care because better governance could reduce errors and time in embodied learning workflows.

Core claim

RoboLineage makes this lifecycle explicit by representing rollouts, reviews, dataset decisions, training runs, policy metadata, evaluations, deployment recommendations, and next-collection plans as typed lineage artifacts. Agents interpret embodied rollout evidence, adapt accepted data to existing training stacks, maintain data health, and summarize cross-iteration state under explicit artifact boundaries. In real-robot manipulation workflows, RoboLineage makes routine policy iteration faster and more auditable while maintaining downstream policy performance.

What carries the argument

Typed lineage artifacts represent the stages of robot policy iteration. They carry the argument by giving agents explicit boundaries for interpretation and action.
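To make the idea of a typed lineage artifact concrete, here is a minimal sketch in Python. It is not RoboLineage's actual schema or API: the names `ArtifactKind`, `LineageArtifact`, `LineageStore`, and the example ids are all hypothetical. Only the eight artifact kinds are taken from the abstract. The sketch shows one way typed records with explicit parent links could let an agent or auditor trace a training run back to the rollouts behind it.

```python
from dataclasses import dataclass, field
from enum import Enum


class ArtifactKind(Enum):
    # The eight lifecycle stages named in the paper's abstract.
    ROLLOUT = "rollout"
    REVIEW = "review"
    DATASET_DECISION = "dataset_decision"
    TRAINING_RUN = "training_run"
    POLICY_METADATA = "policy_metadata"
    EVALUATION = "evaluation"
    DEPLOYMENT_RECOMMENDATION = "deployment_recommendation"
    NEXT_COLLECTION_PLAN = "next_collection_plan"


@dataclass(frozen=True)
class LineageArtifact:
    # Hypothetical record layout; the real system may differ.
    artifact_id: str
    kind: ArtifactKind
    parents: tuple = ()  # ids of the artifacts this one was derived from
    payload: dict = field(default_factory=dict)


class LineageStore:
    """In-memory store that rejects artifacts with unknown parents."""

    def __init__(self):
        self._artifacts = {}

    def add(self, artifact):
        if artifact.artifact_id in self._artifacts:
            raise ValueError(f"duplicate artifact id: {artifact.artifact_id}")
        missing = [p for p in artifact.parents if p not in self._artifacts]
        if missing:
            raise KeyError(f"unknown parent artifacts: {missing}")
        self._artifacts[artifact.artifact_id] = artifact
        return artifact

    def ancestors(self, artifact_id):
        """Return all upstream artifact ids, nearest first (breadth-first)."""
        seen = []
        queue = list(self._artifacts[artifact_id].parents)
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self._artifacts[current].parents)
        return seen


# Illustrative ids only; none of these come from the paper.
store = LineageStore()
store.add(LineageArtifact("rollout-001", ArtifactKind.ROLLOUT))
store.add(LineageArtifact("review-001", ArtifactKind.REVIEW, parents=("rollout-001",)))
store.add(LineageArtifact("dataset-v2", ArtifactKind.DATASET_DECISION, parents=("review-001",)))
store.add(LineageArtifact("train-007", ArtifactKind.TRAINING_RUN, parents=("dataset-v2",)))
print(store.ancestors("train-007"))  # ['dataset-v2', 'review-001', 'rollout-001']
```

Rejecting unknown parents at insertion time is one simple way to enforce an artifact boundary: nothing enters the record unless its provenance is already on file.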

If this is right

  • Robot policy iteration becomes faster in real manipulation workflows.
  • The process gains auditability through explicit artifacts.
  • Downstream policy performance is maintained relative to workflows that do not use the system.
  • The system works as a lightweight layer across different robot embodiments and training families.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Teams using multiple robot platforms could standardize their data practices more easily with shared artifact types.
  • Future extensions might allow agents to autonomously decide on next data collection plans based on lineage summaries.
  • Comparing iteration logs with and without the system in the same task would test the speed and auditability gains directly.

Load-bearing premise

Modeling the lifecycle steps as typed lineage artifacts is what allows agents to interpret rollout evidence and adapt data within explicit boundaries.

What would settle it

An experiment on a real-robot manipulation task where using RoboLineage shows no improvement in iteration speed or auditability compared to conventional scattered tools and scripts.

Figures

Figures reproduced from arXiv: 2606.22142 by Nanchun Guo, Qian Luo, Wentao Guo, Yanchao Yang, Yi Ma, Yunhan Zhao, Zhennan Qin.

Figure 1. Manual policy iteration versus agent-governed lifecycle artifacts. To address this problem, we introduce RoboLineage, a lightweight, open-source, agent-native governance system that turns the messy data ecosystem behind robot policy iteration into an explicit data lifecycle. In this lifecycle, each rollout, review, dataset update, training run, evaluation, and deployment decision is interpreted, versio…
Figure 2. RoboLineage governs robot policy iteration as an explicit data lifecycle. The outer loop…
Figure 3. Rollout interpretation uses three lanes. Raw capture pre…
Figure 4. Policy success from shared 100-rollout candidate pools. Each workflow trains on 60 selected demonstrations and evaluates on 60 trials per task; error bars show binomial standard error. We ask whether RoboLineage can reduce expert-mediated lifecycle labor while matching the policy quality of an expert-managed workflow in the measured setting. For each of eight tasks distinct from the closed-loop studies, …
Figure 5. Pre-training lifecycle effort per 30-rollout cycle. RoboLineage does not shorten physical collection; it reduces review and integration labor…
Figure 6. Representative VSA/post-review plates for standard pick-place and container-transfer…
Figure 7. Representative VSA/post-review plates for thin-object extraction and object reorientation.
Figure 8. Representative VSA/post-review plates for precision insertion and long-horizon multi…
Figure 9. Robot platforms used in the evaluation. From left to right: ARX tabletop arm, Realman…
Figure 10. Stack red-on-blue closed-loop case. The plate follows three dominant failure stages…
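The Figure 4 caption reports error bars as binomial standard error over 60 trials per task. As a hedged illustration of what that quantity is, the sketch below computes SE = sqrt(p(1 - p) / n) for a success rate p over n trials. The count of 45 successes is an assumed, illustrative number and is not a result from the paper; `binomial_standard_error` is a hypothetical helper name.

```python
import math


def binomial_standard_error(successes, trials):
    """Return (success rate, binomial standard error) for a batch of trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p = successes / trials
    return p, math.sqrt(p * (1.0 - p) / trials)


# Illustrative numbers only: 45 successes is assumed, 60 trials matches the caption.
rate, se = binomial_standard_error(successes=45, trials=60)
print(f"success rate = {rate:.3f} +/- {se:.3f}")  # success rate = 0.750 +/- 0.056
```

With 60 trials the standard error is largest near a 50% success rate (about 0.065), so differences of a few percentage points between workflows sit within one error bar.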
read the original abstract

We present RoboLineage, an agent-native data lifecycle governance system for robot policy iteration. Modern robot policies improve through repeated data collection, review, retraining, evaluation, and release decisions, but the evidence connecting these steps is often scattered across local tools, scripts, and expert memory. RoboLineage makes this lifecycle explicit by representing rollouts, reviews, dataset decisions, training runs, policy metadata, evaluations, deployment recommendations, and next-collection plans as typed lineage artifacts. Agents interpret embodied rollout evidence, adapt accepted data to existing training stacks, maintain data health, and summarize cross-iteration state under explicit artifact boundaries. In real-robot manipulation workflows, RoboLineage makes routine policy iteration faster and more auditable while maintaining downstream policy performance. We open source RoboLineage as a lightweight lifecycle layer for different robot embodiments and training families. Project page: https://robolineage.github.io/

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript presents RoboLineage, an agent-native data lifecycle governance system for robot policy iterations. It models key elements such as rollouts, reviews, dataset decisions, training runs, policy metadata, evaluations, deployment recommendations, and next-collection plans as typed lineage artifacts. Agents are enabled to interpret embodied evidence, adapt data, maintain health, and summarize state. The paper claims that in real-robot manipulation workflows, this system accelerates routine policy iteration, enhances auditability, and maintains downstream policy performance. The implementation is open-sourced as a lightweight layer compatible with various robot embodiments and training families.

Significance. If the empirical claims hold, the typed lineage artifact approach could provide a practical, agent-interpretable structure for managing iterative data collection and training cycles in robotics, addressing fragmentation across tools and memory. The open-source release of the system as a lightweight layer is a concrete strength that supports reproducibility and adoption across embodiments and training stacks.

major comments (1)
  1. [Abstract] Abstract: The claim that 'in real-robot manipulation workflows, RoboLineage makes routine policy iteration faster and more auditable while maintaining downstream policy performance' is presented without any experimental results, quantitative metrics (e.g., iteration time, success rates), baselines, error bars, or case-study details. This absence renders the central empirical contribution unevaluable from the manuscript.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the concern regarding unsubstantiated empirical claims in the abstract below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The claim that 'in real-robot manipulation workflows, RoboLineage makes routine policy iteration faster and more auditable while maintaining downstream policy performance' is presented without any experimental results, quantitative metrics (e.g., iteration time, success rates), baselines, error bars, or case-study details. This absence renders the central empirical contribution unevaluable from the manuscript.

    Authors: We agree that the abstract makes an empirical claim about faster iteration, improved auditability, and maintained performance without supporting quantitative results, metrics, baselines, or case-study details. The manuscript is structured as a system description paper focused on the typed lineage artifact model, agent-native governance mechanisms, and the open-source lightweight implementation. The claim reflects design goals and informal observations from development rather than controlled experiments. We will revise the abstract to remove the unsubstantiated performance assertions and instead describe the system's features and intended benefits without implying measured outcomes. revision: yes

Circularity Check

0 steps flagged

No circularity: descriptive systems paper with no derivations or self-referential reductions

full rationale

The manuscript describes an agent-native data lifecycle system for robot policy iteration by defining typed lineage artifacts for rollouts, reviews, training runs, and related steps, then stating that agents can interpret and adapt data under explicit boundaries. No equations, fitted parameters, uniqueness theorems, or ansatzes appear; the central claim of faster iteration and maintained performance is presented as an empirical outcome of the artifact representation rather than derived from prior self-citations or by construction from inputs. The paper is self-contained as a systems contribution with no load-bearing steps that reduce to their own definitions or fitted values.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Ledger constructed from abstract only; no full paper available to identify additional parameters or entities.

axioms (1)
  • domain assumption: Modern robot policies improve through repeated data collection, review, retraining, evaluation, and release decisions, with evidence often scattered across local tools, scripts, and expert memory.
    This premise is stated directly in the abstract as the motivation for the system.
invented entities (1)
  • typed lineage artifacts (no independent evidence)
    purpose: To explicitly represent rollouts, reviews, dataset decisions, training runs, policy metadata, evaluations, deployment recommendations, and next-collection plans for agent interpretation.
    Core new construct introduced to make the lifecycle explicit and agent-accessible.

pith-pipeline@v0.9.1-grok · 5700 in / 1262 out tokens · 22715 ms · 2026-06-26T11:43:29.590244+00:00 · methodology

