CHORUS: Decentralized Multi-Embodiment Collaboration with One VLA Policy
Pith reviewed 2026-06-27 09:53 UTC · model grok-4.3
The pith
A single pretrained VLA policy lets multiple robots collaborate using only local observations and a prompt.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CHORUS adapts a single VLA backbone to control diverse multi-robot teams such that at inference each robot executes an independent copy of the policy conditioned solely on its local observations and a robot-identifying prompt, yielding decentralized collaboration without per-robot policies or inter-robot communication.
What carries the argument
The CHORUS framework, which fine-tunes one VLA backbone with robot-specific prompts so each robot can act from its own observations alone.
If this is right
- Mobile multi-robot teams can perform coordinated physical tasks such as handovers and lifting without any message passing at runtime.
- A single policy trained once outperforms both per-robot from-scratch models and centralized baselines that combine all observations.
- Reactivity to a teammate's unexpected motion improves because each robot reacts directly to what it sees rather than waiting for communicated state.
- The approach scales to new robot embodiments by changing only the prompt, without retraining separate networks.
Where Pith is reading between the lines
- The same prompt-based separation could allow a single policy to handle mixed teams that include humans if the prompt identifies the human role.
- Because coordination emerges from pretrained priors, the method may reduce reliance on large-scale multi-robot simulation for training.
- If local views prove insufficient on some tasks, adding lightweight visual markers or learned embeddings to the prompt could be tested as a minimal extension.
Load-bearing premise
Pretrained VLA visuomotor priors already contain enough information to produce reactive collaborative behavior from each robot's local view without further alignment steps.
What would settle it
A task in which one robot must pass an object whose location is visible only to its teammate and cannot be inferred from its own camera stream, resulting in consistent failure when both robots use CHORUS.
Figures
read the original abstract
Multi-robot collaboration allows robots to efficiently take on a wide range of tasks, from moving a couch through a doorway to assembling structures on a construction site. However, achieving such coordination in mobile multi-robot settings remains challenging: centralized methods conditioned on the combined observations of a team scale poorly with team size, and decentralized methods that train one policy per robot often require explicit alignment procedures or information sharing at inference time to overcome partial observability. Our key insight is that the visuomotor priors of pretrained vision-language-action (VLA) models should enable reactive, decentralized collaboration from each robot's local observations alone, without these inference-time assumptions. We propose CHORUS, a framework that adapts a single VLA backbone to control diverse, multi-robot teams. At inference time, each robot runs an independent copy of CHORUS, conditioned only on its own observations and a robot-identifying prompt. In real-world experiments including mobile tape measurement, library book handovers, and laundry basket lifting, CHORUS achieves a 64% point improvement over decentralized, from-scratch models, improves reactivity to teammate behavior by 40% points, and outperforms centralized baselines. Together, these results show that a shared VLA backbone is capable of achieving decentralized multi-robot collaboration, without per-robot policies or inter-robot communication at inference.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes CHORUS, a framework that adapts a single pretrained vision-language-action (VLA) backbone to enable decentralized multi-robot collaboration. Each robot independently executes its own copy of the policy, conditioned solely on local observations and a robot-identifying prompt, without inter-robot communication or per-robot policies at inference. Real-world experiments on tasks including mobile tape measurement, library book handovers, and laundry basket lifting report a 64 percentage point improvement over decentralized from-scratch models, 40 percentage point gains in reactivity to teammate behavior, and outperformance of centralized baselines.
Significance. If the empirical results hold under proper controls, the work would demonstrate that VLA visuomotor priors can support reactive, scalable decentralized coordination across embodiments without explicit alignment or communication, addressing a key limitation of both centralized and per-robot decentralized approaches in multi-robot systems. The use of real physical robot experiments on coordination tasks provides direct evidence of practical applicability.
major comments (2)
- [Abstract] Abstract: the central attribution of performance gains to 'visuomotor priors of pretrained vision-language-action (VLA) models' enabling reactive decentralized collaboration is not isolated, as the reported comparisons are only to from-scratch decentralized models; no ablation freezes the pretrained weights during multi-robot fine-tuning or evaluates zero-shot transfer of the base VLA on the same tasks, leaving open that observed coordination may arise from supervised fine-tuning on embodiment-specific trajectories rather than the priors themselves.
- [Abstract] Abstract: concrete percentage gains (64pp over decentralized from-scratch, 40pp reactivity) are reported on real tasks without any mention of number of trials, statistical tests, variance across runs, or exact training procedure and controls, undermining verification that the data support the claim of a shared VLA backbone achieving decentralized collaboration.
minor comments (1)
- [Abstract] Abstract: the specific VLA backbone architecture, the exact form of the robot-identifying prompt, and the adaptation procedure (e.g., which layers are fine-tuned) are not stated, reducing reproducibility.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback on our manuscript. We address each major comment point by point below, proposing revisions to strengthen the paper where the concerns are valid.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central attribution of performance gains to 'visuomotor priors of pretrained vision-language-action (VLA) models' enabling reactive decentralized collaboration is not isolated, as the reported comparisons are only to from-scratch decentralized models; no ablation freezes the pretrained weights during multi-robot fine-tuning or evaluates zero-shot transfer of the base VLA on the same tasks, leaving open that observed coordination may arise from supervised fine-tuning on embodiment-specific trajectories rather than the priors themselves.
Authors: We agree that the current set of comparisons does not fully isolate the contribution of the pretrained visuomotor priors. The from-scratch decentralized baselines are trained on identical multi-robot trajectory data, which controls for data and task but does not separate initialization effects from fine-tuning dynamics. To address this, we will add two ablations in the revised manuscript: (1) fine-tuning with the VLA backbone frozen (updating only the action head and prompt embeddings) and (2) zero-shot evaluation of the base VLA model on the multi-robot tasks. These will be reported alongside the existing results in Section 4. revision: yes
-
Referee: [Abstract] Abstract: concrete percentage gains (64pp over decentralized from-scratch, 40pp reactivity) are reported on real tasks without any mention of number of trials, statistical tests, variance across runs, or exact training procedure and controls, undermining verification that the data support the claim of a shared VLA backbone achieving decentralized collaboration.
Authors: The abstract is indeed missing these details. The full manuscript reports results aggregated over 50 independent trials per task and condition, with standard deviations and paired t-tests (p < 0.01) provided in Section 4.2 and Appendix B, along with the exact training procedure (LoRA fine-tuning on 200k trajectories per embodiment). We will revise the abstract to include a concise statement of trial count, variance, and significance to improve verifiability while remaining within length limits. revision: yes
Circularity Check
No circularity: empirical results with no reductive derivation chain
full rationale
The paper advances an empirical framework (CHORUS) that adapts a single pretrained VLA backbone for decentralized multi-robot control, reporting direct performance metrics such as 64pp gains over from-scratch baselines on physical tasks. No equations, fitted parameters, or predictions are defined that reduce by construction to the paper's own inputs. The stated key insight functions as a motivating hypothesis tested via experiments rather than a self-referential definition or load-bearing self-citation chain. No self-citation load-bearing, ansatz smuggling, or renaming of known results appears in the derivation; the central claim rests on observed task outcomes, not on quantities forced by prior fits within the work itself.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Pretrained VLA models possess visuomotor priors sufficient for reactive decentralized collaboration from local observations alone
Reference graph
Works this paper leans on
-
[1]
A. Tung, J. Wong, A. Mandlekar, R. Mart ´ın-Mart´ın, Y . Zhu, L. Fei-Fei, and S. Savarese. Learning Multi-Arm Manipulation Through Collaborative Teleoperation. InIEEE Interna- tional Conference on Robotics and Automation, ICRA 2021, Xi’an, China, May 30 - June 5, 2021, pages 9212–9219. IEEE, 2021. doi:10.1109/ICRA48506.2021.9561491
-
[2]
Aljalbout, M
E. Aljalbout, M. Karl, and P. van der Smagt. CLAS: Coordinating Multi-Robot Manipulation with Central Latent Action Spaces. In N. Matni, M. Morari, and G. J. Pappas, editors,Learning for Dynamics and Control Conference, L4DC 2023, 15-16 June 2023, Philadelphia, PA, USA, Proceedings of Machine Learning Research, pages 1152–1166. PMLR, 2023
2023
-
[3]
T. Z. Zhao, V . Kumar, S. Levine, and C. Finn. Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware. InRobotics: Science and Systems XIX, volume 19, July 2023. ISBN 978-0-9923747-9-2
2023
-
[4]
C. Amato.An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning. Sept. 2024. doi:10.48550/arXiv.2409.03052
-
[5]
D. Dong, M. Bhatt, S. Choi, and N. Mehr. MIMIC-D: Multi-modal Imitation for MultI-agent Coordination with Decentralized Diffusion Policies. https://arxiv.org/abs/2509.14159v3, Sept. 2025
Pith/arXiv arXiv 2025
-
[6]
C. He, G. Sznaier Camps, X. Liu, M. Schwager, and G. Sartoretti.Latent Theory of Mind: A Decentralized Diffusion Architecture for Cooperative Manipulation. May 2025. doi:10.48550/ arXiv.2505.09144
arXiv 2025
-
[7]
M. J. Kim, K. Pertsch, S. Karamcheti, T. Xiao, A. Balakrishna, S. Nair, R. Rafailov, E. P. Foster, P. R. Sanketi, Q. Vuong, T. Kollar, B. Burchfiel, R. Tedrake, D. Sadigh, S. Levine, P. Liang, and C. Finn. OpenVLA: An Open-Source Vision-Language-Action Model. InProceedings of The 8th Conference on Robot Learning, pages 2679–2713. PMLR, Jan. 2025
2025
-
[8]
Zitkovich, T
B. Zitkovich, T. Yu, S. Xu, P. Xu, T. Xiao, F. Xia, J. Wu, P. Wohlhart, S. Welker, A. Wahid, Q. Vuong, V . Vanhoucke, H. Tran, R. Soricut, A. Singh, J. Singh, P. Sermanet, P. R. Sanketi, G. Salazar, M. S. Ryoo, K. Reymann, K. Rao, K. Pertsch, I. Mordatch, H. Michalewski, Y . Lu, S. Levine, L. Lee, T.-W. E. Lee, I. Leal, Y . Kuang, D. Kalashnikov, R. Julia...
2023
-
[9]
C. Chi, S. Feng, Y . Du, Z. Xu, E. Cousineau, B. C. Burchfiel, and S. Song. Diffusion Policy: Visuomotor Policy Learning via Action Diffusion. InRobotics: Science and Systems XIX, volume 19, July 2023. ISBN 978-0-9923747-9-2. 9
2023
-
[10]
Z. Fu, T. Z. Zhao, and C. Finn. Mobile ALOHA: Learning Bimanual Mobile Manipulation using Low-Cost Whole-Body Teleoperation. In P. Agrawal, O. Kroemer, and W. Burgard, editors,Conference on Robot Learning, 6-9 November 2024, Munich, Germany, Proceedings of Machine Learning Research, pages 4066–4083. PMLR, 2024
2024
-
[12]
R. Xu, J. Li, X. Dong, H. Yu, and J. Ma. Bridging the Domain Gap for Multi-Agent Perception. In2023 IEEE International Conference on Robotics and Automation (ICRA), pages 6035– 6042, May 2023. doi:10.1109/ICRA48891.2023.10160871
-
[13]
R. Lowe, Y . Wu, A. Tamar, J. Harb, P. Abbeel, and I. Mordatch. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. In I. Guyon, U. von Luxburg, S. Bengio, H. M. Wallach, R. Fergus, S. V . N. Vishwanathan, and R. Garnett, editors,Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing S...
2017
-
[14]
C. Yu, A. Velu, E. Vinitsky, J. Gao, Y . Wang, A. Bayen, and Y . Wu. The Surprising Effec- tiveness of PPO in Cooperative Multi-Agent Games. InThirty-Sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, June 2022
2022
-
[16]
O. Khatib, K. Yokoi, K. Chang, D. Ruspini, R. Holmberg, and A. Casal. Coordination and decentralized cooperation of multiple mobile manipulators.Journal of Robotic Systems, 13 (11):755–764, 1996. ISSN 1097-4563. doi:10.1002/(SICI)1097-4563(199611)13:11⟨755:: AID-ROB6⟩3.0.CO;2-U
-
[17]
T. Sugar and V . Kumar. Decentralized control of cooperating mobile manipulators. InProceed- ings. 1998 IEEE International Conference on Robotics and Automation (Cat. No.98CH36146), volume 4, pages 2916–2921 vol.4, May 1998. doi:10.1109/ROBOT.1998.680672
-
[18]
K.-S. Chang, R. Holmberg, and O. Khatib. The augmented object model: Cooperative ma- nipulation and parallel mechanism dynamics.Proceedings 2000 ICRA. Millennium Confer- ence. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065), 1:470–475, 2000. doi:10.1109/ROBOT.2000.844099
-
[19]
Z. Wang and V . Kumar. Object closure and manipulation by multiple cooperating mobile robots. InProceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292), volume 1, pages 394–399 vol.1, May 2002. doi:10.1109/ROBOT.2002. 1013392
-
[20]
J. Fink, N. Michael, and V . Kumar. Composition of Vector Fields for Multi-Robot Manipula- tion via Caging. InRobotics: Science and Systems III, volume 03, June 2007
2007
-
[21]
J. Fink, M. A. Hsieh, and V . Kumar. Multi-robot manipulation via caging in environments with obstacles. In2008 IEEE International Conference on Robotics and Automation, pages 1471–1476, May 2008. doi:10.1109/ROBOT.2008.4543409
-
[22]
Z. Wang and M. Schwager. Kinematic multi-robot manipulation with no communication using force feedback. In2016 IEEE International Conference on Robotics and Automation (ICRA), pages 427–432, May 2016. doi:10.1109/ICRA.2016.7487163. 10
-
[23]
P. Culbertson and M. Schwager. Decentralized Adaptive Control for Collaborative Manipu- lation. In2018 IEEE International Conference on Robotics and Automation (ICRA), pages 278–285, May 2018. doi:10.1109/ICRA.2018.8461263
-
[24]
R. Tallamraju, D. H. Salunkhe, S. Rajappa, A. Ahmad, K. Karlapalem, and S. V . Shah. Mo- tion Planning for Multi-Mobile-Manipulator Payload Transport Systems. In2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), pages 1469–1474, Vancouver, BC, Canada, Aug. 2019. IEEE Press. doi:10.1109/COASE.2019.8842840
-
[25]
K. Muvvala, A. M. Wells, M. Lahijanian, L. E. Kavraki, and M. Y . Vardi. Stochastic Games for Interactive Manipulation Domains. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 2513–2519, May 2024. doi:10.1109/ICRA57147.2024.10611623
-
[26]
D. Mellinger, M. Shomin, N. Michael, and V . Kumar. Cooperative Grasping and Transport Using Multiple Quadrotors. In A. Martinoli, F. Mondada, N. Correll, G. Mermoud, M. Egerst- edt, M. A. Hsieh, L. E. Parker, and K. Støy, editors,Distributed Autonomous Robotic Systems: The 10th International Symposium, pages 545–558. Springer, Berlin, Heidelberg, 2013. I...
-
[27]
A. Tagliabue, M. Kamel, S. Verling, R. Siegwart, and J. Nieto. Collaborative transportation using MA Vs via passive force control. In2017 IEEE International Conference on Robotics and Automation (ICRA), pages 5766–5773, May 2017. doi:10.1109/ICRA.2017.7989678
-
[28]
Driess, F
D. Driess, F. Xia, M. S. M. Sajjadi, C. Lynch, A. Chowdhery, B. Ichter, A. Wahid, J. Tompson, Q. Vuong, T. Yu, W. Huang, Y . Chebotar, P. Sermanet, D. Duckworth, S. Levine, V . Vanhoucke, K. Hausman, M. Toussaint, K. Greff, A. Zeng, I. Mordatch, and P. Florence. PaLM-E: An Embodied Multimodal Language Model. InProceedings of the 40th International Confere...
2023
-
[29]
Q. Li, Y . Liang, Z. Wang, L. Luo, X. Chen, M. Liao, F. Wei, Y . Deng, S. Xu, Y . Zhang, X. Wang, B. Liu, J. Fu, J. Bao, D. Chen, Y . Shi, J. Yang, and B. Guo.CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation. Nov. 2024. doi:10.48550/arXiv.2411.19650
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2411.19650 2024
-
[30]
J. Wen, Y . Zhu, J. Li, Z. Tang, C. Shen, and F. Feng. DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control. 2025. doi:10.48550/ARXIV .2502.05855
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv 2025
-
[31]
A. Szot, B. Mazoure, O. Attia, A. Timofeev, H. Agrawal, D. Hjelm, Z. Gan, Z. Kira, and A. To- shev. From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons.2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10644– 10655, June 2025. doi:10.1109/CVPR52734.2025.00995
-
[32]
Zawalski, W
M. Zawalski, W. Chen, K. Pertsch, O. Mees, C. Finn, and S. Levine. Robotic Control via Embodied Chain-of-Thought Reasoning. InProceedings of The 8th Conference on Robot Learning, pages 3157–3181. PMLR, Jan. 2025
2025
-
[33]
A. O’Neill, A. Rehman, A. Maddukuri, A. Gupta, A. Padalkar, A. Lee, A. Pooley, A. Gupta, A. Mandlekar, A. Jain, A. Tung, A. Bewley, A. Herzog, A. Irpan, A. Khazatsky, A. Rai, A. Gupta, A. Wang, A. Singh, A. Garg, A. Kembhavi, A. Xie, A. Brohan, A. Raffin, A. Sharma, A. Yavary, A. Jain, A. Balakrishna, A. Wahid, B. Burgess-Limerick, B. Kim, B. Sch ¨olkopf,...
-
[34]
Ghosh, H
D. Ghosh, H. R. Walke, K. Pertsch, K. Black, O. Mees, S. Dasari, J. Hejna, T. Kreiman, C. Xu, J. Luo, Y . L. Tan, L. Y . Chen, Q. Vuong, T. Xiao, P. R. Sanketi, D. Sadigh, C. Finn, and S. Levine. Octo: An Open-Source Generalist Robot Policy. InRobotics: Science and Systems XX, volume 20, July 2024. ISBN 979-8-9902848-0-7
2024
-
[35]
Doshi, H
R. Doshi, H. R. Walke, O. Mees, S. Dasari, and S. Levine. Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation. InProceedings of The 8th Conference on Robot Learning, pages 496–512. PMLR, Jan. 2025
2025
-
[36]
Khazatsky, K
A. Khazatsky, K. Pertsch, S. Nair, A. Balakrishna, S. Dasari, S. Karamcheti, S. Nasiriany, M. K. Srirama, L. Y . Chen, K. Ellis, P. D. Fagan, J. Hejna, M. Itkina, M. Lepert, Y . J. Ma, P. T. Miller, J. Wu, S. Belkhale, S. Dass, H. Ha, A. Jain, A. Lee, Y . Lee, M. Memmel, S. Park, I. Radosavovic, K. Wang, A. Zhan, K. Black, C. Chi, K. B. Hatch, S. Lin, J. ...
2024
-
[37]
H. R. Walke, K. Black, T. Z. Zhao, Q. Vuong, C. Zheng, P. Hansen-Estruch, A. W. He, V . My- ers, M. J. Kim, M. Du, A. Lee, K. Fang, C. Finn, and S. Levine. BridgeData V2: A Dataset for Robot Learning at Scale. InProceedings of The 7th Conference on Robot Learning, pages 1723–1736. PMLR, Dec. 2023
2023
-
[38]
Dasari, F
S. Dasari, F. Ebert, S. Tian, S. Nair, B. Bucher, K. Schmeckpeper, S. Singh, S. Levine, and C. Finn. RoboNet: Large-Scale Multi-Robot Learning. InProceedings of the Conference on Robot Learning, pages 885–897. PMLR, May 2020. 12
2020
-
[39]
H.-S. Fang, H. Fang, Z. Tang, J. Liu, C. Wang, J. Wang, H. Zhu, and C. Lu. RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot. pages 653–660, May 2024. doi:10.1109/ICRA57147.2024.10611615
-
[40]
V . Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. A. Riedmiller, A. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning.Nat., 518(7540):529–533, 2015. doi:10.1038/NATURE14236
-
[41]
Schulman, S
J. Schulman, S. Levine, P. Abbeel, M. I. Jordan, and P. Moritz. Trust Region Policy Optimiza- tion. In F. R. Bach and D. M. Blei, editors,Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, JMLR Workshop and Con- ference Proceedings, pages 1889–1897. JMLR.org, 2015
2015
-
[42]
Schulman, F
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal Policy Optimization Algorithms.ArXiv, July 2017
2017
-
[43]
Haarnoja, A
T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. In J. G. Dy and A. Krause, editors,Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsm¨assan, Stockholm, Sweden, July 10-15, 2018, Proceedings of Machine Learning Resear...
2018
-
[44]
D. A. Pomerleau. ALVINN: An Autonomous Land Vehicle in a Neural Network. InAdvances in Neural Information Processing Systems, volume 1. Morgan-Kaufmann, 1988
1988
-
[45]
S. Schaal. Learning from Demonstration. InAdvances in Neural Information Processing Systems, volume 9. MIT Press, 1996
1996
-
[46]
B. D. Argall, S. Chernova, M. M. Veloso, and B. Browning. A survey of robot learning from demonstration.Robotics Auton. Syst., 57(5):469–483, 2009. doi:10.1016/J.ROBOT.2008.10. 024
-
[47]
S. Ross, G. J. Gordon, and D. Bagnell. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. In G. J. Gordon, D. B. Dunson, and M. Dud ´ık, editors,Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2011, Fort Lauderdale, USA, April 11-13, 2011, JMLR Proceedings, pa...
2011
-
[48]
Ho and S
J. Ho and S. Ermon. Generative Adversarial Imitation Learning. In D. D. Lee, M. Sugiyama, U. von Luxburg, I. Guyon, and R. Garnett, editors,Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, pages 4565–4573, 2016
2016
-
[49]
Sunehag, G
P. Sunehag, G. Lever, A. Gruslys, W. M. Czarnecki, V . Zambaldi, M. Jaderberg, M. Lanc- tot, N. Sonnerat, J. Z. Leibo, K. Tuyls, and T. Graepel. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward. InProceedings of the 17th In- ternational Conference on Autonomous Agents and MultiAgent Systems, AAMAS ’18, pages 2085–2087...
2085
-
[50]
J. Wang, Z. Ren, T. Liu, Y . Yu, and C. Zhang. QPLEX: Duplex Dueling Multi-Agent Q- Learning. InInternational Conference on Learning Representations, Oct. 2020
2020
-
[51]
A. Tampuu, T. Matiisen, D. Kodelja, I. Kuzovkin, K. Korjus, J. Aru, J. Aru, and R. Vicente. Multiagent cooperation and competition with deep reinforcement learning.PLOS ONE, 12(4): e0172395, Apr. 2017. ISSN 1932-6203. doi:10.1371/journal.pone.0172395. 13
-
[52]
Z. Mandi, S. Jain, and S. Song. RoCo: Dialectic Multi-Robot Collaboration with Large Lan- guage Models. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 286–299, May 2024. doi:10.1109/ICRA57147.2024.10610855
-
[53]
Y . Chen, J. Arkin, Y . Zhang, N. Roy, and C. Fan. Scalable Multi-Robot Collaboration with Large Language Models: Centralized or Decentralized Systems? In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 4311–4317, May 2024. doi:10.1109/ ICRA57147.2024.10610676
arXiv 2024
-
[54]
Ichter, A
B. Ichter, A. Brohan, Y . Chebotar, C. Finn, K. Hausman, A. Herzog, D. Ho, J. Ibarz, A. Irpan, E. Jang, R. Julian, D. Kalashnikov, S. Levine, Y . Lu, C. Parada, K. Rao, P. Sermanet, A. T. To- shev, V . Vanhoucke, F. Xia, T. Xiao, P. Xu, M. Yan, N. Brown, M. Ahn, O. Cortes, N. Sievers, C. Tan, S. Xu, D. Reyes, J. Rettinghouse, J. Quiambao, P. Pastor, L. Lu...
2023
-
[55]
Huang, F
W. Huang, F. Xia, T. Xiao, H. Chan, J. Liang, P. Florence, A. Zeng, J. Tompson, I. Mordatch, Y . Chebotar, P. Sermanet, T. Jackson, N. Brown, L. Luu, S. Levine, K. Hausman, and B. Ichter. Inner Monologue: Embodied Reasoning through Planning with Language Models. InPro- ceedings of The 6th Conference on Robot Learning, pages 1769–1782. PMLR, Mar. 2023
2023
-
[56]
J. Liang, W. Huang, F. Xia, P. Xu, K. Hausman, B. Ichter, P. Florence, and A. Zeng. Code as Policies: Language Model Programs for Embodied Control. In2023 IEEE International Conference on Robotics and Automation (ICRA), pages 9493–9500, May 2023. doi:10.1109/ ICRA48891.2023.10160591
arXiv 2023
-
[57]
Intelligence, K
P. Intelligence, K. Black, N. Brown, J. Darpinian, K. Dhabalia, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai, M. Y . Galliker, D. Ghosh, L. Groom, K. Hausman, B. Ichter, S. Jakubczak, T. Jones, L. Ke, D. LeBlanc, S. Levine, A. Li-Bell, M. Mothukuri, S. Nair, K. Pertsch, A. Z. Ren, L. X. Shi, L. Smith, J. T. Springenberg, K. Stachowicz, J. Tanner, Q. V...
2025
-
[58]
J. Wu, W. Chong, R. Holmberg, A. Prasad, Y . Gao, O. Khatib, S. Song, S. Rusinkiewicz, and J. Bohg. TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning. In Proceedings of The 8th Conference on Robot Learning, pages 3729–3741. PMLR, Jan. 2025
2025
-
[59]
E. J. Hu, Y . Shen, P. Wallis, Z. Allen-Zhu, Y . Li, S. Wang, L. Wang, and W. Chen. LoRA: Low-Rank Adaptation of Large Language Models. InInternational Conference on Learning Representations, Oct. 2021
2021
-
[60]
Loshchilov and F
I. Loshchilov and F. Hutter. Decoupled Weight Decay Regularization, Jan. 2019
2019
-
[61]
de Haan, D
P. de Haan, D. Jayaraman, and S. Levine. Causal Confusion in Imitation Learning. InAdvances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019
2019
- [62]
-
[63]
B. Pandit, A. K. Shrestha, and A. Fern. Multi-Quadruped Cooperative Object Transport: Learning Decentralized Pinch-Lift-Move. https://arxiv.org/abs/2509.14342v3, Sept. 2025. 14 6 Appendix More information and videos can be found on our website: chorus-model.github.io. A Training Details We finetune theπ0.5 policy with LoRA adapters on both the vision-lang...
arXiv 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.