HCLM: A Hierarchical Framework for Cooperative Loco-Manipulation with Dual Quadrupeds
Pith reviewed 2026-05-20 13:09 UTC · model grok-4.3
The pith
A hierarchical framework decouples high-level coordination from low-level control to enable reliable cooperative tasks between two quadruped robots.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The architecture decouples high-level collaborative reasoning from low-level robust motion execution: a Joint Diffusion Policy handles coordination, and a hybrid Whole-Body Controller with cooperative admittance handles execution. The authors claim this separation yields reliable task execution, strict configuration agnosticism, and exceptional resilience against severe physical perturbations.
What carries the argument
The hierarchical separation of a centralized Joint Diffusion Policy using SE(3)-invariant task-space representation from a task-centric hybrid Whole-Body Controller that integrates kinematic MPC and reactive cooperative admittance.
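The SE(3)-invariance idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that each end-effector pose is expressed relative to a shared reference frame such as the manipulated object; the abstract does not specify the paper's actual representation, and all function names below are hypothetical. Moving the whole scene by an arbitrary rigid transform leaves the relative pose unchanged, which is the sense in which such a representation is coordinate-agnostic.

```python
import math

# Poses are (R, p): R is a 3x3 rotation (list of rows), p a 3-vector.
# Hypothetical helper names; not taken from the paper.

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose(T1, T2):
    """Return the pose T1 * T2."""
    (R1, p1), (R2, p2) = T1, T2
    return matmul(R1, R2), [a + b for a, b in zip(matvec(R1, p2), p1)]

def inverse(T):
    """Return the inverse pose (R^T, -R^T p)."""
    R, p = T
    Rt = [list(col) for col in zip(*R)]
    return Rt, [-x for x in matvec(Rt, p)]

def relative_pose(T_world_ref, T_world_ee):
    """End-effector pose expressed in the reference frame."""
    return compose(inverse(T_world_ref), T_world_ee)

def max_abs_diff(Ta, Tb):
    """Largest entrywise difference between two poses."""
    (Ra, pa), (Rb, pb) = Ta, Tb
    dR = max(abs(Ra[i][j] - Rb[i][j]) for i in range(3) for j in range(3))
    dp = max(abs(a - b) for a, b in zip(pa, pb))
    return max(dR, dp)

# Object and one end-effector in some world frame (arbitrary example values).
T_obj = (rot_z(0.3), [1.0, 2.0, 0.5])
T_ee = (matmul(rot_z(-0.7), rot_x(0.2)), [1.4, 1.8, 0.6])

# Move the whole scene by an arbitrary rigid transform G.
G = (matmul(rot_x(1.1), rot_z(2.0)), [-3.0, 0.7, 4.2])
before = relative_pose(T_obj, T_ee)
after = relative_pose(compose(G, T_obj), compose(G, T_ee))

# The difference is at floating-point round-off level.
print(max_abs_diff(before, after))
```

A policy trained on such relative poses never sees the world frame, so in principle it needs no retraining when the robots or the object are placed elsewhere. Whether the paper's representation works this way is an assumption of the sketch.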
If this is right
- The framework supports tasks including cooperative carrying, packing, and handovers in simulation.
- It allows successful real-world deployment of handover tasks.
- The system demonstrates resilience to severe physical perturbations.
- It maintains performance independent of specific robot configurations.
Where Pith is reading between the lines
- This decoupling could be applied to other multi-robot setups involving different types of robots.
- Future work might test the system in more dynamic or cluttered environments.
- The approach suggests that learning coordinate-agnostic patterns reduces the need for precise calibration between robots.
Load-bearing premise
The reactive execution layer with cooperative admittance can always resolve kinematic conflicts and regulate internal stresses without causing instability or loss of locomotion stability during closed-chain interactions.
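For readers unfamiliar with admittance control, a minimal single-axis sketch follows. It assumes a generic mass-damper-spring admittance law, M·a + D·v + K·x = f_err, where f_err is the difference between measured and desired internal force and x is an offset added to the end-effector reference. The gains, function names, and integration scheme are illustrative assumptions, not the paper's controller.

```python
def admittance_step(x, v, f_err, M, D, K, dt):
    """One semi-implicit Euler step of M*a + D*v + K*x = f_err (single axis).

    x, v  : current reference offset and its velocity
    f_err : measured minus desired force along this axis
    """
    a = (f_err - D * v - K * x) / M
    v_next = v + a * dt
    x_next = x + v_next * dt
    return x_next, v_next

def simulate(f_err, M=2.0, D=40.0, K=200.0, dt=0.002, steps=5000):
    """Integrate the admittance from rest under a constant force error."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        x, v = admittance_step(x, v, f_err, M, D, K, dt)
    return x

# Under a constant 10 N error the offset settles near f_err / K.
print(simulate(10.0), 10.0 / 200.0)
```

With a constant force error the offset settles at f_err / K, so a stiffer K means less compliance. The sketch says nothing about stability once the admittance is coupled to floating-base locomotion, which is exactly the premise in question.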
What would settle it
A real-world handover test in which applied external forces cause the robots to fall or drop the object would disprove the resilience claim.
Original abstract
We introduce HCLM, a hierarchical framework for general-purpose cooperative loco-manipulation with dual quadrupedal systems. Coordinating multi-robot collaborative manipulation across floating bases is highly challenging due to the conflicting demands of spatial coordination, robust locomotion, and closed-chain physical interactions. To resolve this, our architecture systematically decouples high-level collaborative reasoning from low-level robust motion execution. At the high level, a centralized Joint Diffusion Policy leverages an SE(3)-invariant task-space representation to learn coordinate-agnostic spatial coordination patterns. To translate these frame-agnostic references into physical motion, a task-centric hybrid Whole-Body Controller synergizes a proactive kinematic Model Predictive Control for collision-free velocity distribution with a reactive execution layer. Crucially, this reactive layer guarantees rapid responsiveness for precise end-effector tracking, while concurrently integrating active force regulation via a cooperative admittance scheme to safely resolve kinematic conflicts and strictly regulate internal stresses during closed-chain interactions. We validate the framework across progressively challenging simulated scenarios, including cooperative carrying, packing and handovers, and successfully deploy the latter in the real world. The results demonstrate reliable task execution, strict configuration agnosticism, and exceptional resilience against severe physical perturbations, offering a highly robust pathway for multi-robot embodied coordination.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces HCLM, a hierarchical framework for cooperative loco-manipulation with dual quadrupedal robots. It decouples high-level collaborative reasoning, implemented via a centralized Joint Diffusion Policy operating on an SE(3)-invariant task-space representation, from low-level robust motion execution via a task-centric hybrid Whole-Body Controller. The low-level controller combines proactive kinematic Model Predictive Control for collision-free velocity distribution with a reactive execution layer that incorporates cooperative admittance for end-effector tracking, kinematic conflict resolution, and internal force regulation. The framework is validated across simulated scenarios including cooperative carrying, packing, and handovers, with real-world deployment of the handover task, claiming reliable task execution, strict configuration agnosticism, and exceptional resilience to severe physical perturbations.
Significance. If the central claims hold, the work provides a practical architecture for multi-robot loco-manipulation by systematically separating high-level spatial coordination from low-level physical interaction handling. The combination of diffusion-based policies for frame-agnostic planning and admittance-based force regulation addresses key difficulties in closed-chain floating-base systems. Real-world deployment of at least one task adds credibility, and the emphasis on configuration agnosticism could support broader applicability in logistics or assembly scenarios. The absence of quantitative metrics and stability analysis, however, currently limits the assessed impact.
major comments (2)
- Abstract and reactive execution layer description: The claim of 'exceptional resilience against severe physical perturbations' rests on the cooperative admittance scheme resolving kinematic conflicts and regulating internal stresses. For dual floating-base quadrupeds, closed-chain interactions produce force loops that couple directly to locomotion dynamics. The manuscript supplies no passivity, Lyapunov, or hybrid-system stability analysis or bounds for the WBC + admittance controller, leaving open the possibility that internal forces excite unstable modes in the contact-constrained system.
- Validation description (abstract): The abstract states successful validation across simulated scenarios and real-world deployment yet reports no quantitative metrics, success rates, force/torque profiles, ablation studies, or failure-case analysis. Without these data the assertions of 'reliable task execution' and 'exceptional resilience' cannot be evaluated against the central claim.
minor comments (2)
- The abstract and architecture overview would benefit from explicit section references or a high-level block diagram clarifying the data flow between the Joint Diffusion Policy and the hybrid Whole-Body Controller.
- Notation for the SE(3)-invariant task-space representation and the cooperative admittance gains should be introduced with a brief definition or reference to standard literature to improve readability for readers outside the immediate subfield.
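To make the notion of internal force in the major comments concrete, the following sketch computes it for a simplified case: two robots applying pure point forces f1 and f2 to a rigid object at grasp points p1 and p2. Under that assumption, the only force pair that produces no net wrench on the object is equal and opposite along the line joining the grasp points, so the internal force reduces to a scalar. The function names and the sign convention (positive means compression) are assumptions of this sketch; the paper's own formulation is not given in the abstract.

```python
import math

def internal_force(p1, p2, f1, f2):
    """Scalar squeeze force along the line joining two grasp points.

    Positive values compress the object, negative values stretch it.
    Assumes point contacts that transmit forces only (no contact torques).
    """
    d = [b - a for a, b in zip(p1, p2)]
    norm = math.sqrt(sum(c * c for c in d))
    if norm == 0.0:
        raise ValueError("grasp points must be distinct")
    u = [c / norm for c in d]  # unit vector from p1 to p2
    return 0.5 * sum((a - b) * c for a, b, c in zip(f1, f2, u))

def net_force(f1, f2):
    """Resultant force that actually accelerates or supports the object."""
    return [a + b for a, b in zip(f1, f2)]

# Both robots lift with 2 N each while pushing toward each other with 5 N.
p1, p2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
f1, f2 = (5.0, 0.0, 2.0), (-5.0, 0.0, 2.0)
print(internal_force(p1, p2, f1, f2), net_force(f1, f2))
```

Reporting a time series of this kind of quantity during perturbation trials is one way the requested force/torque profiles could support the internal-stress regulation claim.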
Simulated Author's Rebuttal
We thank the referee for their thorough review and valuable feedback on our work. We address the major comments point by point below and outline the revisions we will make to strengthen the manuscript.
- Referee: Abstract and reactive execution layer description: The claim of 'exceptional resilience against severe physical perturbations' rests on the cooperative admittance scheme resolving kinematic conflicts and regulating internal stresses. For dual floating-base quadrupeds, closed-chain interactions produce force loops that couple directly to locomotion dynamics. The manuscript supplies no passivity, Lyapunov, or hybrid-system stability analysis or bounds for the WBC + admittance controller, leaving open the possibility that internal forces excite unstable modes in the contact-constrained system.
  Authors: We acknowledge the referee's concern regarding the lack of formal stability analysis. The resilience to perturbations is demonstrated through extensive empirical testing in both simulation and real-world settings, where the cooperative admittance scheme successfully managed internal forces and kinematic conflicts without system failure. However, we agree that theoretical analysis would be beneficial. In the revised manuscript, we will add a subsection in the controller description discussing the passivity properties of the admittance control and how it contributes to stability in the closed-chain system, along with any available bounds from the MPC component. This will clarify the basis for our claims while noting that full Lyapunov analysis remains an avenue for future work.
  Revision: partial
- Referee: Validation description (abstract): The abstract states successful validation across simulated scenarios and real-world deployment yet reports no quantitative metrics, success rates, force/torque profiles, ablation studies, or failure-case analysis. Without these data the assertions of 'reliable task execution' and 'exceptional resilience' cannot be evaluated against the central claim.
  Authors: We thank the referee for pointing this out. While the manuscript body includes detailed experimental setups and results illustrated through figures and qualitative analysis, we recognize that the abstract and validation summary could benefit from more quantitative support. In the revision, we will update the abstract to include key performance metrics from our experiments and expand the results section with additional quantitative data including force/torque profiles, ablation studies on the admittance parameters, and analysis of failure cases where applicable. This will provide a more rigorous evaluation of the framework's performance.
  Revision: yes
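As context for the passivity discussion promised in the first response, the standard textbook argument for an ideal admittance model is sketched below. The model and symbols are assumptions introduced here for illustration, not the paper's analysis: x is the end-effector reference offset, f the force error driving the admittance, and M, D, K constant symmetric positive-definite virtual inertia, damping, and stiffness matrices.

```latex
% Assumed ideal admittance model (not taken from the paper).
\begin{align}
  M\ddot{x} + D\dot{x} + Kx &= f, \\
  S(x,\dot{x}) &= \tfrac{1}{2}\,\dot{x}^{\top} M \dot{x}
                + \tfrac{1}{2}\, x^{\top} K x, \\
  \dot{S} &= \dot{x}^{\top}\!\left(M\ddot{x} + Kx\right)
           = \dot{x}^{\top} f - \dot{x}^{\top} D \dot{x}
           \;\le\; \dot{x}^{\top} f .
\end{align}
```

The inequality shows that the ideal map from f to the velocity of x is passive with storage function S. It does not by itself cover discrete-time implementation, imperfect inner-loop tracking, or switching foot contacts, so it would address only part of the referee's concern.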
Circularity Check
No significant circularity; framework description is self-contained
The paper introduces a hierarchical architecture that decouples high-level collaborative reasoning (via a centralized Joint Diffusion Policy with SE(3)-invariant task-space representation) from low-level execution (via task-centric hybrid Whole-Body Controller combining kinematic MPC and reactive admittance). No equations, fitted parameters, or first-principles derivations are presented that reduce performance claims to inputs by construction, nor are there self-citations invoked as load-bearing uniqueness theorems or ansatzes smuggled from prior author work. Claims of resilience and configuration agnosticism rest on the described design choices and empirical validation in simulation and real-world deployment, which remain independent of any self-referential fitting or renaming of known results.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: An SE(3)-invariant task-space representation permits coordinate-agnostic spatial coordination learning.