arxiv: 2605.17300 · v1 · pith:6JNEEBKSnew · submitted 2026-05-17 · 💻 cs.RO

HCLM: A Hierarchical Framework for Cooperative Loco-Manipulation with Dual Quadrupeds

Pith reviewed 2026-05-20 13:09 UTC · model grok-4.3

classification 💻 cs.RO
keywords cooperative loco-manipulation · dual quadrupeds · hierarchical control · diffusion policy · whole-body controller · admittance scheme · multi-robot coordination

The pith

A hierarchical framework decouples high-level coordination from low-level control to enable reliable cooperative tasks between two quadruped robots.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes HCLM, a framework that lets two four-legged robots cooperate on manipulation tasks while walking. It splits the problem into a high-level part that plans coordination in a frame-independent way, so the plan does not depend on where the robots sit in a global coordinate frame, and a low-level part that executes the motions robustly. The high level uses a diffusion-based policy over a task-space representation to learn coordination patterns for tasks such as carrying or handing over objects. The low level combines kinematic model predictive control for collision-free motion with a reactive layer that regulates interaction forces and resolves kinematic conflicts between the robots. If the claims hold, such robot teams could handle real tasks under disturbances without precisely calibrated setups.

Core claim

The architecture systematically decouples high-level collaborative reasoning from low-level robust motion execution, enabling reliable task execution, strict configuration agnosticism, and exceptional resilience against severe physical perturbations through a Joint Diffusion Policy for coordination and a hybrid Whole-Body Controller with cooperative admittance.

What carries the argument

The hierarchical separation of a centralized Joint Diffusion Policy, which uses an SE(3)-invariant task-space representation, from a task-centric hybrid Whole-Body Controller that integrates kinematic MPC and reactive cooperative admittance.
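One way to picture an SE(3)-invariant task-space representation is to express each end-effector pose relative to a shared task frame instead of a world frame. The sketch below is an illustrative assumption, not the paper's formulation: the choice of the object pose as the reference frame and the helper names (`make_pose`, `relative_pose`, `task_features`) are hypothetical. It checks numerically that moving the whole scene by one rigid transform leaves the features unchanged.

```python
import numpy as np

def rot_z(theta):
    """Rotation matrix about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def make_pose(R, p):
    """Build a 4x4 homogeneous transform from rotation R and translation p."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = p
    return T

def inv_pose(T):
    """Closed-form inverse of a rigid transform."""
    R, p = T[:3, :3], T[:3, 3]
    return make_pose(R.T, -R.T @ p)

def relative_pose(T_ref, T):
    """Pose T expressed in the frame T_ref."""
    return inv_pose(T_ref) @ T

def task_features(T_obj, T_ee_a, T_ee_b):
    """Hypothetical features: both end-effector poses in the object frame."""
    return np.stack([relative_pose(T_obj, T_ee_a), relative_pose(T_obj, T_ee_b)])

# World-frame poses of the object and the two end effectors (made-up values).
T_obj = make_pose(rot_z(0.3), [1.0, 0.5, 0.4])
T_ee_a = make_pose(rot_z(-0.2), [0.7, 0.5, 0.5])
T_ee_b = make_pose(rot_z(2.9), [1.3, 0.5, 0.5])

# Apply the same arbitrary rigid transform G to the entire scene.
G = make_pose(rot_z(1.1), [-2.0, 3.0, 0.0])
before = task_features(T_obj, T_ee_a, T_ee_b)
after = task_features(G @ T_obj, G @ T_ee_a, G @ T_ee_b)

invariant = np.allclose(before, after)
print(invariant)
```

The invariance follows from (G T_obj)^-1 (G T_ee) = T_obj^-1 T_ee, so a policy trained on such features never sees the world-frame placement of the robots.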

If this is right

  • The framework supports tasks including cooperative carrying, packing, and handovers in simulation.
  • It allows successful real-world deployment of handover tasks.
  • The system demonstrates resilience to severe physical perturbations.
  • It maintains performance independent of specific robot configurations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This decoupling could be applied to other multi-robot setups involving different types of robots.
  • Future work might test the system in more dynamic or cluttered environments.
  • The approach suggests that learning coordinate-agnostic patterns reduces the need for precise calibration between robots.

Load-bearing premise

The reactive execution layer with cooperative admittance can always resolve kinematic conflicts and regulate internal stresses without causing instability or loss of locomotion stability during closed-chain interactions.
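To make the premise concrete, the following is a one-dimensional toy sketch of how an admittance law can relax internal stress when two position-controlled grippers are commanded closer together than the object allows. Everything here is an assumption for illustration: the object is modeled as a linear spring, tracking is perfect, and the gains (`m_adm`, `d_adm`), stiffness (`k_obj`), and conflict size (`delta`) are made-up values. It is not the paper's controller and says nothing about the full floating-base system.

```python
import numpy as np

def simulate_squeeze(delta=0.02, k_obj=2000.0, m_adm=2.0, d_adm=200.0,
                     f_des=0.0, dt=1e-3, steps=2000):
    """Simulate two grippers squeezing a spring-like object along one axis.

    delta : how much shorter the commanded gripper gap is than the object (m)
    k_obj : assumed object stiffness (N/m)
    m_adm, d_adm : virtual mass and damping of each robot's admittance law
    f_des : desired internal (squeeze) force (N)
    """
    e = np.zeros(2)                # admittance offsets added to each reference
    v = np.zeros(2)                # offset velocities
    sign = np.array([-1.0, 1.0])   # compression pushes robot 1 in -x, robot 2 in +x
    forces = []
    for _ in range(steps):
        # Internal force is positive when the object is compressed.
        f_int = k_obj * (delta - (e[1] - e[0]))
        forces.append(f_int)
        # Admittance law per robot: m_adm * a + d_adm * v = +/-(f_int - f_des)
        acc = (sign * (f_int - f_des) - d_adm * v) / m_adm
        v = v + dt * acc
        e = e + dt * v
    return np.array(forces), e

forces, offsets = simulate_squeeze()
# forces[0] equals k_obj * delta by construction; later values decay toward f_des.
print(forces[0], forces[-1], offsets)
```

In this toy the gap error obeys a damped second-order system, so the squeeze force settles at `f_des`. The premise above is that comparable behavior survives contact switching, tracking delay, and coupling to locomotion, which a sketch like this cannot establish.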

What would settle it

A real-world handover test in which applied external forces cause the robots to fall or drop the object would contradict the resilience claim.

Figures

Figures reproduced from arXiv: 2605.17300 by Chen Le, Jincheng Yu, Qixuan Li, Xinlei Chen.

Figure 1: The HCLM Framework. Our hierarchical system enables dual …
Figure 2: Overview of the HCLM framework. The high-level Imitation Learning module leverages an SE(3)-invariant representation to generate synchronized, frame-agnostic end-effector trajectories via a joint diffusion process. To translate these commands into physical execution, the task-centric Whole-Body Controller (WBC) deploys a proactive kinematic MPC for collision-free velocity distribution. Concurrently, a reac…
Figure 3: Illustration of the cooperative admittance scheme for closed-chain …
Figure 4: Snapshots of the three cooperative loco-manipulation tasks evaluated …
Figure 5: System robustness against external base perturbations. (a) Simulated …
Figure 6: System response to an obstacle. The spatial trajectories illustrate …
Figure 7: Evaluation of closed-chain compliance and internal force regulation. …
Figure 9: Snapshots of the handover task successfully deployed on the real …
Original abstract

We introduce HCLM, a hierarchical framework for general-purpose cooperative loco-manipulation with dual quadrupedal systems. Coordinating multi-robot collaborative manipulation across floating bases is highly challenging due to the conflicting demands of spatial coordination, robust locomotion, and closed-chain physical interactions. To resolve this, our architecture systematically decouples high-level collaborative reasoning from low-level robust motion execution. At the high level, a centralized Joint Diffusion Policy leverages an SE(3)-invariant task-space representation to learn coordinate-agnostic spatial coordination patterns. To translate these frame-agnostic references into physical motion, a task-centric hybrid Whole-Body Controller synergizes a proactive kinematic Model Predictive Control for collision-free velocity distribution with a reactive execution layer. Crucially, this reactive layer guarantees rapid responsiveness for precise end-effector tracking, while concurrently integrating active force regulation via a cooperative admittance scheme to safely resolve kinematic conflicts and strictly regulate internal stresses during closed-chain interactions. We validate the framework across progressively challenging simulated scenarios, including cooperative carrying, packing and handovers, and successfully deploy the latter in the real world. The results demonstrate reliable task execution, strict configuration agnosticism, and exceptional resilience against severe physical perturbations, offering a highly robust pathway for multi-robot embodied coordination.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces HCLM, a hierarchical framework for cooperative loco-manipulation with dual quadrupedal robots. It decouples high-level collaborative reasoning, implemented via a centralized Joint Diffusion Policy operating on an SE(3)-invariant task-space representation, from low-level robust motion execution via a task-centric hybrid Whole-Body Controller. The low-level controller combines proactive kinematic Model Predictive Control for collision-free velocity distribution with a reactive execution layer that incorporates cooperative admittance for end-effector tracking, kinematic conflict resolution, and internal force regulation. The framework is validated across simulated scenarios including cooperative carrying, packing, and handovers, with real-world deployment of the handover task, claiming reliable task execution, strict configuration agnosticism, and exceptional resilience to severe physical perturbations.

Significance. If the central claims hold, the work provides a practical architecture for multi-robot loco-manipulation by systematically separating high-level spatial coordination from low-level physical interaction handling. The combination of diffusion-based policies for frame-agnostic planning and admittance-based force regulation addresses key difficulties in closed-chain floating-base systems. Real-world deployment of at least one task adds credibility, and the emphasis on configuration agnosticism could support broader applicability in logistics or assembly scenarios. The absence of quantitative metrics and stability analysis, however, currently limits the assessed impact.

major comments (2)
  1. Abstract and reactive execution layer description: The claim of 'exceptional resilience against severe physical perturbations' rests on the cooperative admittance scheme resolving kinematic conflicts and regulating internal stresses. For dual floating-base quadrupeds, closed-chain interactions produce force loops that couple directly to locomotion dynamics. The manuscript supplies no passivity, Lyapunov, or hybrid-system stability analysis or bounds for the WBC + admittance controller, leaving open the possibility that internal forces excite unstable modes in the contact-constrained system.
  2. Validation description (abstract): The abstract states successful validation across simulated scenarios and real-world deployment yet reports no quantitative metrics, success rates, force/torque profiles, ablation studies, or failure-case analysis. Without these data the assertions of 'reliable task execution' and 'exceptional resilience' cannot be evaluated against the central claim.
minor comments (2)
  1. The abstract and architecture overview would benefit from explicit section references or a high-level block diagram clarifying the data flow between the Joint Diffusion Policy and the hybrid Whole-Body Controller.
  2. Notation for the SE(3)-invariant task-space representation and the cooperative admittance gains should be introduced with a brief definition or reference to standard literature to improve readability for readers outside the immediate subfield.
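As background for major comment 1, the standard passivity argument for a single-axis admittance law can be written out as follows. The model form and the symbols M_a, D_a, K_a, e, and f are assumptions for illustration, since the abstract does not state the admittance law used in the paper.

```latex
% Assumed single-axis admittance law (illustrative, not taken from the paper):
%   M_a \ddot{e} + D_a \dot{e} + K_a e = f, \quad M_a > 0,\ D_a > 0,\ K_a \ge 0
% where e is the reference offset and f is the measured interaction force.
\begin{align}
V(e,\dot e) &= \tfrac{1}{2} M_a \dot e^{2} + \tfrac{1}{2} K_a e^{2}, \\
\dot V &= \dot e \,\bigl(M_a \ddot e + K_a e\bigr)
        = \dot e \,\bigl(f - D_a \dot e\bigr)
        = f\,\dot e - D_a \dot e^{2}
        \;\le\; f\,\dot e .
\end{align}
```

Under these assumptions the map from f to \dot e is passive with storage function V. The argument covers only an isolated, continuous-time axis. It does not address discretization, tracking delay, contact switching, or the coupling of two such loops through the shared object and the legs, which is the gap the comment points to.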

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and valuable feedback on our work. We address the major comments point by point below and outline the revisions we will make to strengthen the manuscript.

Point-by-point responses
  1. Referee: Abstract and reactive execution layer description: The claim of 'exceptional resilience against severe physical perturbations' rests on the cooperative admittance scheme resolving kinematic conflicts and regulating internal stresses. For dual floating-base quadrupeds, closed-chain interactions produce force loops that couple directly to locomotion dynamics. The manuscript supplies no passivity, Lyapunov, or hybrid-system stability analysis or bounds for the WBC + admittance controller, leaving open the possibility that internal forces excite unstable modes in the contact-constrained system.

    Authors: We acknowledge the referee's concern regarding the lack of formal stability analysis. The resilience to perturbations is demonstrated through extensive empirical testing in both simulation and real-world settings, where the cooperative admittance scheme successfully managed internal forces and kinematic conflicts without system failure. However, we agree that theoretical analysis would be beneficial. In the revised manuscript, we will add a subsection in the controller description discussing the passivity properties of the admittance control and how it contributes to stability in the closed-chain system, along with any available bounds from the MPC component. This will clarify the basis for our claims while noting that full Lyapunov analysis remains an avenue for future work. revision: partial

  2. Referee: Validation description (abstract): The abstract states successful validation across simulated scenarios and real-world deployment yet reports no quantitative metrics, success rates, force/torque profiles, ablation studies, or failure-case analysis. Without these data the assertions of 'reliable task execution' and 'exceptional resilience' cannot be evaluated against the central claim.

    Authors: We thank the referee for pointing this out. While the manuscript body includes detailed experimental setups and results illustrated through figures and qualitative analysis, we recognize that the abstract and validation summary could benefit from more quantitative support. In the revision, we will update the abstract to include key performance metrics from our experiments and expand the results section with additional quantitative data including force/torque profiles, ablation studies on the admittance parameters, and analysis of failure cases where applicable. This will provide a more rigorous evaluation of the framework's performance. revision: yes

Circularity Check

0 steps flagged

No significant circularity; framework description is self-contained

full rationale

The paper introduces a hierarchical architecture that decouples high-level collaborative reasoning (via a centralized Joint Diffusion Policy with SE(3)-invariant task-space representation) from low-level execution (via task-centric hybrid Whole-Body Controller combining kinematic MPC and reactive admittance). No equations, fitted parameters, or first-principles derivations are presented that reduce performance claims to inputs by construction, nor are there self-citations invoked as load-bearing uniqueness theorems or ansatzes smuggled from prior author work. Claims of resilience and configuration agnosticism rest on the described design choices and empirical validation in simulation and real-world deployment, which remain independent of any self-referential fitting or renaming of known results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated beyond the high-level architectural choices.

axioms (1)
  • domain assumption: An SE(3)-invariant task-space representation permits coordinate-agnostic spatial coordination learning.
    Invoked to justify the high-level Joint Diffusion Policy.

pith-pipeline@v0.9.0 · 5755 in / 1282 out tokens · 35393 ms · 2026-05-20T13:09:39.329771+00:00 · methodology

