SurGE: Surrogate Gradient-guided Evolution for Co-design of Legged Robots with Parallel Elasticity
Pith reviewed 2026-06-26 12:24 UTC · model grok-4.3
The pith
SurGE injects surrogate gradients from a kinodynamic model into CMA-ES to stabilize co-design of legged robots with parallel elasticity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SurGE computes surrogate gradients of the design objective through a kinodynamic single-rigid-body model and a design-aware control policy, then injects them into CMA-ES via mean shift with cosine-annealed step decay. On a 4-DOF design space of a hopping robot with unidirectional parallel spring, this produces six times lower cross-seed standard deviation and 18 percent tighter population concentration than vanilla CMA-ES while matching or improving the best objective. Hardware experiments on a 2D subspace starting from a hand-tuned initial design reduce the objective by 37.65 percent, with the improvement trend observed in simulation carrying over to the physical system.
What carries the argument
Surrogate gradient injection into CMA-ES via mean shift, derived from the differentiable Kino-SRB model and design-aware control policy pipeline.
If this is right
- Evolutionary searches for robot designs become less sensitive to the choice of random seed.
- Design improvements identified in simulation are more likely to appear on physical hardware.
- Co-design remains feasible for mechanisms that include contacts and spring engagement without requiring end-to-end differentiability.
- Candidate design populations concentrate more tightly around high-performing regions of the search space.
- The method can match or exceed the single best design found by standard CMA-ES while improving reliability.
Where Pith is reading between the lines
- The same surrogate-gradient injection could be tested on co-design problems involving other elastic elements such as series springs or variable-stiffness actuators.
- Extending the underlying model to capture additional degrees of freedom or multi-contact scenarios would test how far the approach scales before surrogate fidelity drops.
- Hybrid gradient-evolution methods of this form may reduce the total number of expensive hardware evaluations needed during robot development.
- The technique suggests a general pattern for blending simplified differentiable models with black-box optimizers in other non-differentiable engineering domains.
Load-bearing premise
The kinodynamic single-rigid-body model together with the design-aware control policy supplies surrogate gradients sufficiently faithful to the true non-differentiable design objective despite contact dynamics and mechanism engagement.
What would settle it
Multiple independent hardware optimization runs with SurGE versus vanilla CMA-ES that show no reduction in objective value or no decrease in cross-seed variance would falsify the central claim.
Figures
read the original abstract
Co-design of legged robots with elastic elements is challenging due to the non-differentiability of contact dynamics and mechanism engagement. This paper presents SurGE, a framework that computes surrogate gradients of the design objective through a differentiable pipeline consisting of a kinodynamic single-rigid-body (Kino-SRB) model and a design-aware control policy, and injects them into CMA-ES via mean shift with cosine-annealed step decay. On a 4-DOF design space of a hopping robot with unidirectional parallel spring, SurGE achieves 6 times lower cross-seed standard deviation and 18% tighter population concentration compared to vanilla CMA-ES, while matching or improving the best objective. Hardware experiments on a 2D design subspace show that, starting from a hand-tuned initial design, SurGE reduces the design objective by 37.65% on hardware, with the improvement trend identified in simulation transferring consistently to the physical system. SurGE provides the potential to accelerate non-differentiable co-design problems in legged robots via surrogate model gradients.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SurGE, a framework for co-design of legged robots with parallel elasticity that computes surrogate gradients via a kinodynamic single-rigid-body (Kino-SRB) model and design-aware control policy, then injects them into CMA-ES using mean-shift with cosine-annealed decay. On a 4-DOF hopping-robot design space with unidirectional springs, it reports 6x lower cross-seed standard deviation and 18% tighter population concentration than vanilla CMA-ES while matching or improving the best objective value. Hardware trials on a 2D design subspace, starting from a hand-tuned design, show a 37.65% reduction in the objective with consistent sim-to-real transfer.
Significance. If the Kino-SRB surrogate gradients remain sufficiently aligned with the true non-differentiable objective, the approach offers a practical way to accelerate evolutionary co-design of elastic legged robots by combining differentiable approximations with population-based search. The explicit hardware validation on a physical 2D subspace is a concrete strength, as is the focus on unidirectional parallel springs, which are common in real mechanisms. The method could generalize to other contact-rich co-design problems if the gradient-faithfulness assumption holds.
major comments (3)
- [Method (surrogate gradient computation and mean-shift step)] Method section on surrogate gradient injection: the central performance claims (6x lower cross-seed std, 18% tighter concentration, 37.65% hardware improvement) rest on the unverified assumption that gradients from the Kino-SRB model plus design-aware policy are aligned with the true objective; no cosine similarity, directional error, or finite-difference comparison against the full simulator or hardware is reported, despite the acknowledged discontinuities from contacts and unidirectional spring engagement.
- [Results (4-DOF hopping robot experiments)] Results on 4-DOF simulation experiments: the quantitative gains are stated without error bars, number of independent seeds, or statistical tests, so it is impossible to determine whether the reported 6x std reduction and 18% concentration improvement are robust or sensitive to post-hoc seed selection.
- [Hardware experiments] Hardware experiments paragraph: the 37.65% objective reduction is measured on a 2D subspace starting from a hand-tuned design, but no details are given on how many physical trials were performed, what variance was observed, or whether the design-aware policy was deployed on hardware versus simulation only.
minor comments (2)
- [Abstract] Abstract: the final sentence uses 'provides the potential'; rephrase to 'offers the potential' for standard academic tone.
- [Introduction / Method] Notation: the acronym Kino-SRB is introduced without an explicit expansion on first use in the main text.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the need for explicit validation of surrogate gradient alignment, improved statistical reporting in simulation results, and additional details on hardware experiments. We address each major comment below and will incorporate revisions to strengthen the manuscript.
read point-by-point responses
-
Referee: [Method (surrogate gradient computation and mean-shift step)] Method section on surrogate gradient injection: the central performance claims (6x lower cross-seed std, 18% tighter concentration, 37.65% hardware improvement) rest on the unverified assumption that gradients from the Kino-SRB model plus design-aware policy are aligned with the true objective; no cosine similarity, directional error, or finite-difference comparison against the full simulator or hardware is reported, despite the acknowledged discontinuities from contacts and unidirectional spring engagement.
Authors: We agree that direct quantitative verification of gradient alignment (such as cosine similarity or finite-difference comparisons) is not provided in the current manuscript. The empirical improvements in search efficiency and hardware transfer serve as indirect validation, but to directly address this, we will add a new analysis subsection (e.g., in Methods or an appendix) computing directional alignment metrics on sampled designs using finite differences from the full simulator. This will include cosine similarities and error statistics to quantify how well the Kino-SRB surrogate tracks the true objective direction. revision: yes
-
Referee: [Results (4-DOF hopping robot experiments)] Results on 4-DOF simulation experiments: the quantitative gains are stated without error bars, number of independent seeds, or statistical tests, so it is impossible to determine whether the reported 6x std reduction and 18% concentration improvement are robust or sensitive to post-hoc seed selection.
Authors: The simulation results were generated using 10 independent random seeds per method. We will revise the Results section and all relevant figure captions to explicitly state the number of seeds, include error bars (standard deviation across seeds), and add statistical tests (paired t-tests or Wilcoxon rank-sum) to confirm significance of the reported reductions in standard deviation and improvements in population concentration. revision: yes
-
Referee: [Hardware experiments] Hardware experiments paragraph: the 37.65% objective reduction is measured on a 2D subspace starting from a hand-tuned design, but no details are given on how many physical trials were performed, what variance was observed, or whether the design-aware policy was deployed on hardware versus simulation only.
Authors: The hardware validation used the design-aware policy transferred directly to the onboard controller and performed 5 repeated physical trials per evaluated design on the 2D subspace, with observed objective variance below 5% of the mean value across trials. We will expand the Hardware Experiments section to report the exact number of trials, measured variance, and explicit confirmation of hardware deployment of the policy, along with any additional sim-to-real consistency metrics. revision: yes
Circularity Check
No circularity; algorithmic surrogate injection is independent of fitted inputs
full rationale
The derivation chain consists of an explicit algorithmic procedure: a Kino-SRB model plus design-aware policy produces surrogate gradients that are then used inside a modified CMA-ES update (mean-shift with cosine annealing). No equation or claim reduces a reported performance metric (cross-seed std, population concentration, hardware objective) to a quantity fitted from the same data by construction. No self-citation is invoked as a uniqueness theorem or load-bearing premise. The central claim remains an empirical demonstration that the surrogate pipeline can be injected into an existing evolutionary optimizer; the paper does not rename or re-derive its own outputs as predictions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The kinodynamic single-rigid-body model and design-aware control policy produce surrogate gradients faithful enough to guide CMA-ES on the true non-differentiable objective.
Reference graph
Works this paper leans on
-
[1]
A comparison of series and parallel elasticity in a monoped hopper,
Y . Yesilevskiy, W. Xi, and C. D. Remy, “A comparison of series and parallel elasticity in a monoped hopper,” in2015 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2015, pp. 1036–1041
2015
-
[2]
Parallel stiffness in a bounding quadruped with flexible spine,
G. A. Folkertsma, S. Kim, and S. Stramigioli, “Parallel stiffness in a bounding quadruped with flexible spine,” in2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2012, pp. 2210–2215
2012
-
[3]
A novel optimization design of dual-slide parallel elastic actuator for legged robots,
S. Liu, J. Ding, C. Lu, Z. Wang, B. Su, and Z. Guo, “A novel optimization design of dual-slide parallel elastic actuator for legged robots,”IEEE/ASME Transactions on Mechatronics, vol. 29, no. 4, pp. 2886–2894, 2024
2024
-
[4]
Design and verification of a parallel elastic robotic leg,
E. Tanfener, O. K. Karag ¨oz, S. S. Candan, A. E. Turgut, Y . Yazıcıoglu, M. M. Ankaralı, and U. Saranlı, “Design and verification of a parallel elastic robotic leg,”Bioinspiration & Biomimetics, vol. 19, no. 2, p. 026014, 2024
2024
-
[5]
SPEAR: a monopedal robot with switchable parallel elastic actuation,
X. Liu, A. Rossi, and I. Poulakakis, “SPEAR: a monopedal robot with switchable parallel elastic actuation,” in2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2015, pp. 5142–5147
2015
-
[6]
Birdbot achieves energy-efficient gait with minimal control using avian-inspired leg clutching,
A. Badri-Spr ¨owitz, A. Aghamaleki Sarvestani, M. Sitti, and M. A. Daley, “Birdbot achieves energy-efficient gait with minimal control using avian-inspired leg clutching,”Science Robotics, vol. 7, no. 64, p. eabg4055, 2022
2022
-
[7]
A versatile co-design approach for dynamic legged robots,
T. Dinev, C. Mastalli, V . Ivan, S. Tonneau, and S. Vijayakumar, “A versatile co-design approach for dynamic legged robots,” in2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 10 343–10 349
2022
-
[8]
Meta reinforcement learning for optimal design of legged robots,
´A. Belmonte-Baeza, J. Lee, G. Valsecchi, and M. Hutter, “Meta reinforcement learning for optimal design of legged robots,”IEEE Robotics and Automation Letters, vol. 7, no. 4, pp. 12 134–12 141, 2022
2022
-
[9]
Learning-based design and control for quadrupedal robots with parallel-elastic actuators,
F. Bjelonic, J. Lee, P. Arm, D. Sako, D. Tateo, J. Peters, and M. Hutter, “Learning-based design and control for quadrupedal robots with parallel-elastic actuators,”IEEE Robotics and Automation Letters, vol. 8, no. 3, pp. 1611–1618, 2023
2023
-
[10]
Engineering compliance in legged robots via robust co-design,
G. Bravo-Palacios, H. Li, and P. M. Wensing, “Engineering compliance in legged robots via robust co-design,”IEEE/ASME Transactions on Mechatronics, 2024
2024
-
[11]
Computational design of energy-efficient legged robots: Optimizing for size and actuators,
G. Fadini, T. Flayols, A. Del Prete, N. Mansard, and P. Sou `eres, “Computational design of energy-efficient legged robots: Optimizing for size and actuators,” in2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 9898–9904
2021
-
[12]
Tutorial on amortized optimization,
B. Amos, “Tutorial on amortized optimization,”Foundations and Trends in Machine Learning, vol. 16, no. 5, pp. 592–732, 2023
2023
-
[13]
The CMA evolution strategy: a comparing review,
N. Hansen, “The CMA evolution strategy: a comparing review,” Towards a new evolutionary computation: Advances in the estimation of distribution algorithms, pp. 75–102, 2006
2006
-
[14]
Vitruvio: An open-source leg design optimization toolbox for walking robots,
M. Chadwick, H. Kolvenbach, F. Dubois, H. F. Lau, and M. Hutter, “Vitruvio: An open-source leg design optimization toolbox for walking robots,”IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 6318–6325, 2020
2020
-
[15]
An introduction to zero-order optimization techniques for robotics,
A. Jordana, J. Zhang, J. Amigo, and L. Righetti, “An introduction to zero-order optimization techniques for robotics,”arXiv preprint arXiv:2506.22087, 2025
-
[16]
Efficient adjoint-based design optimization with optimal control,
S. He, S. Kaneko, M. Howell, N. Li, and J. R. Martins, “Efficient adjoint-based design optimization with optimal control,”arXiv preprint arXiv:2602.15242, 2026
-
[17]
Brax - a differentiable physics engine for large scale rigid body simulation,
C. D. Freeman, E. Frey, A. Raichuk, S. Girgin, I. Mordatch, and O. Bachem, “Brax - a differentiable physics engine for large scale rigid body simulation,” inThirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1), 2021
2021
-
[18]
Efficient differentiable simulation of articulated bodies,
Y .-L. Qiao, J. Liang, V . Koltun, and M. C. Lin, “Efficient differentiable simulation of articulated bodies,” inInternational Conference on Machine Learning. PMLR, 2021, pp. 8661–8671
2021
-
[19]
Accelerated policy learning with parallel differen- tiable simulation,
J. Xu, M. Macklin, V . Makoviychuk, Y . Narang, A. Garg, F. Ramos, and W. Matusik, “Accelerated policy learning with parallel differen- tiable simulation,” inInternational Conference on Learning Represen- tations, 2022
2022
-
[20]
Learning deployable locomotion control via differentiable simulation,
C. Schwarke, V . Klemm, J. Bagajo, J. P. Sleiman, I. Georgiev, J. T. Torres, and M. Hutter, “Learning deployable locomotion control via differentiable simulation,” in9th Annual Conference on Robot Learning, 2025
2025
-
[21]
Do dif- ferentiable simulators give better policy gradients?
H. J. Suh, M. Simchowitz, K. Zhang, and R. Tedrake, “Do dif- ferentiable simulators give better policy gradients?” inInternational Conference on Machine Learning. PMLR, 2022, pp. 20 668–20 696
2022
-
[22]
Gradi- ents are not all you need,
L. Metz, C. D. Freeman, S. S. Schoenholz, and T. Kachman, “Gradi- ents are not all you need,”arXiv preprint arXiv:2111.05803, 2021
-
[23]
Learning quadruped locomotion using differentiable simulation,
Y . Song, S. Kim, and D. Scaramuzza, “Learning quadruped locomotion using differentiable simulation,” in8th Annual Conference on Robot Learning, 2024
2024
-
[24]
Guided evolutionary strategies: Augmenting random search with surrogate gradients,
N. Maheswaranathan, L. Metz, G. Tucker, D. Choi, and J. Sohl- Dickstein, “Guided evolutionary strategies: Augmenting random search with surrogate gradients,” inInternational Conference on Machine Learning. PMLR, 2019, pp. 4264–4273
2019
-
[25]
Injecting external solutions into CMA-ES,
N. Hansen, “Injecting external solutions into CMA-ES,” INRIA, Research Report RR-7748, Oct. 2011
2011
-
[26]
Making use of design-aware policy optimiza- tion in legged-robotics co-design,
G. Fadini and S. Coros, “Making use of design-aware policy optimiza- tion in legged-robotics co-design,” inProceedings of the Morphology- Aware Policy and Design Learning (MAPoDeL) Workshop at CoRL, 2024
2024
-
[27]
Robust co-design: Coupling morphology and feedback design through stochastic programming,
G. Bravo-Palacios, G. Grandesso, A. D. Prete, and P. M. Wensing, “Robust co-design: Coupling morphology and feedback design through stochastic programming,”Journal of Dynamic Systems, Measurement, and Control, vol. 144, no. 2, p. 021007, 2022
2022
-
[28]
Learning quadrupedal locomotion over challenging terrain,
J. Lee, J. Hwangbo, L. Wellhausen, V . Koltun, and M. Hutter, “Learning quadrupedal locomotion over challenging terrain,”Science robotics, vol. 5, no. 47, p. eabc5986, 2020
2020
-
[29]
Deep whole-body control: learning a unified policy for manipulation and locomotion,
Z. Fu, X. Cheng, and D. Pathak, “Deep whole-body control: learning a unified policy for manipulation and locomotion,” inConference on Robot Learning. PMLR, 2023, pp. 138–149
2023
-
[30]
Proximal Policy Optimization Algorithms
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,”arXiv preprint arXiv:1707.06347, 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[31]
Real-time model predictive control for versatile dynamic motions in quadrupedal robots,
Y . Ding, A. Pandala, and H.-W. Park, “Real-time model predictive control for versatile dynamic motions in quadrupedal robots,” in2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 8484–8490
2019
-
[32]
Kinodynamic model predictive control for energy efficient locomotion of legged robots with parallel elasticity,
Y . Zhuang, Y . Wang, and Y . Ding, “Kinodynamic model predictive control for energy efficient locomotion of legged robots with parallel elasticity,” in2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 12 365–12 371
2025
-
[33]
Isaac Gym: High performance GPU based physics simulation for robot learning,
V . Makoviychuk, L. Wawrzyniak, Y . Guo, M. Lu, K. Storey, M. Mack- lin, D. Hoeller, N. Rudin, A. Allshire, A. Handa, and G. State, “Isaac Gym: High performance GPU based physics simulation for robot learning,” inThirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021
2021
-
[34]
Saltation matrices: The essential tool for linearizing hybrid dynamical systems,
N. J. Kong, J. J. Payne, J. Zhu, and A. M. Johnson, “Saltation matrices: The essential tool for linearizing hybrid dynamical systems,” Proceedings of the IEEE, vol. 112, no. 6, pp. 585–608, 2024
2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.