Highly dynamic quadruped locomotion via whole-body impulse control and model predictive control
5 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (5)
Verdicts: UNVERDICTED (5)
citing papers explorer
- Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain
  An equilibrium-propagation-based PPO controller for a 12-DoF quadruped achieves locomotion performance comparable to backpropagation-trained PPO on uneven terrain while using 4.3 times less GPU memory.
- Constraint-Enhanced Reinforcement Learning Based on Dynamic Decoupled Spherical Radial Squashing
  DD-SRad is a new RL constraint technique that adapts per-actuator radii dynamically to achieve zero violations and unconstrained-level task performance on heterogeneous robotic joints.
- Trajectory-based actuator identification via differentiable simulation
  Differentiable simulation enables torque-sensor-free actuator model identification from trajectory data, achieving 1.88x better position tracking than a stand-trained baseline and 46% longer travel in downstream locomotion policies.
- Right Model, Right Time: Real-Time Cascaded-Fidelity MPC for Bipedal Walking
  A real-time MPC for bipedal robots uses a detailed whole-body model near-term and a simplified rigid-body model later, solved with SQP in acados and tested in MuJoCo simulation on the HyPer-2 robot.
- Quadruped Parkour Learning: Sparsely Gated Mixture of Experts with Visual Input
  Sparsely gated MoE policies double the success rate of a real Unitree Go2 quadruped on large-obstacle parkour versus matched-active-parameter MLP baselines while cutting inference time compared with a scaled-up MLP.