hub
//arxiv.org/abs/1909.06586
10 Pith papers cite this work. Polarity classification is still indexing.
roles: background 2
citing papers explorer
- Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing
  Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
- Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain
  EP-based PPO with CPG and residual policies matches standard PPO performance on 12-DoF quadruped uneven-terrain locomotion while using 4.3 times less GPU memory during training.
- Constraint-Enhanced Reinforcement Learning Based on Dynamic Decoupled Spherical Radial Squashing
  DD-SRad is a new RL constraint technique that adapts per-actuator radii dynamically to achieve zero violations and unconstrained-level task performance on heterogeneous robotic joints.
- Trajectory-based actuator identification via differentiable simulation
  Differentiable simulation enables torque-sensor-free actuator model identification from trajectory data, achieving 1.88x better position tracking than a stand-trained baseline and 46% longer travel in downstream locomotion policies.
- Watch Your Step: Learning Semantically-Guided Locomotion in Cluttered Environment
  SemLoco is a reinforcement learning system that integrates semantic understanding with foothold planning to let legged robots navigate cluttered environments without stepping on sensitive low-lying objects.
- A Reconfigured Wheel-Legged Robot for Enhanced Steering and Adaptability
  FLORES is a wheel-legged robot with front-leg hip-yaw DoFs replacing hip-roll, paired with a custom RL controller using adapted HIM and tailored rewards for smooth wheeled-to-legged transitions and efficient gaits.
- Iteratively Learning Muscle Memory for Legged Robots to Master Adaptive and High Precision Locomotion
  Integrates iterative learning control with a torque library to enable high-precision adaptive locomotion on bipedal and quadrupedal robots, reducing tracking errors by up to 85% and achieving over 30x faster control rates.
- TAG-K: Tail-Averaged Greedy Kaczmarz for Computationally Efficient and Performant Online Inertial Parameter Estimation
  TAG-K combines greedy randomized Kaczmarz row selection with tail averaging to deliver faster convergence and noise robustness for online inertial parameter estimation in robotics.
- Right Model, Right Time: Real-Time Cascaded-Fidelity MPC for Bipedal Walking
  Multi-phase whole-body MPC for bipedal locomotion uses a detailed model for the near-term horizon and a simplified model later, is solved via acados SQP without preselected footsteps, and is validated in simulation.
- Quadruped Parkour Learning: Sparsely Gated Mixture of Experts with Visual Input
  Sparsely gated MoE policies double the success rate of a real Unitree Go2 quadruped on large-obstacle parkour versus matched-active-parameter MLP baselines while cutting inference time compared with a scaled-up MLP.