Actuator reality shaping uses a 2DOF controller to align real actuator closed-loop behavior with idealized simulation reference dynamics, enabling zero-shot sim-to-real policy deployment across multiple robot platforms.
hub
Learning robust perceptive locomotion for quadrupedal robots in the wild
18 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 2representative citing papers
Perceptive BFM grounds human motion priors in robot terrain perception via terrain-conformal reference synthesis and teacher-student transfer from adapted to raw-reference tracking.
HORIZON is a recoverability-governed checkpointed frontier curriculum for on-policy physical-domain scaling on quadruped locomotion that identifies three regularities: uneven widening, non-monotonic composition, and the necessity of joint on-policy interaction.
Per-Frame Deep Sets enables scaling single-sphere to five-sphere transport on a quadruped by performing permutation-invariant pooling within each history frame, reaching 100% no-drop success in simulation where standard encoders plateau.
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.
Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
A framework using 3D Gaussian Splatting for visual domain randomization enables robust monocular RGB-based dexterous in-hand reorientation on real hardware for multiple objects under varied lighting.
T-GMP learns a terrain-conditioned latent motion manifold via CVAE from demonstrations and integrates it into an adversarial pipeline with a foothold penalty for versatile, natural humanoid locomotion.
Integrating foot position maps into heightmaps and adding a locomotion-stability reward in an attention-based RL framework improves quadrupedal success rates on both trained and out-of-domain complex terrains.
An end-to-end policy learns robust humanoid locomotion directly from noisy depth images via high-fidelity sensor simulation, vision-aware distillation from privileged maps, and terrain-specific multi-critic reward shaping.
Empirical comparison of blind, critic-perceptive, and fully perceptive variants of morphology-aware RL locomotion controllers shows critic-only perception improves robustness over blind baselines while remaining more stable under perception noise than full perception.
A DRL locomotion controller extended from prior quadruped work enabled the Go2-W robot to complete 2.8 km of autonomous real-world navigation including mixed terrain and stairs.
A multi-channel terrain affordance reward combined with lower-body compliance training via virtual wrenches enables end-to-end PPO-trained humanoid policies to walk at 1 m/s on 0.2 m risers with improved payload robustness.
Excessive sim2real focus impedes robotics policy learning via simulator lock-in; a kinematics-only sim2sim2real paradigm is proposed to restore exploration.
SDPG is a new on-policy visual RL algorithm that estimates gradients via stochastic perturbations of rollouts, achieving faster training and lower memory use than baselines on visual MuJoCo tasks while adding new robotics benchmarks and sim-to-real results.
Tuned foot compliance in quadruped robots lowers locomotion energy consumption by roughly 17 percent relative to rigid or overly soft designs.
Develops and tests a model-based RL controller with post-training for gait in a tendon-driven soft quadruped, reporting improved efficiency and robustness over benchmarks.
Swarm robots navigate unknown environments using goal direction and neighbor positions only, with mathematical validation, potential-field simulations, and sound-field robot experiments.
citing papers explorer
-
Actuator Reality Shaping for Zero-Shot Sim-to-Real Robot Learning
Actuator reality shaping uses a 2DOF controller to align real actuator closed-loop behavior with idealized simulation reference dynamics, enabling zero-shot sim-to-real policy deployment across multiple robot platforms.
-
Perceptive Behavior Foundation Model: Adapting Human Motion Priors to Robot-Centric Terrain
Perceptive BFM grounds human motion priors in robot terrain perception via terrain-conformal reference synthesis and teacher-student transfer from adapted to raw-reference tracking.
-
HORIZON: Recoverability-Governed Curriculum for Physical-Domain Scaling
HORIZON is a recoverability-governed checkpointed frontier curriculum for on-policy physical-domain scaling on quadruped locomotion that identifies three regularities: uneven widening, non-monotonic composition, and the necessity of joint on-policy interaction.
-
S2M-Trek: From Single to Multi-Sphere Transport via Per-Frame Deep Sets on a Wheel-Legged Robot
Per-Frame Deep Sets enables scaling single-sphere to five-sphere transport on a quadruped by performing permutation-invariant pooling within each history frame, reaching 100% no-drop success in simulation where standard encoders plateau.
-
ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.
-
ViserDex: Visual Sim-to-Real for Robust Dexterous In-hand Reorientation
A framework using 3D Gaussian Splatting for visual domain randomization enables robust monocular RGB-based dexterous in-hand reorientation on real hardware for multiple objects under varied lighting.
-
T-GMP: Terrain-conditioned Generative Motion Priors for Versatile and Natural Humanoid Locomotion
T-GMP learns a terrain-conditioned latent motion manifold via CVAE from demonstrations and integrates it into an adversarial pipeline with a foothold penalty for versatile, natural humanoid locomotion.
-
Learning Locomotion on Complex Terrain for Quadrupedal Robots with Foot Position Maps and Stability Rewards
Integrating foot position maps into heightmaps and adding a locomotion-stability reward in an attention-based RL framework improves quadrupedal success rates on both trained and out-of-domain complex terrains.
-
Now You See That: Learning End-to-End Humanoid Locomotion from Raw Pixels
An end-to-end policy learns robust humanoid locomotion directly from noisy depth images via high-fidelity sensor simulation, vision-aware distillation from privileged maps, and terrain-specific multi-critic reward shaping.
-
Learning Perceptive Platform Adaptive Locomotion Controllers for Quadrupedal Robots
Empirical comparison of blind, critic-perceptive, and fully perceptive variants of morphology-aware RL locomotion controllers shows critic-only perception improves robustness over blind baselines while remaining more stable under perception noise than full perception.
-
Long-Distance Real-World Navigation of the Legged-Wheeled Robot Go2-W Using Deep Reinforcement Learning
A DRL locomotion controller extended from prior quadruped work enabled the Go2-W robot to complete 2.8 km of autonomous real-world navigation including mixed terrain and stairs.
-
TACT-ful: Multi-Channel Terrain Affordance and Compliance Training for Payload-Robust Perceptive Humanoid Locomotion
A multi-channel terrain affordance reward combined with lower-body compliance training via virtual wrenches enables end-to-end PPO-trained humanoid policies to walk at 1 m/s on 0.2 m risers with improved payload robustness.
-
Too Much of a Good Thing: When sim2real Efforts Impede Policy Learning (And What to Do About It)
Excessive sim2real focus impedes robotics policy learning via simulator lock-in; a kinematics-only sim2sim2real paradigm is proposed to restore exploration.
-
Efficient On-policy Visual-RL via Stochastic Decoupled Policy Gradient
SDPG is a new on-policy visual RL algorithm that estimates gradients via stochastic perturbations of rollouts, achieving faster training and lower memory use than baselines on visual MuJoCo tasks while adding new robotics benchmarks and sim-to-real results.
-
Energy-Efficient Quadruped Locomotion with Compliant Feet
Tuned foot compliance in quadruped robots lowers locomotion energy consumption by roughly 17 percent relative to rigid or overly soft designs.