arxiv: 2505.10022 · v4 · pith:54WUPMOGnew · submitted 2025-05-15 · 💻 cs.RO

APEX: Action Priors Enable Efficient Exploration for Robust Motion Tracking on Legged Robots

classification 💻 cs.RO
keywords: apex, motion, action, deployment, exploration, learning, priors, reference

Learning natural, animal-like locomotion from demonstrations has become a core paradigm in legged robotics. While motion tracking can reproduce reference gaits, many approaches still require substantial tuning and depend on reference motion inputs at deployment, which can limit responsiveness to task objectives and reduce adaptability. We present APEX (Action Priors enable Efficient eXploration), a motion-tracking reinforcement learning (RL) framework that removes deployment-time dependence on reference motion inputs, improves sample efficiency, and reduces tuning effort. APEX integrates demonstrations into RL via decaying action priors, which guide early exploration toward demonstration-consistent actions and then fade to zero, yielding a pure RL policy at deployment. This is combined with a multi-critic framework that separates style and task + regularization learning signals. Moreover, APEX enables a single policy to learn diverse motions and transfer reference-like styles across different terrains and velocities, while remaining robust to variations in training parameters. We validate our method in simulation on both humanoid and quadruped robots, and with zero-shot deployment on a Unitree Go2 robot. Website and code: https://marmotlab.github.io/APEX/.
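The decaying action prior described in the abstract can be illustrated with a minimal sketch. The abstract does not give the decay schedule or say how the prior is combined with the policy output, so the linear schedule, the additive blend, and the names `prior_weight`, `blend_action`, and `decay_steps` are all assumptions made here for illustration, not the authors' implementation.

```python
from typing import Sequence, List


def prior_weight(step: int, decay_steps: int, initial_weight: float = 1.0) -> float:
    """Weight on the action prior at a given training step.

    Assumption: a linear decay from `initial_weight` to 0 over `decay_steps`.
    The abstract only states that the prior fades to zero.
    """
    if decay_steps <= 0:
        return 0.0
    fraction = min(max(step / decay_steps, 0.0), 1.0)
    return initial_weight * (1.0 - fraction)


def blend_action(
    policy_action: Sequence[float],
    prior_action: Sequence[float],
    step: int,
    decay_steps: int,
) -> List[float]:
    """Bias the policy action toward a demonstration-derived prior action.

    Assumption: the prior is added to the policy output, scaled by the
    decaying weight. Once the weight reaches 0, the result is the policy
    action alone, so no reference input is needed at that point.
    """
    if len(policy_action) != len(prior_action):
        raise ValueError("policy_action and prior_action must have equal length")
    weight = prior_weight(step, decay_steps)
    return [p + weight * q for p, q in zip(policy_action, prior_action)]


if __name__ == "__main__":
    policy_action = [0.1, -0.2]   # hypothetical policy output
    prior_action = [1.0, 2.0]     # hypothetical demonstration-derived prior
    for step in (0, 50, 100, 250):
        print(step, blend_action(policy_action, prior_action, step, decay_steps=100))
```

Early in training the prior dominates the executed action; after `decay_steps` the weight is exactly zero and the executed action equals the policy output, which matches the abstract's claim of a pure RL policy at deployment.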

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. X-Morph: Human Motion Priors for Scalable Robot Learning Across Morphologies

    cs.RO 2026-06 unverdicted novelty 6.0

    X-Morph retargets human motions to kinematically plausible references for multiple legged morphologies, trains privileged RL trackers, and distills them into deployable policies that generalize and enable teleoperatio...