Distillation-PPO: A novel two-stage reinforcement learning framework for humanoid robot perceptive locomotion
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.RO (2)
years: 2025 (2)
citing papers
- DreamPolicy: A Unified World-model Policy for Scalable Humanoid Locomotion
  DreamPolicy integrates an autoregressive diffusion world model with policy learning to produce a single scalable policy that generalizes to unseen composite terrains for humanoid locomotion.
- Learning Agile Striker Skills for Humanoid Soccer Robots from Noisy Sensory Input
  A four-stage RL system with teacher-student distillation and online constrained adaptation enables humanoid robots to achieve robust ball-kicking accuracy under noisy perception in simulation and on physical hardware.