Uni-Mo generates 7,488 language-annotated quadruped motions via LLM prompts and video diffusion, lifts them to 3D trajectories, and trains policies achieving 96.7% real-robot success on 392 sampled motions.
Exbody2: Advanced expressive humanoid whole-body control
27 Pith papers cite this work. Polarity classification is still indexing.
roles
background 2
polarities
background 2
representative citing papers
BeyondMimic combines compact motion tracking with a unified guided latent diffusion model to master diverse agile behaviors from human demos and solve unseen downstream tasks via test-time classifier guidance.
cuRoboV2 unifies B-spline optimization, GPU-native dense signed distance fields, and scalable whole-body kinematics and dynamics to achieve 99.7% success on payloaded manipulators and 99.6% collision-free IK on 48-DoF humanoids.
citing papers explorer
-
CWI: Composite Humanoid Whole-Body Imitation System for Loco-manipulation
CWI decouples MoCap data for upper-body manipulation and lower-body locomotion, using dual discriminators and multi-critic training plus distillation to produce a policy that works from hand poses and velocity commands alone.
-
PressMimic: Pressure-Guided Motion Capture and Control for Humanoid Robot Imitation
PressMimic fuses RGB and pressure for pose estimation via FRAPPE++ and uses pressure signals in RL policy PSP, backed by the MotionPRO dataset, to achieve physically consistent humanoid motion imitation.
-
CoorDex: Coordinating Body and Hand Priors for Continuous Dexterous Humanoid Loco-Manipulation
CoorDex distills privileged body and hand motion teachers into proprioceptive latent priors and composes them via shared-context residual RL heads to enable continuous high-DoF dexterous loco-manipulation.
-
Stubborn: A Streamlined and Unified Reinforcement Learning Framework for Robust Motion Tracking and Fall Recovery for Humanoids
Stubborn introduces a unified RL framework with yaw-aligned representation, Bernoulli probabilistic termination, and adaptive sampling for robust humanoid motion tracking and fall recovery.
-
MotionWAM: Towards Foundation World Action Models for Real-Time Humanoid Loco-Manipulation
MotionWAM conditions a policy on intermediate features from a video world model to predict unified whole-body motion tokens, enabling real-time humanoid loco-manipulation that outperforms VLA baselines by over 30% on nine Unitree G1 tasks.
-
LIMMT: Less is More for Motion Tracking
A data-centric approach shows that less than 3% of AMASS motion data, filtered by physics feasibility, diversity, and complexity, yields better humanoid tracking policies than the full dataset.
-
Bionic Human-Motion Style Transfer for Physically Executable Whole-Body Control of Humanoid Robots
A multi-condition latent diffusion model transfers human motion styles to diverse humanoid robot contents with physics regularizations, achieving 96% success in real-robot trials on Unitree G1.
-
Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors
Imagine2Real enables zero-shot humanoid-object interaction by unifying motions as 4D point trajectories, tracking only base/hands/object keypoints inside a BFM latent space, and training with progressive simple rewards for mocap deployment.
-
CEER: Compliant End-Effector and Root Control as a Unified Interface for Hierarchical Humanoid Loco-Manipulation
CEER proposes a compliant end-effector and root control interface that unifies loco-manipulation for humanoids via a distilled low-level policy and hierarchical planners.
-
VOFA: Visual Object Goal Pushing with Force-Adaptive Control for Humanoids
VOFA combines a high-level visuomotor policy with a low-level force-adaptive controller to let humanoids push objects up to 17 kg to arbitrary goals using only noisy onboard vision, achieving over 80% real-world success.
-
Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot
A weightlessness mechanism enables humanoid robots to dynamically relax joints for stable, contact-rich motions across diverse environments without task-specific tuning.
-
Learning Whole-Body Humanoid Locomotion via Motion Generation and Motion Tracking
A diffusion-based motion generator combined with an RL motion tracker enables terrain-aware whole-body locomotion on a humanoid robot by adapting reference motions online from perception.
-
Learning to Assist: Physics-Grounded Human-Human Control via Multi-Agent Reinforcement Learning
AssistMimic is the first multi-agent RL method that successfully tracks assistive human-human interaction motions in simulation by using partner-aware policies, single-agent initialization, dynamic reference retargeting, and contact-promoting rewards.
-
TeleGate: Whole-Body Humanoid Teleoperation via Gated Expert Selection with Motion Prior
TeleGate achieves high-precision real-time whole-body teleoperation of humanoid robots by dynamically gating between expert policies and using a VAE motion prior to infer future intent from history, outperforming distillation baselines on dynamic motions with only 2.5 hours of mocap data.
-
Commanding Humanoid by Free-form Language: A Large Language Action Model with Unified Motion Vocabulary
Humanoid-LLA converts unconstrained natural language commands into stable whole-body motions for humanoid robots using a unified motion vocabulary and two-stage supervised-plus-reinforcement fine-tuning.
-
HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers
HANDOFF is a distilled mixture-of-experts humanoid whole-body controller that follows a compact task-space interface, matches SOTA velocity tracking, provides large manipulation workspace on Unitree G1, and supports VLM-driven agentic planning with no task-specific data.
-
M3imic: Learning a Versatile Whole-Body Controller for Multimodal Motion Mimicking
M3imic unifies heterogeneous motion modalities via encoders into a shared latent space for a single RL-trained whole-body controller achieving high sim success and sim-to-real transfer on Unitree G1.
-
Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking
Humanoid-GPT is a causal Transformer pre-trained on a unified billion-scale motion dataset that tracks dynamic behaviors with zero-shot generalization to unseen motions and tasks.
-
MuGen: Multi-Skill Generative Locomotion Controller for Humanoid Robots
MuGen learns a generative latent representation of multi-skill humanoid locomotion from heterogeneous human data using VQ-VAEs and RL, then distills a deployable policy that tracks unseen motions and reuses the latent space.
-
RecoverFormer: End-to-End Contact-Aware Recovery for Humanoid Robots
A single causal-transformer policy with latent recovery modes and contact-affordance prediction enables humanoid robots to recover from 100-300 N pushes with 100% success in simulation, generalizing zero-shot across wall distances, mass, friction, and latency changes.
-
Switch: Learning Agile Skills Switching for Humanoid Robots
Switch enables humanoid robots to perform agile, seamless transitions between locomotion skills via a kinematic skill graph, DRL tracking policy, and real-time graph-search scheduler.
-
UniCon: A Unified System for Efficient Robot Learning Transfers
UniCon standardizes states and control logic into modular execution graphs for efficient transfer of learning controllers across heterogeneous robots, with lower latency than ROS.
-
Toward Seamless Physical Human-Humanoid Interaction: Insights from Control, Intent, and Modeling with a Vision for What Comes Next
A literature review of pHHI that proposes a taxonomy of interaction types by modality and engagement level while outlining pathways to integrate control, intent, and modeling for more seamless humanoid-human collaboration.
-
RPG: Robust Policy Gating for Smooth Multi-Skill Transitions in Humanoid Fighting
RPG trains a unified humanoid robot policy using motion and temporal randomization to achieve smooth, stable transitions between fighting skills and locomotion.