OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction

Angjoo Kanazawa; Carmelo Sferrazza; C. Karen Liu; Guanya Shi; Lujie Yang; Pieter Abbeel; Rocky Duan; Xiaoyu Huang; Zhen Wu

arxiv: 2509.26633 · v3 · pith:NTGW3B7Pnew · submitted 2025-09-30 · cs.RO · cs.AI · cs.LG · cs.SY · eess.SY

Lujie Yang, Xiaoyu Huang, Zhen Wu, Angjoo Kanazawa, Pieter Abbeel, Carmelo Sferrazza, C. Karen Liu, Rocky Duan, Guanya Shi

keywords: data, omniretarget, humanoid, kinematic, loco-manipulation, retargeting, contact, enables

A dominant paradigm for teaching humanoid robots complex skills is to retarget human motions as kinematic references to train reinforcement learning (RL) policies. However, existing retargeting pipelines often struggle with the significant embodiment gap between humans and robots, producing physically implausible artifacts such as foot-skating and penetration. More importantly, common retargeting methods neglect the rich human-object and human-environment interactions essential for expressive locomotion and loco-manipulation. To address this, we introduce OmniRetarget, an interaction-preserving data generation engine based on an interaction mesh that explicitly models and preserves the crucial spatial and contact relationships between an agent, the terrain, and manipulated objects. By minimizing the Laplacian deformation between the human and robot meshes while enforcing kinematic constraints, OmniRetarget generates kinematically feasible trajectories. Moreover, preserving task-relevant interactions enables efficient data augmentation, from a single demonstration to different robot embodiments, terrains, and object configurations. We comprehensively evaluate OmniRetarget by retargeting motions from OMOMO, LAFAN1, and our in-house MoCap datasets, generating over 8 hours of trajectories that achieve better kinematic constraint satisfaction and contact preservation than widely used baselines. Such high-quality data enables proprioceptive RL policies to successfully execute long-horizon (up to 30 seconds) parkour and loco-manipulation skills on a Unitree G1 humanoid, trained with only 5 reward terms and simple domain randomization shared by all tasks, without any learning curriculum.
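To make the "Laplacian deformation" idea in the abstract concrete, here is a minimal sketch and not the authors' implementation. It assumes uniform neighbor weights and a hand-specified neighbor list; the abstract does not say how the interaction mesh is built, how it is weighted, or how the kinematic constraints are enforced, so `laplacian_coords`, `deformation_energy`, and the four-point mesh are all hypothetical. Each vertex is described by its offset from the mean of its neighbors, and the energy is the squared difference between those offsets in two poses of the same mesh.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def laplacian_coords(points: Sequence[Point],
                     neighbors: Dict[int, List[int]]) -> List[Point]:
    """Uniform-weight Laplacian coordinates: vertex minus mean of its neighbors."""
    coords = []
    for i, p in enumerate(points):
        nbrs = neighbors[i]
        mean = tuple(sum(points[j][k] for j in nbrs) / len(nbrs) for k in range(3))
        coords.append(tuple(p[k] - mean[k] for k in range(3)))
    return coords


def deformation_energy(source: Sequence[Point],
                       target: Sequence[Point],
                       neighbors: Dict[int, List[int]]) -> float:
    """Sum of squared differences between the Laplacian coordinates of two poses."""
    src = laplacian_coords(source, neighbors)
    tgt = laplacian_coords(target, neighbors)
    return sum((s[k] - t[k]) ** 2 for s, t in zip(src, tgt) for k in range(3))


# Hypothetical 4-point interaction mesh: two body keypoints, a ground point,
# and an object point, with every vertex connected to every other vertex.
human = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.0), (0.0, 0.0, 0.0), (0.5, 0.0, 0.5)]
neighbors = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}

# A rigid translation keeps all relative offsets, so the energy stays near zero.
shifted = [(x + 1.0, y, z + 0.5) for x, y, z in human]

# Moving only the object point changes its relation to the body and the ground.
moved = list(human)
moved[3] = (0.5, 0.0, 0.9)

print(deformation_energy(human, shifted, neighbors))
print(deformation_energy(human, moved, neighbors))
```

Under these assumptions the energy ignores where the whole scene sits in space and penalizes only changes in the relative layout of body, terrain, and object points, which is the property the abstract relies on when it speaks of preserving spatial and contact relationships. A retargeting solver would minimize such an energy over robot joint angles subject to constraints, a step this sketch does not attempt.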

Forward citations

Cited by 13 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ReActor: Reinforcement Learning for Physics-Aware Motion Retargeting
cs.RO · 2026-05 · unverdicted · novelty 7.0
ReActor jointly optimizes motion retargeting and RL policy training with an approximate gradient to generate physically consistent robot motions from human references using only sparse body correspondences.

Physics-Informed Reinforcement Learning of Spatial Density Velocity Potentials for Map-Free Racing
cs.RO · 2026-04 · unverdicted · novelty 7.0
A DRL policy learns racing controls from depth spectral distributions using a non-geometric physics-informed reward, achieving 12% better performance than humans on out-of-distribution tracks with under 1% of baseline...

Rhythm: Learning Interactive Whole-Body Control for Dual Humanoids
cs.RO · 2026-03 · unverdicted · novelty 7.0
Rhythm transfers interactive whole-body behaviors from simulation to real dual Unitree G1 humanoids via interaction-aware retargeting and graph-reward RL.

Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors
cs.RO · 2026-05 · unverdicted · novelty 6.0
Imagine2Real enables zero-shot humanoid-object interaction by unifying motions as 4D point trajectories, tracking only base/hands/object keypoints inside a BFM latent space, and training with progressive simple reward...

CEER: Compliant End-Effector and Root Control as a Unified Interface for Hierarchical Humanoid Loco-Manipulation
cs.RO · 2026-05 · unverdicted · novelty 6.0
CEER proposes a compliant end-effector and root control interface that unifies loco-manipulation for humanoids via a distilled low-level policy and hierarchical planners.

SynAgent: Generalizable Cooperative Humanoid Manipulation via Solo-to-Cooperative Agent Synergy
cs.CV · 2026-04 · unverdicted · novelty 6.0
SynAgent enables generalizable cooperative humanoid manipulation by transferring skills from solo human-object interactions to multi-agent scenarios via interaction-preserving retargeting, single-agent pretraining wit...

Sumo: Dynamic and Generalizable Whole-Body Loco-Manipulation
cs.RO · 2026-04 · unverdicted · novelty 6.0
Test-time steering of pre-trained whole-body policies via sample-based planning lets legged robots generalize dynamic loco-manipulation to varied heavy objects and tasks without additional training or tuning.

Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching
cs.RO · 2026-02 · unverdicted · novelty 6.0
A modular system uses motion matching to compose long-horizon human skill chains, trains RL experts, and distills them into a depth-based policy that lets a Unitree G1 humanoid autonomously climb, vault, and roll over...

HAIC: Humanoid Agile Object Interaction Control via Dynamics-Aware World Model
cs.RO · 2026-02 · unverdicted · novelty 6.0
HAIC enables robust humanoid interactions with underactuated objects by predicting their dynamics from proprioceptive history and using a world model for adaptive control.

HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control
cs.RO · 2026-02 · conditional · novelty 6.0
HUSKY combines humanoid-skateboard dynamics modeling with adversarial motion priors and physics-guided lean-to-steer strategies to achieve real-world stable skateboarding on a humanoid robot.

Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors
cs.RO · 2026-05 · unverdicted · novelty 5.0
Imagine2Real is a zero-shot humanoid-object interaction method that unifies robot and object motion as 4D point trajectories, tracks only sparse keypoints inside a behavior foundation model latent space, and trains wi...

Humanoid Whole-Body Manipulation via Active Spatial Brain and Generalizable Action Cerebellum
cs.RO · 2026-05 · unverdicted · novelty 5.0
A multi-agent LLM framework for humanoid loco-manipulation that separates active spatial perception and task planning from generalizable action generation without task-specific real-robot data.

Learning Versatile Humanoid Manipulation with Touch Dreaming
cs.RO · 2026-04 · conditional · novelty 5.0
HTD, a multimodal transformer policy trained with behavioral cloning and touch dreaming to predict future tactile latents, achieves a 90.9% relative success rate improvement over baselines on five real-world contact-r...