Any2Any transfers pretrained humanoid whole-body tracking policies to new embodiments with 1% of original training cost via kinematic alignment and parameter-efficient fine-tuning.
Omnixtreme: Breaking the generality barrier in high- dynamic humanoid control, 2026a
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
Imagine2Real enables zero-shot humanoid-object interaction by unifying motions as 4D point trajectories, tracking only base/hands/object keypoints inside a BFM latent space, and training with progressive simple rewards for mocap deployment.
EgoKit is a new toolkit and accessory set that unifies egocentric video collection with wrist views across heterogeneous consumer devices using a consistent interface and log format.
ExoActor uses exocentric video generation to implicitly model robot-environment-object interactions and converts the resulting videos into task-conditioned humanoid control sequences.
citing papers explorer
-
Any2Any: Efficient Cross-Embodiment Transfer for Humanoid Whole-Body Tracking
Any2Any transfers pretrained humanoid whole-body tracking policies to new embodiments with 1% of original training cost via kinematic alignment and parameter-efficient fine-tuning.
-
Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors
Imagine2Real enables zero-shot humanoid-object interaction by unifying motions as 4D point trajectories, tracking only base/hands/object keypoints inside a BFM latent space, and training with progressive simple rewards for mocap deployment.
-
EgoKit: Towards Unified Low-Cost Egocentric Data Collection with Heterogeneous Devices
EgoKit is a new toolkit and accessory set that unifies egocentric video collection with wrist views across heterogeneous consumer devices using a consistent interface and log format.
-
ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control
ExoActor uses exocentric video generation to implicitly model robot-environment-object interactions and converts the resulting videos into task-conditioned humanoid control sequences.