X-Morph: Human Motion Priors for Scalable Robot Learning Across Morphologies

Arhaan Jain; Chengyang He; Guillaume Sartoretti; Ritwik Sharma; Shivam Sood; Shyam Charan Kesavamoorthi

arxiv: 2606.30290 · v1 · pith:2RKSWZQCnew · submitted 2026-06-29 · 💻 cs.RO

Ritwik Sharma, Shivam Sood, Arhaan Jain, Shyam Charan Kesavamoorthi, Chengyang He, Guillaume Sartoretti

Pith reviewed 2026-06-30 05:17 UTC · model grok-4.3

classification 💻 cs.RO

keywords human motion retargeting · legged robot learning · reinforcement learning · motion priors · cross-morphology transfer · locomotion policies · loco-manipulation

The pith

Human motion data can be retargeted to train deployable locomotion and loco-manipulation policies for quadrupeds, hexapods, and quadruped manipulators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a pipeline that adapts abundant human motion data to create control policies for legged robots with different body plans. It first converts human motions into references suited to each robot's structure while preserving intent, then trains a policy with privileged information to follow those references, and finally distills the policy into one that operates with only onboard observations. This approach matters because robot-specific motion data is scarce while human data is plentiful, opening a route to scalable behavior learning across embodiments. Evaluation on three distinct platforms demonstrates that the resulting policies track varied motions, handle new human sequences, and enable tasks such as video teleoperation and text-driven motion generation.

Core claim

X-Morph converts human motions into kinematically plausible robot references through cross-morphology retargeting, tracks those references with a privileged reinforcement learning policy, and distills the result into a causal student policy; the resulting policies track diverse retargeted motions, generalize to unseen human motions, and support downstream applications including video-based teleoperation, behavior-prior control, and text-conditioned motion generation on quadruped, hexapod, and quadruped-manipulator platforms.
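The teacher-to-student step in this claim can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a hypothetical one-dimensional tracking task, a hand-written "privileged" teacher that sees a simulator-only friction coefficient, and a linear student fitted by least squares on the teacher's actions (plain behavior cloning). The names `teacher_action`, `fit_student`, and `student_action` are invented for illustration.

```python
import random

def teacher_action(ref, q, mu, kp=2.0):
    """Hypothetical privileged teacher: it sees the true friction
    coefficient mu, which is only available in simulation."""
    return kp * (ref - q) + 0.5 * mu

def fit_student(samples):
    """Least-squares fit of a = w * err + b from (err, teacher_action) pairs.
    The student never sees mu; it only sees the tracking error."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    w = cov / var_x
    b = mean_y - w * mean_x
    return w, b

def student_action(ref, q, w, b):
    """Deployable student: uses onboard quantities only."""
    return w * (ref - q) + b

# Collect teacher demonstrations over randomized references and friction.
rng = random.Random(0)
samples = []
for _ in range(200):
    ref, q = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    mu = rng.uniform(0.4, 1.0)  # privileged, hidden from the student
    samples.append((ref - q, teacher_action(ref, q, mu)))

w, b = fit_student(samples)
print(f"student gain w={w:.3f}, bias b={b:.3f}")
```

The student recovers the teacher's gain and absorbs the average effect of the hidden friction term into its bias. A real causal student would typically condition on a history of onboard observations rather than a single error signal.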

What carries the argument

Cross-morphology retargeting stage that produces kinematically plausible, intent-preserving robot motion references from human data for subsequent tracking by privileged RL.

If this is right

Policies track diverse retargeted motions across three morphologically distinct platforms.
Policies generalize to human motions not seen during training.
The approach supports video-based teleoperation, behavior-prior control, and text-conditioned motion generation.
Large-scale human motion data can serve as a substrate for reusable behavior priors on non-humanoid robots.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Shared human priors could reduce the cost of collecting robot-specific demonstration data for each new morphology.
The retargeting-plus-distillation pattern might extend to wheeled or aerial platforms if suitable kinematic mappings are defined.
Combining the distilled policies with sim-to-real transfer methods could accelerate physical robot deployment without additional motion capture.

Load-bearing premise

The retargeted motions remain kinematically and dynamically feasible for the target robot to track under its own physics.

What would settle it

Policies trained via the pipeline fail to track the retargeted references or show no advantage over direct robot-specific training when evaluated on held-out human motion sequences.

Figures

Figures reproduced from arXiv: 2606.30290 by Arhaan Jain, Chengyang He, Guillaume Sartoretti, Ritwik Sharma, Shivam Sood, Shyam Charan Kesavamoorthi.

**Figure 1.** X-Morph framework. Source human/G1 motions are converted into target robot references by a cross-morphology retargeting model and then refined by a physics-aware corrector to reduce contact and ground-interaction artifacts. The resulting clean retargeted motions provide reference data for learning the reference-conditioned tracker and for distilling a causal retargeting model from the offline retargeting…

**Figure 2.** Video-driven teleoperation across non-humanoid morphologies. X-Morph converts monocular human motion into executable robot references for multiple target platforms. (a) Forward walking is transferred to both a quadruped and a hexapod. (b) Human body rotation induces turning behaviors on both robots. (c) A squat motion is retargeted to a quadruped while preserving the high-level lowering intent. (d) Large a…

**Figure 3.** Text-conditioned skill execution. A language command is converted into a human motion through a text-conditioned human-motion model or retrieval system. X-Morph retargets the resulting G1 motion to the target morphology and executes it with the same deployed tracker

**Figure 4.** Downstream door-opening case study. Left: X-Morph retargets a Kimodo-generated

**Figure 5.** Retargeting specifications for locomotion and loco-manipulation. Colored skeleton seg

**Figure 6.** Offline retargeting teacher architecture. Solid arrows denote motion and latent flow, gray

**Figure 7.** Qualitative generalization to unseen manipulation motions. X-Morph transfers source

Original abstract

Recent progress in humanoid behavior models has been driven in large part by abundant human motion data, but comparable motion data is scarce for non-humanoid legged robots such as quadrupeds, hexapods, and quadruped manipulators. A promising alternative is to repurpose human motion across embodiments; however, direct retargeting often produces motions that are visually plausible yet physically inconsistent or difficult to track under robot dynamics. We present X-Morph, a human-motion-to-robot-behavior pipeline that converts human motion into deployable locomotion and loco-manipulation policies for diverse non-humanoid legged morphologies. A cross-morphology retargeting stage converts human motions into kinematically plausible, intent-preserving robot references, which are then tracked by a privileged RL policy and distilled into a causal student policy. We evaluate X-Morph on three morphologically distinct platforms: a quadruped, a hexapod, and a quadruped equipped with a manipulator. The resulting policies track diverse retargeted motions, generalize to unseen human motions, and support downstream use cases including video-based teleoperation, behavior-prior control, and text-conditioned motion generation. These results suggest that large-scale human motion can serve as a substrate for learning broad, reusable behavior priors beyond humanoid robots. Project page: https://maker-rat.github.io/morph/

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

X-Morph sketches a retarget-then-RL pipeline for moving human motion data onto quadrupeds, hexapods, and quadruped manipulators, but the abstract supplies no evidence that the retargeted references are dynamically feasible.

X-Morph tries to turn abundant human motion data into locomotion and loco-manipulation policies for non-humanoid legged robots. The abstract describes a three-stage pipeline: cross-morphology retargeting to create robot references, privileged RL to track them, and distillation to a deployable policy. It reports evaluation on a quadruped, a hexapod, and a quadruped with an arm, plus downstream uses like video teleoperation.

What is actually new is the explicit multi-morphology scope. Prior retargeting work has mostly stayed with humanoids or single quadruped types; applying the same steps to hexapods and manipulator platforms is a concrete extension. The paper also does a clean job naming the data bottleneck for legged robots and listing practical follow-on tasks.

The soft spot is the one the stress-test note flags. Kinematically plausible retargeted trajectories do not automatically satisfy torque limits, friction cones, or contact dynamics on the target robot. If a non-trivial fraction of the references are infeasible, the tracking stage cannot deliver the claimed generalization. The abstract gives no feasibility filter, no failure-mode statistics, and no quantitative tracking results, so the central assumption stays untested.

The work is aimed at researchers who already work on motion priors or legged RL and want a practical route to reuse human datasets. A reader in that group can extract the high-level recipe and check the project page, but the current version is too thin for serious technical engagement.

I would send the full manuscript with methods, ablations, and numbers to peer review. The problem is real and the multi-platform framing is useful; the paper needs the evidence to match the claim.

Referee Report

1 major / 0 minor

Summary. The paper presents X-Morph, a pipeline that repurposes abundant human motion data for non-humanoid legged robots via a cross-morphology retargeting stage that produces kinematically plausible, intent-preserving robot references. These references are tracked by a privileged RL policy and distilled into a causal student policy. The method is evaluated on three platforms (quadruped, hexapod, quadruped-with-manipulator), with claims that the resulting policies track diverse retargeted motions, generalize to unseen human motions, and enable downstream tasks including video-based teleoperation, behavior-prior control, and text-conditioned motion generation.

Significance. If the core pipeline is validated with quantitative evidence of dynamic feasibility and generalization, the work would be significant for enabling scalable behavior learning across morphologies where robot-specific motion data is scarce. The explicit evaluation across three distinct platforms and the demonstration of multiple downstream use cases provide concrete evidence of breadth that strengthens the contribution relative to single-morphology retargeting approaches.

major comments (1)

[Abstract and pipeline description] The central claim that retargeted references are tracked by the privileged RL policy and yield generalizable student policies rests on the assumption that the cross-morphology retargeting stage produces dynamically feasible trajectories. Kinematic plausibility and intent preservation do not imply satisfaction of the target robot's equations of motion, friction cones, or torque limits; no explicit feasibility check, failure-mode analysis, or fraction of invalid references is reported.
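To make the requested feasibility check concrete, here is a minimal sketch of a per-frame test against torque limits and a Coulomb friction cone. It is an illustration under stated assumptions, not anything reported by the paper: it assumes joint torques and contact forces for each reference frame have already been obtained (for example from an inverse-dynamics pass), that contact forces are expressed with the z-axis along the contact normal, and that the limits and friction coefficient shown are placeholder values. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Force = Tuple[float, float, float]          # (fx, fy, fz), z = contact normal
Frame = Tuple[Sequence[float], List[Force]]  # (joint torques, contact forces)

def torques_within_limits(torques: Sequence[float], limits: Sequence[float]) -> bool:
    """True if every joint torque magnitude is within its actuator limit."""
    return all(abs(t) <= lim for t, lim in zip(torques, limits))

def inside_friction_cone(force: Force, mu: float) -> bool:
    """Coulomb cone: non-negative normal force and tangential force <= mu * normal."""
    fx, fy, fz = force
    return fz >= 0.0 and math.hypot(fx, fy) <= mu * fz

def feasible_fraction(frames: Sequence[Frame], torque_limits: Sequence[float], mu: float) -> float:
    """Fraction of frames that satisfy both the torque and friction-cone checks."""
    if not frames:
        return 0.0
    ok = sum(
        1
        for torques, contacts in frames
        if torques_within_limits(torques, torque_limits)
        and all(inside_friction_cone(f, mu) for f in contacts)
    )
    return ok / len(frames)

# Placeholder data: four frames of a two-joint system.
frames = [
    ([10.0, -5.0], [(1.0, 0.0, 20.0)]),   # feasible
    ([30.0, 0.0], [(1.0, 0.0, 20.0)]),    # torque limit exceeded
    ([5.0, 5.0], [(15.0, 0.0, 20.0)]),    # tangential force outside the cone
    ([5.0, 5.0], []),                     # flight phase, no contacts to check
]
fraction = feasible_fraction(frames, torque_limits=[25.0, 25.0], mu=0.6)
print(fraction)  # 0.5
```

Reporting this fraction per motion clip, alongside which check failed, would give the failure-mode breakdown the comment asks for.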

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for this constructive comment on dynamic feasibility. We address it directly below.

Referee: [Abstract and pipeline description] The central claim that retargeted references are tracked by the privileged RL policy and yield generalizable student policies rests on the assumption that the cross-morphology retargeting stage produces dynamically feasible trajectories. Kinematic plausibility and intent preservation do not imply satisfaction of the target robot's equations of motion, friction cones, or torque limits; no explicit feasibility check, failure-mode analysis, or fraction of invalid references is reported.

Authors: We agree that kinematic plausibility does not guarantee dynamic feasibility under the robot's equations of motion, friction cones, or torque limits. In our pipeline the privileged RL policy is trained to track the retargeted references using the full robot dynamics (including contact forces and actuator limits), and successful low-error tracking on the training set provides implicit evidence that the references are feasible for the motions retained. However, we did not report an explicit success rate (fraction of retargeted references trackable within a defined error threshold) or a failure-mode analysis of the retargeting stage. We will add this quantitative analysis and any failure examples to the revised manuscript.

Revision: yes
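The success rate mentioned in this response could be computed along the following lines. This is a sketch only: the clip names, error values, error metric (mean per-frame tracking error), and the threshold of 0.10 are all made-up placeholders, and `tracking_success_rate` is a hypothetical helper rather than anything from the paper.

```python
from typing import Dict, List, Tuple

def mean_error(errors: List[float]) -> float:
    """Average per-frame tracking error for one clip."""
    return sum(errors) / len(errors)

def tracking_success_rate(
    per_clip_errors: Dict[str, List[float]], threshold: float
) -> Tuple[float, List[str]]:
    """Return (fraction of clips tracked within threshold, names of failed clips).
    A clip with no recorded frames is counted as a failure."""
    if not per_clip_errors:
        raise ValueError("no clips to evaluate")
    failed = sorted(
        name
        for name, errs in per_clip_errors.items()
        if not errs or mean_error(errs) > threshold
    )
    rate = 1.0 - len(failed) / len(per_clip_errors)
    return rate, failed

# Placeholder per-frame errors for four retargeted references.
clips = {
    "walk_forward": [0.02, 0.03, 0.04],
    "turn_left": [0.05, 0.06, 0.04],
    "squat": [0.20, 0.25, 0.30],
    "wave_arm": [0.08, 0.09, 0.07],
}
rate, failed = tracking_success_rate(clips, threshold=0.10)
print(rate, failed)  # 0.75 ['squat']
```

Listing the failed clips next to the rate makes it easy to inspect which retargeted motions the tracker could not follow.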

Circularity Check

0 steps flagged

No significant circularity; standard retarget-RL-distill pipeline with independent empirical claims.

The described pipeline converts human motion via cross-morphology retargeting into references that are then tracked by a privileged RL policy and distilled to a student policy. No equations, fitted parameters, or central claims reduce by construction to inputs defined within the paper or via load-bearing self-citations. The abstract and provided text present the method as a sequence of distinct stages whose success is evaluated empirically on multiple platforms, without renaming known results or smuggling ansatzes through prior author work. This is the common case of a self-contained engineering pipeline.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no equations, no explicit parameters, and no stated assumptions beyond the high-level pipeline description; therefore no free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.1-grok · 5787 in / 1192 out tokens · 27387 ms · 2026-06-30T05:17:17.689064+00:00 · methodology

Reference graph

Works this paper leans on

32 extracted references · 19 canonical work pages · 4 internal anchors

[1] X. B. Peng, P. Abbeel, S. Levine, and M. van de Panne. Deepmimic: example-guided deep reinforcement learning of physics-based character skills. ACM Transactions on Graphics, 37(4):1–14, 2018. ISSN 1557-7368. doi:10.1145/3197517.3201311. URL http://dx.doi.org/10.1145/3197517.3201311

[2] K. Sreenath, C. K. Liu, T. E. Truong, Q. Liao, X. Huang, and G. Tevet. Beyondmimic: From motion tracking to versatile humanoid control via guided diffusion, 2025. URL https://arxiv.org/abs/2508.08241

[3] H. Weng, Y. Li, N. Sobanbabu, Z. Wang, Z. Luo, T. He, D. Ramanan, and G. Shi. Hdmi: Learning interactive humanoid whole-body control from human videos. arXiv preprint arXiv:2509.16757, 2025

[4] Z. Luo, Y. Yuan, T. Wang, C. Li, F. Castañeda, S. Chen, Z.-A. Cao, J. Li, D. Minor, Q. Ben, J. Park, D. Sami, Z. Wang, X. Da, R. Ding, C. Hogg, L. Song, E. Lim, E. Jeong, T. He, H. Xue, W. Xiao, S. Yuen, J. Kautz, Y. Chang, U. Iqbal, L. J. Fan, and Y. Zhu. Sonic: Supersizing motion tracking for natural humanoid whole-body control, 2026. URL https://arx...

[5] D. Rempe, M. Petrovich, Y. Yuan, H. Zhang, X. B. Peng, Y. Jiang, T. Wang, U. Iqbal, D. Minor, M. de Ruyter, J. Li, C. Tessler, E. Lim, E. Jeong, S. Wu, E. Hassani, M. Huang, J.-B. Yu, C. Chung, L. Song, O. Dionne, J. Kautz, S. Yuen, and S. Fidler. Kimodo: Scaling controllable human motion generation, 2026. URL https://arxiv.org/abs/2603.15546

[6] T. Li, H. Jung, M. Gombolay, Y. Cho, and S. Ha. Crossloco: Human motion driven control of legged robots via guided unsupervised reinforcement learning. In International Conference on Learning Representations, volume 2024, pages 46892–46905, 2024

[7] W. Kim, T. Li, and S. Ha. Moreflow: Motion retargeting learning through unsupervised flow matching, 2025. URL https://arxiv.org/abs/2509.25600

[8] L. Zhang, T. Komura, Z. Dou, J. Wang, L.-H. Chen, X. Chen, Y. Zhang, and Z. Yin. Motion2motion: Cross-topology motion transfer with sparse correspondence, 2025. URL https://arxiv.org/abs/2508.13139

[9] A. H. Bermano, D. Cohen-Or, G. Tevet, S. Raab, I. Gat, and Y. Reshef. Anytop: Character animation diffusion with any topology, 2025. URL https://arxiv.org/abs/2502.17327

[10] X. B. Peng, Z. Ma, P. Abbeel, S. Levine, and A. Kanazawa. Amp. ACM Transactions on Graphics (TOG), 40:1–20, 2021. URL https://api.semanticscholar.org/CorpusID:233033739

[11] X. B. Peng, Y. Guo, L. Halper, S. Levine, and S. Fidler. Ase: Large-scale reusable adversarial skill embeddings for physically simulated characters. ACM Transactions On Graphics (TOG), 41(4):1–17, 2022

[12] Z. Luo, J. Cao, J. Merel, A. Winkler, J. Huang, K. Kitani, and W. Xu. Universal humanoid motion representations for physics-based control. In International Conference on Learning Representations, volume 2024, pages 56766–56782, 2024

[13] Z. Luo, J. Cao, K. Kitani, W. Xu, et al. Perpetual humanoid control for real-time simulated avatars. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10895–10904, 2023

[14] Z. Chen, M. Ji, X. Cheng, X. Peng, X. B. Peng, and X. Wang. Gmt: General motion tracking for humanoid whole-body control. arXiv preprint arXiv:2506.14770, 2025

[15] Y. Mu, Z. Zhang, Y. Shi, M. Matsumoto, K. Imamura, G. Tevet, C. Guo, M. Taylor, C. Shu, P. Xi, et al. Smp: Reusable score-matching motion priors for physics-based character control. arXiv preprint arXiv:2512.03028, 2025

[16] X. B. Peng, E. Coumans, T. Zhang, T.-W. Lee, J. Tan, and S. Levine. Learning agile robotic locomotion skills by imitating animals. arXiv preprint arXiv:2004.00784, 2020

[17] K. Aberman, P. Li, D. Lischinski, O. Sorkine-Hornung, D. Cohen-Or, and B. Chen. Skeleton-aware networks for deep motion retargeting. ACM Transactions on Graphics, 39(4), Aug

[18] ISSN 1557-7368. doi:10.1145/3386569.3392462. URL http://dx.doi.org/10.1145/3386569.3392462

[19] Q. Zhao, P. Li, W. Yifan, O. Sorkine-Hornung, and G. Wetzstein. Pose-to-motion: Cross-domain motion retargeting with pose prior, 2023. URL https://arxiv.org/abs/2310.20249

[20] L.-H. Chen, Y. Zhang, Z. Yin, Z. Dou, X. Chen, J. Wang, T. Komura, and L. Zhang. Motion2motion: Cross-topology motion transfer with sparse correspondence. In Proceedings of the SIGGRAPH Asia 2025 Conference Papers, pages 1–11, 2025

[21] S. Liu, M. Wang, B. Dai, and C. Lu. Palum: Part-based attention learning for unified motion retargeting, 2026. URL https://arxiv.org/abs/2601.07272

[22] L. Hu, Z. Zhang, C. Zhong, B. Jiang, and S. Xia. Pose-aware attention network for flexible motion retargeting by body part. IEEE Transactions on Visualization and Computer Graphics, 30(8):4792–4808, Aug. 2024. ISSN 2160-9306. doi:10.1109/tvcg.2023.3277918. URL http://dx.doi.org/10.1109/TVCG.2023.3277918

[23] S. Kim, M. Sorokin, J. Lee, and S. Ha. Humanconquad: human motion control of quadrupedal robots using deep reinforcement learning. In SIGGRAPH Asia 2022 Emerging Technologies, pages 1–2. 2022

[24] T. Li, J. Won, A. Clegg, J. Kim, A. Rai, and S. Ha. Ace: Adversarial correspondence embedding for cross morphology motion retargeting from human to nonhuman characters. In SIGGRAPH Asia 2023 Conference Papers, pages 1–11, 2023

[25] T. Yoon, D. Kang, S. Kim, J. Cheng, M. Ahn, S. Coros, and S. Choi. Spatio-temporal motion retargeting for quadruped robots. IEEE Transactions on Robotics, 2025

[26] T. Yang, S. He, H. Jing, J. Yang, Z. Liu, C. Zou, and Y. Wang. Fast sam 3d body: Accelerating sam 3d body for real-time full-body human mesh recovery, 2026. URL https://arxiv.org/abs/2603.15603

[27] J. P. Araujo, Y. Ze, P. Xu, J. Wu, and C. K. Liu. Retargeting matters: General motion retargeting for humanoid motion tracking. arXiv preprint arXiv:2510.02252, 2025

[28] N. Mahmood, N. Ghorbani, N. F. Troje, G. Pons-Moll, and M. J. Black. AMASS: Archive of motion capture as surface shapes. In International Conference on Computer Vision, pages 5442–5451, Oct. 2019

[29] F. G. Harvey, M. Yurick, D. Nowrouzezahrai, and C. Pal. Robust motion in-betweening. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH), 39(4), 2020

[30] S. Sood, L. Nakhwa, S. Ge, Y. Cao, J. Cheng, F. Zargarbashi, T. Yoon, S. Choi, S. Coros, and G. Sartoretti. Apex: Action priors enable efficient exploration for robust motion tracking on legged robots, 2025. URL https://arxiv.org/abs/2505.10022

[31] M. Loper, N. Mahmood, J. Romero, G. Pons-Moll, and M. J. Black. Smpl: a skinned multi-person linear model. ACM Trans. Graph., 34(6), Nov. 2015. ISSN 0730-0301. doi:10.1145/2816795.2818013. URL https://doi.org/10.1145/2816795.2818013

[32] S. Sood, G. Sun, P. Li, and G. Sartoretti. Decap: Decaying action priors for accelerated imitation learning of torque-based legged locomotion policies. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 2809–2815, 2024. doi:10.1109/IROS58592.2024.10802000
