Rethinking diffusion for text-driven human motion generation

2024 · arXiv 2411.16575

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · baseline: 1

citation-polarity summary

background: 1 · baseline: 1

representative citing papers

ScaleMoGen: Autoregressive Next-Scale Prediction for Human Motion Generation

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

ScaleMoGen introduces a scale-wise autoregressive framework that quantizes motions into hierarchical discrete tokens and predicts next-scale maps to achieve SOTA FID 0.030 on HumanML3D and text-guided editing.

SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control

cs.RO · 2025-11-11 · unverdicted · novelty 6.0

Scaling motion tracking models along size, data volume, and compute produces a foundation model for natural, robust humanoid whole-body control with downstream uses in kinematic planning and vision-language-action models.

MARRS: Masked Autoregressive Unit-based Reaction Synthesis

cs.CV · 2025-05-16 · unverdicted · novelty 6.0

MARRS synthesizes fine-grained reaction motions via unit-distinguished VAE, masked action-conditioned fusion, mutual unit modulation, and compact MLP diffusion predictors.

Next-Scale Autoregressive Models for Text-to-Motion Generation

cs.CV · 2026-04-04

citing papers explorer

ScaleMoGen: Autoregressive Next-Scale Prediction for Human Motion Generation cs.CV · 2026-05-12 · unverdicted · none · ref 28
SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control cs.RO · 2025-11-11 · unverdicted · none · ref 36
MARRS: Masked Autoregressive Unit-based Reaction Synthesis cs.CV · 2025-05-16 · unverdicted · none · ref 39
Next-Scale Autoregressive Models for Text-to-Motion Generation cs.CV · 2026-04-04 · unreviewed · ref 31
