ScaleMoGen introduces a scale-wise autoregressive framework that quantizes motions into hierarchical discrete tokens and predicts next-scale maps to achieve SOTA FID 0.030 on HumanML3D and text-guided editing.
T2m-gpt: Generating human motion from textual descriptions with discrete representations
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 4representative citing papers
ExpertEdit edits novice motions to expert skill levels by learning a motion prior from unpaired videos and infilling masked skill-critical spans.
SDFlow learns a global transport map via similarity-driven flow matching in VQ latent space, using low-rank manifold decomposition and a categorical posterior to handle discreteness, yielding SOTA long-horizon performance and inference speedups.
MSDformer introduces a multi-scale discrete transformer that tokenizes time series at multiple scales and models them autoregressively in discrete space, claiming superior performance over prior DTM methods with rate-distortion theoretical support.
citing papers explorer
-
ScaleMoGen: Autoregressive Next-Scale Prediction for Human Motion Generation
ScaleMoGen introduces a scale-wise autoregressive framework that quantizes motions into hierarchical discrete tokens and predicts next-scale maps to achieve SOTA FID 0.030 on HumanML3D and text-guided editing.
-
ExpertEdit: Learning Skill-Aware Motion Editing from Expert Videos
ExpertEdit edits novice motions to expert skill levels by learning a motion prior from unpaired videos and infilling masked skill-critical spans.
-
SDFlow: Similarity-Driven Flow Matching for Time Series Generation
SDFlow learns a global transport map via similarity-driven flow matching in VQ latent space, using low-rank manifold decomposition and a categorical posterior to handle discreteness, yielding SOTA long-horizon performance and inference speedups.
-
MSDformer: Multi-scale Discrete Transformer For Time Series Generation
MSDformer introduces a multi-scale discrete transformer that tokenizes time series at multiple scales and models them autoregressively in discrete space, claiming superior performance over prior DTM methods with rate-distortion theoretical support.