RoFormer: Enhanced Transformer with Rotary Position Embedding. Neurocomputing, 568:127063,
2 Pith papers cite this work.
Fields: cs.CV (2)
Years: 2026 (2)
Citing papers
- A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens
  Delta tokens compress VFM feature differences into single tokens, enabling a lightweight generative world model that predicts diverse futures with far lower compute than existing approaches.
- Action Motifs: Self-Supervised Hierarchical Representation of Human Body Movements
  A4Mer learns compositional Action Atoms and Motifs from unlabeled 3D pose data via masked token prediction in a nested Transformer, enabling better human behavior modeling.