Video generation based on full-sequence diffusion models currently achieves better overall quality than autoregressive next-token prediction
1 Pith paper cites this work. Polarity classification is still indexing.