Chai: Cache attention inference for text2video

Joel Mathew Cherian, Ashutosh Muralidhara Bharadwaj, Vima Gupta, Anand Padmanabha Iyer · 2026 · arXiv 2602.16132

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Not All Tokens Need 40 Steps: Heterogeneous Step Allocation in Diffusion Transformers for Efficient Video Generation

cs.CV · 2026-05-07 · unverdicted · novelty 6.0

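Heterogeneous Step Allocation (HSA) assigns a variable number of denoising steps to spatiotemporal tokens in diffusion transformers (DiTs) based on their velocity dynamics, using KV-cache synchronization and cached Euler updates. It outperforms prior caching methods on quality-runtime tradeoffs for text-to-video (T2V) and image-to-video (I2V) generation.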

citing papers explorer

Not All Tokens Need 40 Steps: Heterogeneous Step Allocation in Diffusion Transformers for Efficient Video Generation · cs.CV · 2026-05-07 · unverdicted · none · ref 33
