Train short, inference long: Training-free horizon extension for autoregressive video generation

Li, J · 2026 · arXiv 2602.14027

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Towards Memory-Efficient Autoregressive Video Generation via Instance-Specific Parametric Absorption

cs.CV · 2026-07-01 · unverdicted · novelty 7.0

ISPA reduces KV cache size by up to 50% in AR video models by transitioning layers to local attention and applying instance-specific least-squares weight modulation to compensate for lost history.

Future Forcing: Future-aware Training-free KV Cache Policy for Autoregressive Video Generation

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

Future Forcing constructs a future query proxy from historical pre-RoPE statistics to score and merge KV tokens, improving subject consistency by up to 1.49 on VBench-Long for 60s AR video generation.

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

LongLive-2.0 delivers an NVFP4 parallel infrastructure that enables direct training of long multi-shot autoregressive diffusion video models and achieves up to 2.15x training and 1.84x inference speedups on Blackwell and other GPUs.

RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

RAVEN aligns training and inference for causal autoregressive video diffusion via interleaved rollout repacking and introduces CM-GRPO for direct RL on consistency-model kernels, claiming better quality than recent baselines.

citing papers explorer

Towards Memory-Efficient Autoregressive Video Generation via Instance-Specific Parametric Absorption · cs.CV · 2026-07-01 · unverdicted · none · ref 18
Future Forcing: Future-aware Training-free KV Cache Policy for Autoregressive Video Generation · cs.CV · 2026-05-28 · unverdicted · none · ref 20
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation · cs.CV · 2026-05-18 · unverdicted · none · ref 33
RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO · cs.CV · 2026-05-14 · unverdicted · none · ref 52
