hub Canonical reference

Autoregressive adversarial post- training for real-time interactive video generation

Shanchuan Lin, Ceyuan Yang, Hao He, Jianwen Jiang, Yuxi Ren, Xin Xia, Yang Zhao, Xuefeng Xiao, Lu Jiang · 2025 · arXiv 2506.09350

Canonical reference. 100% of citing Pith papers cite this work as background.

13 Pith papers citing it

Background 100% of classified citations

read on arXiv browse 13 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 5

citation-polarity summary

background 5

representative citing papers

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

LongLive-2.0 delivers an NVFP4 parallel infrastructure that enables direct training of long multi-shot autoregressive diffusion video models and achieves up to 2.15x training and 1.84x inference speedups on Blackwell and other GPUs.

Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation

cs.CV · 2026-05-05 · unverdicted · novelty 7.0

Stream-R1 improves distillation of autoregressive streaming video diffusion models by adaptively weighting supervision with a reward model at both rollout and per-pixel levels.

Continuous Adversarial Flow Models

cs.LG · 2026-04-13 · unverdicted · novelty 6.0

Continuous adversarial flow models replace MSE in flow matching with adversarial training via a discriminator, improving guidance-free FID on ImageNet from 8.26 to 3.63 for SiT and similar gains for JiT and text-to-image benchmarks.

Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation

cs.CV · 2026-04-03 · unverdicted · novelty 6.0

Salt improves low-step video generation quality by adding endpoint-consistent regularization to distribution matching distillation and using cache-conditioned feature alignment for autoregressive models.

Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion

cs.CV · 2026-02-08 · unverdicted · novelty 6.0

Rolling Sink is a training-free cache adjustment technique that maintains visual consistency in autoregressive video diffusion models for ultra-long open-ended generation beyond training horizons.

WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

cs.CV · 2025-12-16 · unverdicted · novelty 6.0

WorldPlay uses dual action representation, reconstituted context memory, and context forcing distillation to produce consistent 720p streaming video at 24 FPS for interactive world modeling.

Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

cs.CV · 2025-12-04 · conditional · novelty 6.0

Reward Forcing combines EMA-Sink tokens and Rewarded Distribution Matching Distillation to deliver state-of-the-art streaming video generation at 23.1 FPS without copying initial frames.

Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

cs.CV · 2025-10-02 · conditional · novelty 6.0

Self-Forcing++ scales autoregressive video diffusion to over 4 minutes by using self-generated segments for guidance, reducing error accumulation and outperforming baselines in fidelity and consistency.

Rolling Forcing: Autoregressive Long Video Diffusion in Real Time

cs.CV · 2025-09-29 · unverdicted · novelty 6.0

Rolling Forcing generates multi-minute videos in real time by jointly denoising frames at increasing noise levels, anchoring attention to early frames, and using windowed distillation to limit error accumulation.

TurboTalk: Progressive Distillation for One-Step Audio-Driven Talking Avatar Generation

cs.CV · 2026-04-16 · unverdicted · novelty 5.0

TurboTalk uses progressive distillation from 4 steps to 1 step with distribution matching and adversarial training to achieve 120x faster single-step audio-driven talking avatar video generation.

Matrix-game 2.0: An open-source real-time and streaming interactive world model

cs.CV · 2025-08-18 · unverdicted · novelty 5.0

Matrix-Game 2.0 introduces a scalable data pipeline, action-injection module, and few-step distillation to enable real-time streaming video generation at 25 FPS from game-engine interactions, with open-sourced weights and code.

Advancing Open-source World Models

cs.CV · 2026-01-28 · unverdicted · novelty 4.0

LingBot-World is presented as an open-source world model that delivers high-fidelity simulation, minute-level contextual consistency, and real-time interactivity under one second latency.

Evolution of Video Generative Foundations

cs.CV · 2026-04-07 · unverdicted · novelty 2.0

This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.

citing papers explorer

Showing 13 of 13 citing papers.

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation cs.CV · 2026-05-18 · unverdicted · none · ref 40
LongLive-2.0 delivers an NVFP4 parallel infrastructure that enables direct training of long multi-shot autoregressive diffusion video models and achieves up to 2.15x training and 1.84x inference speedups on Blackwell and other GPUs.
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation cs.CV · 2026-05-05 · unverdicted · none · ref 20
Stream-R1 improves distillation of autoregressive streaming video diffusion models by adaptively weighting supervision with a reward model at both rollout and per-pixel levels.
Continuous Adversarial Flow Models cs.LG · 2026-04-13 · unverdicted · none · ref 42
Continuous adversarial flow models replace MSE in flow matching with adversarial training via a discriminator, improving guidance-free FID on ImageNet from 8.26 to 3.63 for SiT and similar gains for JiT and text-to-image benchmarks.
Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation cs.CV · 2026-04-03 · unverdicted · none · ref 21
Salt improves low-step video generation quality by adding endpoint-consistent regularization to distribution matching distillation and using cache-conditioned feature alignment for autoregressive models.
Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion cs.CV · 2026-02-08 · unverdicted · none · ref 60
Rolling Sink is a training-free cache adjustment technique that maintains visual consistency in autoregressive video diffusion models for ultra-long open-ended generation beyond training horizons.
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling cs.CV · 2025-12-16 · unverdicted · none · ref 39
WorldPlay uses dual action representation, reconstituted context memory, and context forcing distillation to produce consistent 720p streaming video at 24 FPS for interactive world modeling.
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation cs.CV · 2025-12-04 · conditional · none · ref 41
Reward Forcing combines EMA-Sink tokens and Rewarded Distribution Matching Distillation to deliver state-of-the-art streaming video generation at 23.1 FPS without copying initial frames.
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation cs.CV · 2025-10-02 · conditional · none · ref 35
Self-Forcing++ scales autoregressive video diffusion to over 4 minutes by using self-generated segments for guidance, reducing error accumulation and outperforming baselines in fidelity and consistency.
Rolling Forcing: Autoregressive Long Video Diffusion in Real Time cs.CV · 2025-09-29 · unverdicted · none · ref 79
Rolling Forcing generates multi-minute videos in real time by jointly denoising frames at increasing noise levels, anchoring attention to early frames, and using windowed distillation to limit error accumulation.
TurboTalk: Progressive Distillation for One-Step Audio-Driven Talking Avatar Generation cs.CV · 2026-04-16 · unverdicted · none · ref 14
TurboTalk uses progressive distillation from 4 steps to 1 step with distribution matching and adversarial training to achieve 120x faster single-step audio-driven talking avatar video generation.
Matrix-game 2.0: An open-source real-time and streaming interactive world model cs.CV · 2025-08-18 · unverdicted · none · ref 26
Matrix-Game 2.0 introduces a scalable data pipeline, action-injection module, and few-step distillation to enable real-time streaming video generation at 25 FPS from game-engine interactions, with open-sourced weights and code.
Advancing Open-source World Models cs.CV · 2026-01-28 · unverdicted · none · ref 40
LingBot-World is presented as an open-source world model that delivers high-fidelity simulation, minute-level contextual consistency, and real-time interactivity under one second latency.
Evolution of Video Generative Foundations cs.CV · 2026-04-07 · unverdicted · none · ref 96
This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.

Autoregressive adversarial post- training for real-time interactive video generation

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer