
Stable video infinity: Infinite-length video generation with error recycling. arXiv preprint arXiv:2510.09212

Li, W · 2025 · arXiv 2510.09212

16 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background: 2 · method: 2

citation-polarity summary

background: 2 · use method: 2

representative citing papers

Towards Memory-Efficient Autoregressive Video Generation via Instance-Specific Parametric Absorption

cs.CV · 2026-07-01 · unverdicted · novelty 7.0

ISPA reduces KV cache size by up to 50% in AR video models by transitioning layers to local attention and applying instance-specific least-squares weight modulation to compensate for lost history.

MemLearner: Learning to Query Context memory for Video World Models

cs.CV · 2026-06-30 · unverdicted · novelty 7.0

MemLearner introduces a learning-based adaptive context query method using query tokens in video world models to improve long-term scene consistency over rule-based retrieval.

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

LongLive-2.0 delivers an NVFP4 parallel infrastructure that enables direct training of long multi-shot autoregressive diffusion video models and achieves up to 2.15x training and 1.84x inference speedups on Blackwell and other GPUs.

Social-Mamba: Socially-Aware Trajectory Forecasting with State-Space Models

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

Social-Mamba introduces a Cycle Mamba block and social triplet factorization to achieve state-of-the-art trajectory forecasting accuracy with linear-time social interaction modeling on five benchmarks.

Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation

cs.CV · 2026-05-05 · unverdicted · novelty 7.0

Stream-R1 improves distillation of autoregressive streaming video diffusion models by adaptively weighting supervision with a reward model at both rollout and per-pixel levels.

Efficient Video Diffusion Models: Advancements and Challenges

cs.CV · 2026-04-17 · unverdicted · novelty 7.0

A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.

MetaWorld: Scaling Multi-Agent Video World Model from Single-view Video Data

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

MetaWorld scales multi-agent video world models from single-view videos using monocular decomposition into ego-motion and trajectories, subject-aware generation, and cross-attention alignment for consistency.

Robust Dreamer: Deviation-Aware Latent Gaussian Memory for Action-Controlled AR Video Generation

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

Robust Dreamer uses Latent Gaussian Memory anchored to diffusion latents and Deviation Learning with a Dynamic Deviation Archive to reduce drift in long-horizon action-controlled image-to-video generation, reporting SOTA results on ScanNet, DL3DV, and OmniWorldGame.

StreamEdit: Training-Free Video Editing via Few-Step Streaming Video Generation

cs.CV · 2026-05-20 · unverdicted · novelty 6.0 · 2 refs

StreamEdit enables high-quality training-free video editing by adapting streaming video generation models with dual-branch fast sampling, self-attention bridge, cross-attention grounding, source-oriented guidance, and visual prompting, outperforming prior methods in few-step regimes.

EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

EverAnimate restores drifted latent flow trajectories in chunked video generation via persistent latent propagation and restorative flow matching, achieving measurable gains in PSNR, SSIM, LPIPS, and FID over prior long-animation methods with only LoRA tuning.

HorizonDrive: Self-Corrective Autoregressive World Model for Long-horizon Driving Simulation

cs.CV · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

HorizonDrive is a new anti-drifting autoregressive training and distillation method that enables minute-scale stable driving video rollouts by making the teacher model rollout-capable via scheduled rollout recovery and teacher rollout DMD.

Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

cs.CV · 2025-12-04 · conditional · novelty 6.0

Reward Forcing combines EMA-Sink tokens and Rewarded Distribution Matching Distillation to deliver state-of-the-art streaming video generation at 23.1 FPS without copying initial frames.

Drift-Resistant Navigation World Model with Anchored Epipolar Guidance

cs.CV · 2026-05-23 · unverdicted · novelty 5.0

A generative navigation world model that uses sparse anchored rollout with epipolar constraints to reduce perceptual and geometric drift.

Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory

cs.CV · 2026-04-10 · unverdicted · novelty 4.0

Matrix-Game 3.0 delivers 720p real-time video generation at 40 FPS with minute-scale memory consistency by combining residual self-correction training, camera-aware memory injection, and DMD-based autoregressive distillation on a 5B model.

EchoTorrent: Towards Swift, Sustained, and Streaming Multi-Modal Video Generation

cs.CV · 2026-02-14 · unverdicted · novelty 4.0

EchoTorrent combines multi-teacher distillation, adaptive CFG calibration, hybrid long-tail forcing, and VAE decoder refinement to enable few-pass autoregressive streaming video generation with improved temporal consistency and audio-lip sync.

Evolution of Video Generative Foundations

cs.CV · 2026-04-07 · unverdicted · novelty 2.0

This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.

citing papers explorer


Towards Memory-Efficient Autoregressive Video Generation via Instance-Specific Parametric Absorption · cs.CV · 2026-07-01 · unverdicted · none · ref 19
ISPA reduces KV cache size by up to 50% in AR video models by transitioning layers to local attention and applying instance-specific least-squares weight modulation to compensate for lost history.
MemLearner: Learning to Query Context memory for Video World Models · cs.CV · 2026-06-30 · unverdicted · none · ref 38
MemLearner introduces a learning-based adaptive context query method using query tokens in video world models to improve long-term scene consistency over rule-based retrieval.
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation · cs.CV · 2026-05-18 · unverdicted · none · ref 38
LongLive-2.0 delivers an NVFP4 parallel infrastructure that enables direct training of long multi-shot autoregressive diffusion video models and achieves up to 2.15x training and 1.84x inference speedups on Blackwell and other GPUs.
Social-Mamba: Socially-Aware Trajectory Forecasting with State-Space Models · cs.CV · 2026-05-14 · unverdicted · none · ref 27
Social-Mamba introduces a Cycle Mamba block and social triplet factorization to achieve state-of-the-art trajectory forecasting accuracy with linear-time social interaction modeling on five benchmarks.
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation · cs.CV · 2026-05-05 · unverdicted · none · ref 19
Stream-R1 improves distillation of autoregressive streaming video diffusion models by adaptively weighting supervision with a reward model at both rollout and per-pixel levels.
Efficient Video Diffusion Models: Advancements and Challenges · cs.CV · 2026-04-17 · unverdicted · none · ref 71
A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.
MetaWorld: Scaling Multi-Agent Video World Model from Single-view Video Data · cs.CV · 2026-06-01 · unverdicted · none · ref 20
MetaWorld scales multi-agent video world models from single-view videos using monocular decomposition into ego-motion and trajectories, subject-aware generation, and cross-attention alignment for consistency.
Robust Dreamer: Deviation-Aware Latent Gaussian Memory for Action-Controlled AR Video Generation · cs.CV · 2026-05-29 · unverdicted · none · ref 35
Robust Dreamer uses Latent Gaussian Memory anchored to diffusion latents and Deviation Learning with a Dynamic Deviation Archive to reduce drift in long-horizon action-controlled image-to-video generation, reporting SOTA results on ScanNet, DL3DV, and OmniWorldGame.
StreamEdit: Training-Free Video Editing via Few-Step Streaming Video Generation · cs.CV · 2026-05-20 · unverdicted · none · ref 43 · 2 links
StreamEdit enables high-quality training-free video editing by adapting streaming video generation models with dual-branch fast sampling, self-attention bridge, cross-attention grounding, source-oriented guidance, and visual prompting, outperforming prior methods in few-step regimes.
EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration · cs.CV · 2026-05-14 · unverdicted · none · ref 16
EverAnimate restores drifted latent flow trajectories in chunked video generation via persistent latent propagation and restorative flow matching, achieving measurable gains in PSNR, SSIM, LPIPS, and FID over prior long-animation methods with only LoRA tuning.
HorizonDrive: Self-Corrective Autoregressive World Model for Long-horizon Driving Simulation · cs.CV · 2026-05-12 · unverdicted · none · ref 11 · 2 links
HorizonDrive is a new anti-drifting autoregressive training and distillation method that enables minute-scale stable driving video rollouts by making the teacher model rollout-capable via scheduled rollout recovery and teacher rollout DMD.
Drift-Resistant Navigation World Model with Anchored Epipolar Guidance · cs.CV · 2026-05-23 · unverdicted · none · ref 21
A generative navigation world model that uses sparse anchored rollout with epipolar constraints to reduce perceptual and geometric drift.
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory · cs.CV · 2026-04-10 · unverdicted · none · ref 24
Matrix-Game 3.0 delivers 720p real-time video generation at 40 FPS with minute-scale memory consistency by combining residual self-correction training, camera-aware memory injection, and DMD-based autoregressive distillation on a 5B model.
EchoTorrent: Towards Swift, Sustained, and Streaming Multi-Modal Video Generation · cs.CV · 2026-02-14 · unverdicted · none · ref 14
EchoTorrent combines multi-teacher distillation, adaptive CFG calibration, hybrid long-tail forcing, and VAE decoder refinement to enable few-pass autoregressive streaming video generation with improved temporal consistency and audio-lip sync.
Evolution of Video Generative Foundations · cs.CV · 2026-04-07 · unverdicted · none · ref 279
This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.
