Gen- L-Video: multi-text to long video generation via temporal co-denoising

Fu-Yun Wang, Wenshuo Chen, Guanglu Song, Han-Jia Ye, Yu Liu, Hongsheng Li · 2023 · arXiv 2305.18264

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

read on arXiv browse 9 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

DCR: Counterfactual Attractor Guidance for Rare Compositional Generation

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

DCR uses a counterfactual attractor and projection-based repulsion to suppress default completion bias in diffusion models, improving fidelity for rare compositional prompts while preserving quality.

FreeSpec: Training-Free Long Video Generation via Singular-Spectrum Reconstruction

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

FreeSpec uses SVD-based spectral reconstruction to fuse global low-rank and local high-rank features, reducing content drift and preserving temporal dynamics in long video generation.

DrawVideo: Generating Long Video from Storyboard Keyframe Sketches

cs.GR · 2026-05-22 · unverdicted · novelty 6.0

DrawVideo is a sketch-guided framework that decomposes long videos into controllable shots using keyframe sketches, appearance prompts, and motion prompts, supported by a new SketchLongVideo dataset.

Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

MIGA introduces two-stage alignment to close train-inference gaps and dual consistency enhancement via self-reflection and long-range guidance to achieve SOTA temporal consistency in infinite-frame video generation on VBench and NarrLV.

RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling

cs.CV · 2025-10-23 · unverdicted · novelty 6.0

RAPO++ is a three-stage prompt optimization framework combining retrieval-augmented refinement, closed-loop test-time scaling, and LLM fine-tuning to enhance text-to-video generation quality.

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

cs.CV · 2025-03-25 · unverdicted · novelty 6.0

FAR baseline plus asymmetric kernels for long short-term context modeling achieves SOTA short and long video generation in autoregressive setups.

Character-Centered Dialogue Generation from Scene-Level Prompts

cs.CV · 2025-05-22 · unverdicted · novelty 4.0

A training-free framework generates expressive, character-grounded dialogue and speech from scene prompts using vision-language encoders, LLMs, and a recursive narrative memory bank for cross-scene consistency.

Scene-Action Prompt Fusion for Coherent Text-to-Video Storytelling

cs.CV · 2025-03-08 · unverdicted · novelty 3.0

A prompt fusion approach combines bidirectional time-weighted latent blending, dynamics-informed prompt weighting via CLIP, and semantic action representations to produce temporally consistent long videos from text without retraining.

TIE: Time Interval Encoding for Video Generation over Events

cs.CV · 2026-05-11

citing papers explorer

Showing 9 of 9 citing papers.

DCR: Counterfactual Attractor Guidance for Rare Compositional Generation cs.CV · 2026-05-07 · unverdicted · none · ref 39
DCR uses a counterfactual attractor and projection-based repulsion to suppress default completion bias in diffusion models, improving fidelity for rare compositional prompts while preserving quality.
FreeSpec: Training-Free Long Video Generation via Singular-Spectrum Reconstruction cs.CV · 2026-05-07 · unverdicted · none · ref 13
FreeSpec uses SVD-based spectral reconstruction to fuse global low-rank and local high-rank features, reducing content drift and preserving temporal dynamics in long video generation.
DrawVideo: Generating Long Video from Storyboard Keyframe Sketches cs.GR · 2026-05-22 · unverdicted · none · ref 38
DrawVideo is a sketch-guided framework that decomposes long videos into controllable shots using keyframe sketches, appearance prompts, and motion prompts, supported by a new SketchLongVideo dataset.
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos cs.CV · 2026-05-18 · unverdicted · none · ref 25
MIGA introduces two-stage alignment to close train-inference gaps and dual consistency enhancement via self-reflection and long-range guidance to achieve SOTA temporal consistency in infinite-frame video generation on VBench and NarrLV.
RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling cs.CV · 2025-10-23 · unverdicted · none · ref 78
RAPO++ is a three-stage prompt optimization framework combining retrieval-augmented refinement, closed-loop test-time scaling, and LLM fine-tuning to enhance text-to-video generation quality.
Long-Context Autoregressive Video Modeling with Next-Frame Prediction cs.CV · 2025-03-25 · unverdicted · none · ref 15
FAR baseline plus asymmetric kernels for long short-term context modeling achieves SOTA short and long video generation in autoregressive setups.
Character-Centered Dialogue Generation from Scene-Level Prompts cs.CV · 2025-05-22 · unverdicted · none · ref 53
A training-free framework generates expressive, character-grounded dialogue and speech from scene prompts using vision-language encoders, LLMs, and a recursive narrative memory bank for cross-scene consistency.
Scene-Action Prompt Fusion for Coherent Text-to-Video Storytelling cs.CV · 2025-03-08 · unverdicted · none · ref 35
A prompt fusion approach combines bidirectional time-weighted latent blending, dynamics-informed prompt weighting via CLIP, and semantic action representations to produce temporally consistent long videos from text without retraining.
TIE: Time Interval Encoding for Video Generation over Events cs.CV · 2026-05-11 · unreviewed · ref 33

Gen- L-Video: multi-text to long video generation via temporal co-denoising

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer