Consisti2v: Enhancing visual consistency for image-to-video generation

· 2024 · arXiv 2402.04324

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 3 baseline 1

citation-polarity summary

background 3 baseline 1

representative citing papers

CaC: Advancing Video Reward Models via Hierarchical Spatiotemporal Concentrating

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

CaC is a hierarchical spatiotemporal concentrating reward model for video anomalies that reports 25.7% accuracy gains on fine-grained benchmarks and 11.7% anomaly reduction in generated videos via a new dataset and GRPO training with temporal/spatial IoU rewards.

Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation

cs.CV · 2026-04-28 · unverdicted · novelty 6.0

Mutual Forcing trains a single native autoregressive audio-video model with mutually reinforcing few-step and multi-step modes via self-distillation to match 50-step baselines at 4-8 steps.

Show-o2: Improved Native Unified Multimodal Models

cs.CV · 2025-06-18 · unverdicted · novelty 4.0

Show-o2 unifies text, image, and video understanding and generation in a single autoregressive-plus-flow-matching model built on 3D causal VAE representations.

Image-to-Video Diffusion: From Foundations to Open Frontiers

cs.CV · 2026-05-17 · unverdicted · novelty 3.0

A survey that organizes diffusion image-to-video methods into a taxonomy, distills core designs in condition encoding, temporal modeling, noise prior, and upsampling, and discusses applications plus challenges.

Cosmos World Foundation Model Platform for Physical AI

cs.CV · 2025-01-07 · unverdicted · novelty 3.0

The Cosmos platform supplies open-source pre-trained world models and supporting tools for building fine-tunable digital world simulations to train Physical AI.

citing papers explorer

Showing 5 of 5 citing papers.

CaC: Advancing Video Reward Models via Hierarchical Spatiotemporal Concentrating cs.CV · 2026-05-12 · unverdicted · none · ref 33
CaC is a hierarchical spatiotemporal concentrating reward model for video anomalies that reports 25.7% accuracy gains on fine-grained benchmarks and 11.7% anomaly reduction in generated videos via a new dataset and GRPO training with temporal/spatial IoU rewards.
Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation cs.CV · 2026-04-28 · unverdicted · none · ref 35
Mutual Forcing trains a single native autoregressive audio-video model with mutually reinforcing few-step and multi-step modes via self-distillation to match 50-step baselines at 4-8 steps.
Show-o2: Improved Native Unified Multimodal Models cs.CV · 2025-06-18 · unverdicted · none · ref 91
Show-o2 unifies text, image, and video understanding and generation in a single autoregressive-plus-flow-matching model built on 3D causal VAE representations.
Image-to-Video Diffusion: From Foundations to Open Frontiers cs.CV · 2026-05-17 · unverdicted · none · ref 61
A survey that organizes diffusion image-to-video methods into a taxonomy, distills core designs in condition encoding, temporal modeling, noise prior, and upsampling, and discusses applications plus challenges.
Cosmos World Foundation Model Platform for Physical AI cs.CV · 2025-01-07 · unverdicted · none · ref 165
The Cosmos platform supplies open-source pre-trained world models and supporting tools for building fine-tunable digital world simulations to train Physical AI.

Consisti2v: Enhancing visual consistency for image-to-video generation

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer