Omniv2v: Versatile video generation and editing via dynamic content manipulation

Liang, S · 2025 · arXiv 2506.01801

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Goku: A Million-Scale Universal Dataset and Benchmark for Instruction-Based Video Editing

cs.CV · 2026-06-29 · unverdicted · novelty 7.0 · 2 refs

Goku provides a 2M-pair dataset for multi-task structural video editing, Goku-Edit model with MLLM and dual-branch design, and Goku-Bench yielding up to 8% gains in instruction following.

SpongeBob: Sync-Aware Harmonious Audio-Visual Generative Editing

cs.CV · 2026-05-24 · unverdicted · novelty 6.0

SpongeBob introduces the first end-to-end audio-visual joint editing framework using sync-aware bidirectional attention and context-aware modules, plus a new dataset and benchmark, claiming 30% Sync-C and 12.5% Ctx-F1 gains over baselines.

LIVEditor-14B: Lightning Unified Video Editing via In-Context Sparse Attention

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

LIVEditor-14B applies a new sparse attention method (ISA) that prunes context and uses query-sharpness routing to cut attention latency ~60% with no loss in editing quality on standard benchmarks.

InsEdit: Towards Instruction-based Visual Editing via Data-Efficient Video Diffusion Models Adaptation

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

InsEdit adapts a video diffusion backbone for text-instruction video editing via Mutual Context Attention, achieving SOTA open-source results with O(100K) data while also supporting image editing.

Image-to-Video Diffusion: From Foundations to Open Frontiers

cs.CV · 2026-05-17 · unverdicted · novelty 3.0

A survey that organizes diffusion image-to-video methods into a taxonomy, distills core designs in condition encoding, temporal modeling, noise prior, and upsampling, and discusses applications plus challenges.

Evolution of Video Generative Foundations

cs.CV · 2026-04-07 · unverdicted · novelty 2.0

This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.

citing papers explorer

Showing 6 of 6 citing papers after filters.

Goku: A Million-Scale Universal Dataset and Benchmark for Instruction-Based Video Editing cs.CV · 2026-06-29 · unverdicted · none · ref 21 · 2 links
Goku provides a 2M-pair dataset for multi-task structural video editing, Goku-Edit model with MLLM and dual-branch design, and Goku-Bench yielding up to 8% gains in instruction following.
SpongeBob: Sync-Aware Harmonious Audio-Visual Generative Editing cs.CV · 2026-05-24 · unverdicted · none · ref 8
SpongeBob introduces the first end-to-end audio-visual joint editing framework using sync-aware bidirectional attention and context-aware modules, plus a new dataset and benchmark, claiming 30% Sync-C and 12.5% Ctx-F1 gains over baselines.
LIVEditor-14B: Lightning Unified Video Editing via In-Context Sparse Attention cs.CV · 2026-05-06 · unverdicted · none · ref 25
LIVEditor-14B applies a new sparse attention method (ISA) that prunes context and uses query-sharpness routing to cut attention latency ~60% with no loss in editing quality on standard benchmarks.
InsEdit: Towards Instruction-based Visual Editing via Data-Efficient Video Diffusion Models Adaptation cs.CV · 2026-04-09 · unverdicted · none · ref 20
InsEdit adapts a video diffusion backbone for text-instruction video editing via Mutual Context Attention, achieving SOTA open-source results with O(100K) data while also supporting image editing.
Image-to-Video Diffusion: From Foundations to Open Frontiers cs.CV · 2026-05-17 · unverdicted · none · ref 146
A survey that organizes diffusion image-to-video methods into a taxonomy, distills core designs in condition encoding, temporal modeling, noise prior, and upsampling, and discusses applications plus challenges.
Evolution of Video Generative Foundations cs.CV · 2026-04-07 · unverdicted · none · ref 275
This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.

Omniv2v: Versatile video generation and editing via dynamic content manipulation

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer