pith. sign in

hub

arXiv preprint arXiv:2510.14648 , year=

13 Pith papers cite this work. Polarity classification is still indexing.

13 Pith papers citing it

hub tools

citation-role summary

background 2 baseline 2

citation-polarity summary

fields

cs.CV 13

years

2026 12 2025 1

verdicts

UNVERDICTED 13

clear filters

representative citing papers

VideoCoF: Unified Video Editing with Temporal Reasoner

cs.CV · 2025-12-08 · unverdicted · novelty 7.0

VideoCoF adds an explicit reasoning step using edit-region latents in video diffusion models to enable precise mask-free editing and motion alignment with only 50k training pairs.

SpongeBob: Sync-Aware Harmonious Audio-Visual Generative Editing

cs.CV · 2026-05-24 · unverdicted · novelty 6.0

SpongeBob introduces the first end-to-end audio-visual joint editing framework using sync-aware bidirectional attention and context-aware modules, plus a new dataset and benchmark, claiming 30% Sync-C and 12.5% Ctx-F1 gains over baselines.

Bernini: Latent Semantic Planning for Video Diffusion

cs.CV · 2026-05-21 · unverdicted · novelty 5.0

Bernini is a framework that uses an MLLM planner to output semantic representations for a DiT renderer to generate or edit videos, reporting SOTA benchmark performance.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • VideoCoF: Unified Video Editing with Temporal Reasoner cs.CV · 2025-12-08 · unverdicted · none · ref 19

    VideoCoF adds an explicit reasoning step using edit-region latents in video diffusion models to enable precise mask-free editing and motion alignment with only 50k training pairs.