Self-Refining Video Sampling

Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Saining Xie, Jaehong Yoon, Sung Ju Hwang · 2026 · cs.CV · arXiv 2601.18577

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

Modern video generators still struggle with complex physical dynamics, often falling short of physical realism. Existing approaches address this using external verifiers or additional training on augmented data, which is computationally expensive and still limited in capturing fine-grained motion. In this work, we present self-refining video sampling, a simple method that uses a pre-trained video generator trained on large-scale datasets as its own self-refiner. By interpreting the generator as a denoising autoencoder, we enable iterative inner-loop refinement at inference time without any external verifier or additional training. We further introduce an uncertainty-aware refinement strategy that selectively refines regions based on self-consistency, which prevents artifacts caused by over-refinement. Experiments on state-of-the-art video generators demonstrate significant improvements in motion coherence and physics alignment, achieving over 70% human preference compared to the default sampler and guidance-based sampler.
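The abstract describes the refinement loop only at a high level, so the following is a toy sketch of one possible reading, not the authors' algorithm. Every name here (`toy_denoiser`, `self_refine`, `num_iters`, `tau`), the linear re-noising rule, and the absolute-difference uncertainty measure are assumptions made for illustration. The idea shown: at a fixed noise level, predict a clean sample, re-noise it, predict again, and update only the regions where the two predictions disagree.

```python
import numpy as np

def toy_denoiser(x_t, t):
    """Hypothetical stand-in for a pre-trained generator's clean-sample prediction."""
    return np.tanh(x_t)  # any fixed map works for this illustration

def self_refine(x_t, t, denoise, num_iters=3, tau=0.1, seed=0):
    """Toy predict-and-renoise inner loop with a self-consistency mask (assumed form)."""
    rng = np.random.default_rng(seed)
    x0_prev = denoise(x_t, t)  # initial clean-sample estimate
    for _ in range(num_iters):
        noise = rng.standard_normal(x_t.shape)
        # Assumed re-noising rule: linear interpolation back to noise level t.
        x_t_new = (1.0 - t) * x0_prev + t * noise
        x0_new = denoise(x_t_new, t)
        # Assumed uncertainty: disagreement between successive predictions.
        uncertainty = np.abs(x0_new - x0_prev)
        mask = uncertainty > tau
        # Refine only the uncertain regions; keep consistent regions as they are.
        x_t = np.where(mask, x_t_new, x_t)
        x0_prev = denoise(x_t, t)
    return x_t

# Usage on a small random "video" latent of shape (frames, height, width).
latent = np.random.default_rng(1).standard_normal((2, 4, 4))
refined = self_refine(latent, t=0.5, denoise=toy_denoiser)
```

In this sketch, setting `tau` to infinity leaves the latent untouched, while a negative `tau` refines every region on every pass; the abstract's concern about over-refinement corresponds to choosing something in between.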

citation-role summary

background: 3 · method: 1

citation-polarity summary

background: 3 · use method: 1

representative citing papers

CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models

cs.CV · 2026-05-09 · unverdicted · novelty 7.0

CollabVR improves video reasoning performance by coupling vision-language models and video generation models in a closed-loop step-level collaboration that detects and repairs generation failures.

$h$-control: Training-Free Camera Control via Block-Conditional Gibbs Refinement

cs.CV · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

h-control augments hard-replacement guidance with block-conditional pseudo-Gibbs refinement on unobserved latent sites and adaptive 3D patch freezing to achieve superior FVD on RealEstate10K and DAVIS.

Human Cognition in Machines: A Unified Perspective of World Models

cs.RO · 2026-04-17 · unverdicted · novelty 6.0

The paper introduces a unified framework for world models that fully incorporates all cognitive functions from Cognitive Architecture Theory, highlights under-researched areas in motivation and meta-cognition, and proposes Epistemic World Models as a new category for scientific discovery agents.

On the Robustness of Distribution Support under Diffusion Guidance

cs.LG · 2026-05-08 · unverdicted · novelty 4.0 · 2 refs

Establishes robustness of distribution support for guided diffusion processes under exact score access across DDIM, DDPM, and exponential integrator discretizations.
