Video Pixel Networks

Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu

classification 💻 cs.CV cs.LG

keywords video, pixel, benchmark, model, action-conditional, approaches, architecture, best

We propose a probabilistic video model, the Video Pixel Network (VPN), that estimates the discrete joint distribution of the raw pixel values in a video. The model and the neural architecture reflect the time, space and color structure of video tensors and encode it as a four-dimensional dependency chain. The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth. The VPN also produces detailed samples on the action-conditional Robotic Pushing benchmark and generalizes to the motion of novel objects.
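
The "four-dimensional dependency chain" mentioned in the abstract can be written out as a chain-rule factorization over time, the two spatial axes, and color. The following is a sketch under assumed notation, not a formula quoted from the abstract: the symbols T, H, W, the index names, the raster-scan pixel order within a frame, and the R, G, B channel order are all assumptions introduced here for illustration.

```latex
% Assumed notation (not from the abstract):
%   x_{t,i,j,c}  pixel value at frame t, row i, column j, color channel c
%   x_{<(t,i,j)} all pixels of frames before t, plus the pixels of frame t
%                that precede position (i,j) in the assumed raster order
\[
p(\mathbf{x}) \;=\; \prod_{t=1}^{T} \prod_{i=1}^{H} \prod_{j=1}^{W}
  p\bigl(x_{t,i,j,R} \mid \mathbf{x}_{<(t,i,j)}\bigr)\,
  p\bigl(x_{t,i,j,G} \mid \mathbf{x}_{<(t,i,j)},\, x_{t,i,j,R}\bigr)\,
  p\bigl(x_{t,i,j,B} \mid \mathbf{x}_{<(t,i,j)},\, x_{t,i,j,R},\, x_{t,i,j,G}\bigr)
\]
```

Because each factor is a discrete distribution over a single channel value, the product is an exact joint distribution over all raw pixel values rather than an approximation; in the action-conditional setting, each factor would presumably also condition on the given actions.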

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

VideoGPT: Video Generation using VQ-VAE and Transformers
cs.CV 2021-04 accept novelty 6.0

VideoGPT generates competitive natural videos by learning discrete latents with VQ-VAE and modeling them autoregressively with a transformer.