DRAW: A Recurrent Neural Network For Image Generation

Alex Graves; Daan Wierstra; Danilo Jimenez Rezende; Ivo Danihelka; Karol Gregor

arxiv: 1502.04623 · v2 · pith:2SOCW6LGnew · submitted 2015-02-16 · cs.CV · cs.LG · cs.NE

Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, Daan Wierstra

classification cs.CV cs.LG cs.NE

keywords draw, generation, image, images, network, neural, recurrent, allows

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation. DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye, with a sequential variational auto-encoding framework that allows for the iterative construction of complex images. The system substantially improves on the state of the art for generative models on MNIST, and, when trained on the Street View House Numbers dataset, it generates images that cannot be distinguished from real data with the naked eye.
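The "iterative construction" mentioned in the abstract can be illustrated with a toy sketch. It assumes that each step adds an update to a running canvas and that the final canvas is squashed to pixel intensities with a sigmoid. The names `draw_canvas`, `write_fn`, and `random_patch_writer` are hypothetical, and the random patch writer is only a stand-in for a learned, attention-guided write operation; none of this is the authors' implementation.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def draw_canvas(num_steps, height, width, write_fn):
    """Build an image by accumulating additive updates on a canvas.

    write_fn(t, canvas) is a hypothetical stand-in for a learned write
    operation; it must return a height x width list of update values.
    """
    canvas = [[0.0] * width for _ in range(height)]
    for t in range(num_steps):
        update = write_fn(t, canvas)
        for i in range(height):
            for j in range(width):
                canvas[i][j] += update[i][j]
    # Squash the accumulated canvas into pixel intensities in (0, 1).
    return [[sigmoid(v) for v in row] for row in canvas]


def random_patch_writer(height, width, patch=2, seed=0):
    """Toy writer (assumption): adds 1.0 to a random patch at each step."""
    rng = random.Random(seed)

    def write_fn(t, canvas):
        update = [[0.0] * width for _ in range(height)]
        top = rng.randrange(height - patch + 1)
        left = rng.randrange(width - patch + 1)
        for i in range(top, top + patch):
            for j in range(left, left + patch):
                update[i][j] = 1.0
        return update

    return write_fn


image = draw_canvas(num_steps=4, height=4, width=4,
                    write_fn=random_patch_writer(4, 4))
```

In this sketch only a small patch changes per step, so the image emerges gradually rather than in a single pass, which is the intuition behind combining attention with sequential generation.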

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Adaptive Computation Time for Recurrent Neural Networks
cs.NE 2016-03 accept novelty 8.0

ACT lets RNNs dynamically adapt computation depth per input via a differentiable halting unit, yielding large gains on synthetic tasks and structural insights on language data.

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
cs.LG 2015-11 accept novelty 8.0

DCGANs with architectural constraints learn a hierarchy of representations from object parts to scenes in both generator and discriminator across image datasets.

Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks
cs.LG 2019-07 unverdicted novelty 6.0

Proposes GP-RNN model using RNNs for nonlinear non-Markovian dynamics and GPs for embedding, with bi-LSTM inference, that outperforms prior methods on neural data especially with limited samples.

RobustTP: End-to-End Trajectory Prediction for Heterogeneous Road-Agents in Dense Traffic with Noisy Sensor Inputs
cs.RO 2019-07 unverdicted novelty 4.0

RobustTP uses a non-linear motion model plus instance segmentation to create noisy trajectories, then an LSTM-CNN to predict 5-second future positions of heterogeneous agents in dense traffic, claiming up to 18% ADE a...

Autoencoding sensory substitution
q-bio.NC 2019-07 unverdicted novelty 4.0

Deep recurrent autoencoders convert images to shortened audio signals that incorporate hearing models, enabling above-chance hand posture discrimination and object reaching after a few hours of training instead of months.