GAIN: Missing Data Imputation using Generative Adversarial Nets

James Jordon; Jinsung Yoon; Mihaela van der Schaar

arxiv: 1806.02920 · v1 · pith:GLLRK6HB · submitted 2018-06-07 · cs.LG · stat.ML

classification: cs.LG, stat.ML

keywords: components, data, imputation, vector, adversarial, gain, generative, hint

Abstract

We propose a novel method for imputing missing data by adapting the well-known Generative Adversarial Nets (GAN) framework. Accordingly, we call our method Generative Adversarial Imputation Nets (GAIN). The generator (G) observes some components of a real data vector, imputes the missing components conditioned on what is actually observed, and outputs a completed vector. The discriminator (D) then takes a completed vector and attempts to determine which components were actually observed and which were imputed. To ensure that D forces G to learn the desired distribution, we provide D with some additional information in the form of a hint vector. The hint reveals to D partial information about the missingness of the original sample, which is used by D to focus its attention on the imputation quality of particular components. This hint ensures that G does in fact learn to generate according to the true data distribution. We tested our method on various datasets and found that GAIN significantly outperforms state-of-the-art imputation methods.
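
The data flow described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, `toy_generator` is a column-mean placeholder standing in for the trained network G, and the hint construction (reveal the mask for a random subset of components, mark the rest with 0.5) is one plausible reading of "partial information about the missingness", controlled by an assumed `hint_rate` parameter.

```python
import numpy as np

def make_mask(x):
    """1.0 where a component is observed, 0.0 where it is missing (NaN)."""
    return (~np.isnan(x)).astype(float)

def complete(x, mask, generated):
    """Keep observed components; fill missing ones from the generator output."""
    return np.where(mask == 1.0, x, generated)

def make_hint(mask, hint_rate, rng):
    """Assumed hint: reveal the mask on a random subset; 0.5 means 'not revealed'."""
    reveal = (rng.random(mask.shape) < hint_rate).astype(float)
    return reveal * mask + 0.5 * (1.0 - reveal)

def mask_cross_entropy(d_prob, mask, eps=1e-8):
    """Cross-entropy between D's per-component 'observed' probability and the mask."""
    return -np.mean(mask * np.log(d_prob + eps)
                    + (1.0 - mask) * np.log(1.0 - d_prob + eps))

def toy_generator(x_noised, mask):
    """Placeholder for G (not a neural net): predicts each column's observed mean."""
    col_sum = (x_noised * mask).sum(axis=0)
    col_cnt = np.maximum(mask.sum(axis=0), 1.0)
    return np.broadcast_to(col_sum / col_cnt, x_noised.shape)

rng = np.random.default_rng(0)
x = np.array([[1.0, np.nan, 3.0],
              [2.0, 5.0, np.nan],
              [3.0, 7.0, 9.0]])

mask = make_mask(x)
noise = rng.uniform(0.0, 0.01, size=x.shape)
x_noised = np.where(mask == 1.0, x, noise)   # missing slots carry noise, not NaN
x_hat = complete(x, mask, toy_generator(x_noised, mask))  # completed vectors
hint = make_hint(mask, hint_rate=0.9, rng=rng)            # extra input for D
```

In a full training loop, D would receive `x_hat` and `hint` and output a per-component probability that each entry was observed; `mask_cross_entropy` scores that output against `mask`, and G would be updated to make its imputed entries harder to distinguish from observed ones.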

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Latent Diffusion for Missing Data
cs.LG · 2026-05 · unverdicted · novelty 5.0

A VAE-based latent diffusion model trained on incomplete data maintains sample quality and imputation performance up to 50% missingness while pixel-space diffusion degrades.