Revisiting Classifier Two-Sample Tests

David Lopez-Paz, Maxime Oquab

classification: stat.ML

keywords: tests, two-sample, c2st, classifier, binary, dataset, distribution, examples

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis "$P = Q$" is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, and have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery.
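The procedure described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function name `c2st`, the choice of a plain logistic-regression classifier trained by gradient descent, the 50/50 train/test split, and all hyperparameters are assumptions made here for brevity. The p-value uses the standard normal approximation to a Binomial(n_te, 1/2) count of correct held-out predictions, which presumes roughly balanced classes (n ≈ m).

```python
import math
import numpy as np


def c2st(sample_p, sample_q, train_frac=0.5, steps=500, lr=0.1, seed=0):
    """Classifier two-sample test sketch (hypothetical helper, not from the paper).

    Returns (held-out accuracy, one-sided p-value for H0: P = Q).
    """
    rng = np.random.default_rng(seed)

    # Label examples from P as 1 (positive) and examples from Q as 0 (negative).
    x = np.vstack([sample_p, sample_q]).astype(float)
    y = np.concatenate([np.ones(len(sample_p)), np.zeros(len(sample_q))])

    # Shuffle, then split into disjoint training and held-out parts.
    perm = rng.permutation(len(x))
    x, y = x[perm], y[perm]
    n_tr = int(train_frac * len(x))
    x_tr, y_tr, x_te, y_te = x[:n_tr], y[:n_tr], x[n_tr:], y[n_tr:]

    # Standardize features using training statistics only.
    mu, sd = x_tr.mean(axis=0), x_tr.std(axis=0) + 1e-12
    x_tr, x_te = (x_tr - mu) / sd, (x_te - mu) / sd

    # Assumption: a linear logistic-regression classifier fit by gradient descent.
    # Any binary classifier could be substituted here.
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        prob = 1.0 / (1.0 + np.exp(-(x_tr @ w + b)))
        grad = prob - y_tr
        w -= lr * (x_tr.T @ grad) / n_tr
        b -= lr * grad.mean()

    # Test statistic: classification accuracy on the held-out part.
    acc = float(np.mean(((x_te @ w + b) > 0) == (y_te == 1)))

    # Under H0 the accuracy is approximately Normal(1/2, 1 / (4 * n_te)).
    n_te = len(x_te)
    z = (acc - 0.5) / math.sqrt(0.25 / n_te)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return acc, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    s_p = rng.normal(0.0, 1.0, size=(500, 2))
    s_q = rng.normal(2.0, 1.0, size=(500, 2))  # mean-shifted, so P != Q
    print(c2st(s_p, s_q))
```

Because the classifier in this sketch is linear, it can only pick up differences that are linearly separable, such as a shift in means; a more flexible classifier would be needed to detect other kinds of discrepancy between $P$ and $Q$.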

Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Proton Structure from Neural Simulation-Based Inference at the LHC
hep-ph 2026-04 unverdicted novelty 8.0

Neural simulation-based inference on unbinned top-quark pair data at 13 TeV yields improved gluon PDF precision over traditional binned analyses while incorporating experimental and theoretical uncertainties.

Bayesian Rain Field Reconstruction using Commercial Microwave Links and Diffusion Model Priors
cs.LG 2026-05 unverdicted novelty 7.0

Diffusion model priors enable training-free Bayesian sampling for more accurate rain field reconstruction from path-integrated commercial microwave link measurements than Gaussian process baselines.

Generative Modeling with Orbit-Space Particle Flow Matching
cs.GR 2026-05 unverdicted novelty 7.0

OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.

Pre-trained Tabular Foundation Models as Versatile Summary Networks for Neural Posterior Estimation
cs.LG 2026-05 unverdicted novelty 6.0

Pre-trained TabPFN acts as an effective training-free summary network for neural posterior estimation, matching or outperforming standard methods while preserving useful marginal and location information in the posterior.

Learning to Test: Physics-Informed Representation for Dynamical Instability Detection
cs.LG 2026-04 unverdicted novelty 6.0

A physics-informed neural representation is learned from safe data to support distributional hypothesis testing for dynamical instability in stochastic DAE systems without repeated simulations.

Demystifying MMD GANs
stat.ML 2018-01 accept novelty 6.0

MMD GANs have unbiased critic gradients but biased generator gradients from sample-based learning, and the Kernel Inception Distance provides a practical new measure for GAN convergence and dynamic learning rate adaptation.

Neural Posterior Estimation of Terrain Parameters from Radar Sounder Data
eess.SP 2026-05 unverdicted novelty 5.0

Neural posterior estimation trained on simulated radar data enables probabilistic inference of terrain parameters from real Mars radar sounder profiles while conditioning on reference surface assumptions.

Machine Learning Techniques for Astrophysics and Cosmology: Simulation-Based Inference
astro-ph.CO 2026-05 unverdicted novelty 2.0

Simulation-based inference uses neural networks trained on simulations to enable parameter inference in cosmology and astrophysics where traditional likelihood calculations are intractable.