Revisiting Classifier Two-Sample Tests

2016 · stat.ML · arXiv 1610.06545

13 Pith papers cite this work. Polarity classification is still indexing.

abstract

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis "$P = Q$" is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery.
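
The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a k-nearest-neighbour classifier, an even train/test split, roughly balanced samples ($n \approx m$), and a normal approximation $N(1/2, 1/(4 n_{te}))$ to the null distribution of the held-out accuracy. The function name `c2st` and its parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def c2st(sample_p, sample_q, k=5, test_fraction=0.5, seed=0):
    """Sketch of a classifier two-sample test with a k-NN classifier.

    sample_p, sample_q: lists of equal-length numeric tuples.
    Returns (held-out accuracy, one-sided p-value).
    """
    rng = random.Random(seed)
    # Pair S_P with a positive label and S_Q with a negative label.
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)

    # Hold out a subset for evaluation; train on the rest.
    n_test = int(len(data) * test_fraction)
    if n_test == 0 or n_test == len(data):
        raise ValueError("need non-empty train and test splits")
    test, train = data[:n_test], data[n_test:]

    def predict(x):
        # Majority vote among the k nearest training points (Euclidean).
        nearest = sorted(train, key=lambda item: math.dist(x, item[0]))[:k]
        votes = sum(label for _, label in nearest)
        return 1 if 2 * votes > k else 0

    accuracy = sum(predict(x) == y for x, y in test) / n_test

    # Assumption: under P = Q with balanced labels, the accuracy is
    # approximately N(1/2, 1/(4 * n_test)).
    null = NormalDist(mu=0.5, sigma=math.sqrt(0.25 / n_test))
    p_value = 1.0 - null.cdf(accuracy)
    return accuracy, p_value


if __name__ == "__main__":
    rng = random.Random(1)
    draw = lambda mean: (rng.gauss(mean, 1), rng.gauss(mean, 1))
    same_a = [draw(0) for _ in range(200)]
    same_b = [draw(0) for _ in range(200)]
    shifted = [draw(2) for _ in range(200)]
    print("P = Q :", c2st(same_a, same_b))
    print("P != Q:", c2st(same_a, shifted))
```

The test statistic is an accuracy, so it reads directly as "how often the classifier can tell the two samples apart". A small p-value suggests rejecting $P = Q$. An odd `k` avoids tied votes.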

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

Proton Structure from Neural Simulation-Based Inference at the LHC

hep-ph · 2026-04-14 · unverdicted · novelty 8.0

Neural simulation-based inference on unbinned top-quark pair data at 13 TeV yields improved gluon PDF precision over traditional binned analyses while incorporating experimental and theoretical uncertainties.

Amortized Simulation-Based Inference in Generalized Bayes via Neural Posterior Estimation

stat.ML · 2026-01-29 · unverdicted · novelty 7.0

Introduces the first amortized neural posterior estimator conditioned on both data and temperature β for generalized Bayesian inference, matching MCMC performance on standard SBI benchmarks.

Hard-Label Black-Box Attacks on 3D Point Clouds

cs.CV · 2024-11-30 · unverdicted · novelty 7.0

A spectrum-aware decision boundary algorithm enables effective hard-label black-box adversarial attacks on 3D point cloud models by fusing spectral information across classes and performing curvature-aware iterative optimization.

Generative Modeling with Orbit-Space Particle Flow Matching

cs.GR · 2026-05-04 · unverdicted · novelty 7.0

OGPP is a particle flow-matching method using orbit-space canonicalization and geometric paths that achieves lower error and fewer steps than prior approaches on 3D benchmarks.

Asymptotically Optimal Tests for One- and Two-Sample Problems

cs.IT · 2026-01-16 · unverdicted · novelty 6.0

Hoeffding's relative entropy threshold test between empirical distributions is asymptotically optimal for both one- and two-sample hypothesis testing, with a strong converse for the two-sample case.

A discriminative approach for finding and characterizing positivity violations using decision trees

stat.ML · 2019-07-18 · unverdicted · novelty 6.0

Decision trees partition covariate space to detect positivity violations in causal inference, augmented by random forests to quantify violation robustness within each subspace.

Demystifying MMD GANs

stat.ML · 2018-01-04 · accept · novelty 6.0

MMD GANs have unbiased critic gradients but biased generator gradients from sample-based learning, and the Kernel Inception Distance provides a practical new measure for GAN convergence and dynamic learning rate adaptation.

Pre-trained Tabular Foundation Models as Versatile Summary Networks for Neural Posterior Estimation

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Pre-trained TabPFN acts as an effective training-free summary network for neural posterior estimation, matching or outperforming standard methods while preserving useful marginal and location information in the posterior.

Learning to Test: Physics-Informed Representation for Dynamical Instability Detection

cs.LG · 2026-04-13 · unverdicted · novelty 6.0

A physics-informed neural representation is learned from safe data to support distributional hypothesis testing for dynamical instability in stochastic DAE systems without repeated simulations.

SynVA: A Modular Toolkit for Vessel Generation and Aneurysm Editing

cs.CV · 2026-05-13 · unverdicted · novelty 5.0

SynVA toolkit generates realistic vascular meshes and anatomically plausible aneurysms, releasing 50,000 labeled samples for medical vision tasks.

Neural Posterior Estimation of Terrain Parameters from Radar Sounder Data

eess.SP · 2026-05-05 · unverdicted · novelty 5.0

Neural posterior estimation trained on simulated radar data enables probabilistic inference of terrain parameters from real Mars radar sounder profiles while conditioning on reference surface assumptions.

Machine Learning Techniques for Astrophysics and Cosmology: Simulation-Based Inference

astro-ph.CO · 2026-05-11 · unverdicted · novelty 2.0

Simulation-based inference uses neural networks trained on simulations to enable parameter inference in cosmology and astrophysics where traditional likelihood calculations are intractable.

Bayesian Rain Field Reconstruction using Commercial Microwave Links and Diffusion Model Priors

cs.LG · 2026-05-06

citing papers explorer

Showing 13 of 13 citing papers.

Proton Structure from Neural Simulation-Based Inference at the LHC
hep-ph · 2026-04-14 · unverdicted · none · ref 132

Amortized Simulation-Based Inference in Generalized Bayes via Neural Posterior Estimation
stat.ML · 2026-01-29 · unverdicted · none · ref 9 · internal anchor

Hard-Label Black-Box Attacks on 3D Point Clouds
cs.CV · 2024-11-30 · unverdicted · none · ref 91 · internal anchor

Generative Modeling with Orbit-Space Particle Flow Matching
cs.GR · 2026-05-04 · unverdicted · none · ref 126

Asymptotically Optimal Tests for One- and Two-Sample Problems
cs.IT · 2026-01-16 · unverdicted · none · ref 7 · internal anchor

A discriminative approach for finding and characterizing positivity violations using decision trees
stat.ML · 2019-07-18 · unverdicted · none · ref 9 · internal anchor

Demystifying MMD GANs
stat.ML · 2018-01-04 · accept · none · ref 33 · internal anchor

Pre-trained Tabular Foundation Models as Versatile Summary Networks for Neural Posterior Estimation
cs.LG · 2026-05-08 · unverdicted · none · ref 14

Learning to Test: Physics-Informed Representation for Dynamical Instability Detection
cs.LG · 2026-04-13 · unverdicted · none · ref 48

SynVA: A Modular Toolkit for Vessel Generation and Aneurysm Editing
cs.CV · 2026-05-13 · unverdicted · none · ref 49 · internal anchor

Neural Posterior Estimation of Terrain Parameters from Radar Sounder Data
eess.SP · 2026-05-05 · unverdicted · none · ref 36

Machine Learning Techniques for Astrophysics and Cosmology: Simulation-Based Inference
astro-ph.CO · 2026-05-11 · unverdicted · none · ref 37

Bayesian Rain Field Reconstruction using Commercial Microwave Links and Diffusion Model Priors
cs.LG · 2026-05-06 · unreviewed · ref 145
