Revisiting Classifier Two-Sample Tests

13 Pith papers cite this work. Polarity classification is still indexing.

abstract

The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis "$P = Q$" is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery.
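
The procedure described in the abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation. It assumes equal sample sizes ($n = m$), a 50/50 train/test split, and a k-nearest-neighbour classifier standing in for whatever classifier one would actually use. The function names `knn_predict`, `c2st`, and `gaussian_sample` are hypothetical. The p-value assumes that, under the null hypothesis, each held-out prediction is correct with probability 1/2, so the number of correct predictions is approximately Binomial($n_{te}$, 1/2).

```python
import math
import random


def knn_predict(train_x, train_y, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def c2st(sample_p, sample_q, k=5, seed=0):
    """Return (held-out accuracy, one-sided p-value) for the null hypothesis P = Q."""
    rng = random.Random(seed)
    # Label examples from S_P as 1 and examples from S_Q as 0, then shuffle.
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)
    # Assumption: half of the data trains the classifier, half is held out.
    half = len(data) // 2
    train, test = data[:half], data[half:]
    train_x = [x for x, _ in train]
    train_y = [y for _, y in train]
    correct = sum(knn_predict(train_x, train_y, x, k) == y for x, y in test)
    n_te = len(test)
    # Upper tail of Binomial(n_te, 1/2): P(at least `correct` hits by chance).
    tail = sum(math.comb(n_te, j) for j in range(correct, n_te + 1))
    return correct / n_te, tail / 2 ** n_te


def gaussian_sample(rng, mean, size):
    """Draw `size` two-dimensional points from an isotropic unit-variance Gaussian."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(size)]


data_rng = random.Random(1)
s_p = gaussian_sample(data_rng, 0.0, 100)
s_q_shifted = gaussian_sample(data_rng, 2.0, 100)  # Q differs from P
s_q_same = gaussian_sample(data_rng, 0.0, 100)     # Q equals P

acc_diff, p_diff = c2st(s_p, s_q_shifted)
acc_same, p_same = c2st(s_p, s_q_same)
print(f"shifted Q: accuracy={acc_diff:.2f}, p-value={p_diff:.3g}")
print(f"same Q:    accuracy={acc_same:.2f}, p-value={p_same:.3g}")
```

The test statistic is the held-out accuracy itself, which is what makes it readable in interpretable units: an accuracy well above 0.5 is evidence against "$P = Q$", while an accuracy near 0.5 is consistent with it.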

citation-role summary

background 1 · method 1

representative citing papers

Hard-Label Black-Box Attacks on 3D Point Clouds

cs.CV · 2024-11-30 · unverdicted · novelty 7.0

A spectrum-aware decision boundary algorithm enables effective hard-label black-box adversarial attacks on 3D point cloud models by fusing spectral information across classes and performing curvature-aware iterative optimization.

Asymptotically Optimal Tests for One- and Two-Sample Problems

cs.IT · 2026-01-16 · unverdicted · novelty 6.0

Hoeffding's relative entropy threshold test between empirical distributions is asymptotically optimal for both one- and two-sample hypothesis testing, with a strong converse for the two-sample case.

Demystifying MMD GANs

stat.ML · 2018-01-04 · accept · novelty 6.0

MMD GANs have unbiased critic gradients but biased generator gradients from sample-based learning, and the Kernel Inception Distance provides a practical new measure for GAN convergence and dynamic learning rate adaptation.
