Testing Equality of Conditional Distributions via Generative Models

Hanjia Gao; Linjun Huang; Xiaofeng Shao; Yun Yang

arxiv: 2606.06930 · v1 · pith:4W7OXMTJnew · submitted 2026-06-05 · 📊 stat.ME

Pith reviewed 2026-06-27 21:23 UTC · model grok-4.3

classification 📊 stat.ME

keywords conditional distribution testing · generative models · RKHS empirical process · multiplier bootstrap · double robustness · high-dimensional covariates · multivariate responses

The pith

Cross-generating responses with conditional generative models constructs a test for equality of two conditional distributions that avoids density ratio estimation and local smoothing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a test for whether two conditional distributions are the same by training a generator on each dataset and then using each generator to produce responses at the covariate values from the other dataset. This cross-generation step aligns the covariates so that generated responses can be compared directly to observed ones. The population construction produces a discrepancy that fully characterizes equality under overlap conditions on the covariates, while the sample version yields a test statistic as the supremum of an RKHS-indexed empirical process calibrated by multiplier bootstrap. The procedure is shown to remain valid even when the generators are estimated with error, owing to a double-robustness property. A reader would care because the approach handles multivariate responses and high-dimensional covariates without the usual dimension-dependent smoothing.

Core claim

The population version of this construction yields a conditional discrepancy that characterizes equality of the two conditional distributions under suitable overlap conditions, while the sample version leads to a test statistic defined as the supremum of an RKHS-indexed empirical process with multiplier bootstrap calibration. The proposed procedure attains a double-robustness property with respect to conditional generator estimation errors.
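To make "supremum of an RKHS-indexed empirical process" concrete, here is a minimal sketch of the simplest case: the supremum over the unit ball of one RKHS of a difference of empirical means, which the kernel trick reduces to a closed form (an MMD-type quantity). This is a simplified stand-in, not the paper's statistic. The Gaussian kernel, the bandwidth, the joint embedding of (X, Y), and the names `gaussian_gram` and `sup_over_rkhs_ball` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||A_i - B_j||^2 / (2 * bandwidth^2))."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def sup_over_rkhs_ball(Z_obs, Z_gen, bandwidth=1.0):
    """sup over ||f||_H <= 1 of mean f(Z_obs) - mean f(Z_gen), via the kernel trick.

    The supremum equals the RKHS norm of the difference of empirical mean embeddings.
    """
    k_oo = gaussian_gram(Z_obs, Z_obs, bandwidth).mean()
    k_gg = gaussian_gram(Z_gen, Z_gen, bandwidth).mean()
    k_og = gaussian_gram(Z_obs, Z_gen, bandwidth).mean()
    return float(np.sqrt(max(k_oo + k_gg - 2.0 * k_og, 0.0)))

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))                          # shared covariates
Y_obs = X[:, :1] + rng.standard_normal((200, 1))           # observed responses
Y_same = X[:, :1] + rng.standard_normal((200, 1))          # generated, same conditional law
Y_diff = X[:, :1] + 2.0 + rng.standard_normal((200, 1))    # generated, shifted conditional law

Z_obs = np.column_stack([X, Y_obs])
d_same = sup_over_rkhs_ball(Z_obs, np.column_stack([X, Y_same]))
d_diff = sup_over_rkhs_ball(Z_obs, np.column_stack([X, Y_diff]))
print(d_same, d_diff)
```

In this toy setup the discrepancy should be noticeably larger when the generated responses come from a shifted conditional law than when they share the observed one.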

What carries the argument

The cross-generation step that applies each sample's learned conditional generator to the covariate values observed in the other sample, producing comparable responses for direct comparison.
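The cross-generation step can be sketched in a few lines. The example below is an illustration under strong assumptions: a linear-Gaussian sampler fitted by least squares stands in for a learned conditional generative model, and the helper `fit_toy_generator` and all data-generating choices are hypothetical, not taken from the paper.

```python
import numpy as np

def fit_toy_generator(X, Y):
    """Fit a toy conditional generator Y | X ~ N(a + X B, diag(s^2)) by least squares.

    Stand-in for a learned conditional generative model; returns a sampler.
    """
    design = np.column_stack([np.ones(len(X)), X])      # intercept + covariates
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)   # shape (d + 1, p)
    resid_scale = (Y - design @ coef).std(axis=0)       # per-coordinate noise scale

    def generate(X_new, rng):
        design_new = np.column_stack([np.ones(len(X_new)), X_new])
        noise = rng.standard_normal((len(X_new), Y.shape[1]))
        return design_new @ coef + noise * resid_scale

    return generate

rng = np.random.default_rng(0)
n, d, p = 500, 3, 2
B_true = rng.standard_normal((d, p))

# Two samples with shifted covariates but the same conditional law of Y given X.
X1 = rng.standard_normal((n, d))
X2 = rng.standard_normal((n, d)) + 0.3
Y1 = X1 @ B_true + rng.standard_normal((n, p))
Y2 = X2 @ B_true + rng.standard_normal((n, p))

G1 = fit_toy_generator(X1, Y1)   # generator learned from sample 1
G2 = fit_toy_generator(X2, Y2)   # generator learned from sample 2

# Cross-generation: each generator is evaluated at the OTHER sample's covariates.
Y_gen_at_X2 = G1(X2, rng)   # compare with observed Y2 (same covariates X2)
Y_gen_at_X1 = G2(X1, rng)   # compare with observed Y1 (same covariates X1)

print(np.abs(Y_gen_at_X2.mean(axis=0) - Y2.mean(axis=0)))
```

Because generated and observed responses now sit at identical covariate values, any two-sample comparison of the pairs (X2, Y2) and (X2, Y_gen_at_X2) targets the conditional law rather than the covariate shift.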

If this is right

The test statistic converges to a known limiting distribution under the null and diverges under the alternative.
Multiplier bootstrap consistently approximates the null distribution of the test statistic.
The test is consistent for detecting differences between the two conditional distributions.
The double-robustness property ensures the test remains valid when either generator is estimated at a suitable rate.
The procedure applies directly to multivariate responses without requiring dimension reduction.
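The bootstrap calibration listed above can be illustrated with a generic multiplier bootstrap for a kernel statistic built from paired observed and generated points. This is a sketch under assumptions, not the paper's procedure: it uses a single Gaussian kernel on the joint (X, Y) vector, standard normal multipliers, and treats the generated responses as given (no generator estimation error). The names `paired_gram` and `multiplier_bootstrap_pvalue` are hypothetical.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||A_i - B_j||^2 / (2 * bandwidth^2))."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def paired_gram(Z_obs, Z_gen, bandwidth=1.0):
    """H[i, j] = <h_i, h_j>_H with h_i = k(Z_obs[i], .) - k(Z_gen[i], .)."""
    return (gaussian_gram(Z_obs, Z_obs, bandwidth)
            - gaussian_gram(Z_obs, Z_gen, bandwidth)
            - gaussian_gram(Z_gen, Z_obs, bandwidth)
            + gaussian_gram(Z_gen, Z_gen, bandwidth))

def multiplier_bootstrap_pvalue(H, n_boot=500, rng=None):
    """Statistic n * ||mean_i h_i||_H^2 and its multiplier-bootstrap p-value."""
    rng = np.random.default_rng() if rng is None else rng
    n = H.shape[0]
    stat = H.sum() / n
    centering = np.eye(n) - np.ones((n, n)) / n
    H_centered = centering @ H @ centering        # Gram matrix of h_i - mean(h)
    W = rng.standard_normal((n_boot, n))          # i.i.d. N(0, 1) multipliers
    boot = ((W @ H_centered) * W).sum(axis=1) / n # ||n^{-1/2} sum_i w_i (h_i - mean h)||^2
    p_value = (1 + np.sum(boot >= stat)) / (1 + n_boot)
    return float(stat), float(p_value)

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 2))
Z_obs = np.column_stack([X, X[:, :1] + rng.standard_normal((200, 1))])
Z_null = np.column_stack([X, X[:, :1] + rng.standard_normal((200, 1))])
Z_alt = np.column_stack([X, X[:, :1] + 2.0 + rng.standard_normal((200, 1))])

stat_null, p_null = multiplier_bootstrap_pvalue(paired_gram(Z_obs, Z_null), rng=rng)
stat_alt, p_alt = multiplier_bootstrap_pvalue(paired_gram(Z_obs, Z_alt), rng=rng)
print(p_null, p_alt)
```

Centering the summands before applying multipliers keeps the bootstrap draws on the null scale even when the data come from an alternative, which is what lets the observed statistic stand out.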

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same cross-generation idea could be used to test other conditional properties such as equality of conditional means or quantiles.
The method might extend naturally to settings with censored or missing responses by modifying the generator training step.
Because of double robustness, the approach could serve as a building block for semi-parametric inference on conditional distributions when nuisance generators are fitted with flexible machine learning tools.

Load-bearing premise

The two sets of covariates must overlap sufficiently in support so that responses generated from one set remain comparable to responses observed in the other.
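This page does not give the paper's exact overlap condition. One conventional way to formalize sufficient overlap, shown here only as an assumed illustration, is a common support together with a bounded density ratio between the two covariate distributions. The symbols $P_X^{(1)}$, $P_X^{(2)}$, $c$, and $C$ are notation introduced for this sketch.

```latex
\[
  \operatorname{supp}\bigl(P_X^{(1)}\bigr) = \operatorname{supp}\bigl(P_X^{(2)}\bigr),
  \qquad
  0 < c \;\le\; \frac{dP_X^{(1)}}{dP_X^{(2)}}(x) \;\le\; C < \infty
  \quad \text{for all } x \text{ in the common support.}
\]
```

Under a condition of this kind, a generator trained on one sample is only ever evaluated at covariate values where it has seen training data with non-negligible frequency.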

What would settle it

A simulation study in which the two conditional distributions are known to be identical, generators are estimated consistently, and the test nevertheless rejects the null at a rate substantially above the nominal level.
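The falsification criterion above comes down to comparing a Monte Carlo rejection rate with the nominal level. The sketch below shows that arithmetic. The helper `empirical_size` is hypothetical, and the p-values are hard-coded placeholders standing in for p-values that would come from repeated runs of the test under a known null.

```python
import numpy as np

def empirical_size(p_values, alpha=0.05):
    """Rejection rate at level alpha, its binomial standard error under exact
    nominal size, and the z-score of the excess over alpha."""
    p = np.asarray(p_values, dtype=float)
    rate = float((p <= alpha).mean())
    se = float(np.sqrt(alpha * (1.0 - alpha) / len(p)))
    return rate, se, (rate - alpha) / se

# Placeholder p-values: 10 rejections out of 100 replications.
rate, se, z = empirical_size([0.01] * 10 + [0.5] * 90, alpha=0.05)
print(rate, se, z)
```

A large positive z-score across many replications would indicate size inflation of the kind that would count against the method's validity claim.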

Figures

Figures reproduced from arXiv: 2606.06930 by Hanjia Gao, Linjun Huang, Xiaofeng Shao, Yun Yang.

**Figure 6.1.** Representative samples of the five input variants, illustrating the progressive degrada…

**Figure 6.2.** Empirical distribution of p-values across four testing scenarios. Case 1 corresponds to the near-null setting; Cases 2–4 represent increasing levels of covariate degradation. The dashed red line denotes α = 0.05.

**Figure 6.3.** Left: empirical conditional trend P(Y^{(2)} = 1 | Y^{(1)} = age) for the three constructed groups. Groups 1 and 3 share a decreasing trend, while Group 2 has an increasing trend. Center and right: marginal distributions of age and gender across the three groups, showing that the groups are well matched in their marginals.

**Figure 6.4.** Empirical distribution of p-values under the null setting (Group 1 vs. Group 3) and the alternative setting (Group 1 vs. Group 2). The dashed red line denotes α = 0.05.

Abstract

We study the problem of testing whether two conditional distributions are equal using generative models. The proposed method learns a conditional generator from each sample and uses it to create responses at covariate values observed in the other sample, allowing generated and observed responses to be compared directly. By aligning covariates through cross-generation, the approach avoids conditional density-ratio estimation and local smoothing over high-dimensional covariates. The population version of this construction yields a conditional discrepancy that characterizes equality of the two conditional distributions under suitable overlap conditions, while the sample version leads to a test statistic defined as the supremum of an RKHS-indexed empirical process with multiplier bootstrap calibration. A computationally efficient algorithm for evaluating the statistic and its bootstrap analogue is developed based on alternating maximization and the kernel trick. Theoretically, we derive the limiting distribution of the test statistic under both the null and alternative hypotheses, prove bootstrap validity and consistency of the resulting test, and show that the proposed procedure attains a double-robustness property with respect to conditional generator estimation errors. Simulations and real data applications suggest that the proposed method performs well for multivariate responses and high-dimensional covariates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The cross-generation trick with separate conditional generators is the real novelty here: it gives a way to test equality without density ratios or smoothing, and it comes with a double-robustness claim. The overlap conditions look like the part that could limit real-world use.

The punchline is that this work builds a test by fitting one generator per sample, then swapping covariates to generate comparable responses. That framing is distinct from prior conditional testing methods and sidesteps the usual high-dimensional headaches.

The construction itself is the main advance. By aligning covariates through cross-generation, the approach avoids conditional density-ratio estimation entirely, which is a practical win for multivariate responses and high-dimensional covariates. The RKHS supremum statistic with multiplier bootstrap is a clean choice, and the algorithm based on alternating maximization and the kernel trick keeps computation feasible. The claimed double robustness to generator estimation errors is the part worth checking closely; if it holds without extra assumptions, it would be a genuine plus.

The soft spot is the overlap requirement. The abstract ties the population discrepancy's characterizing property directly to suitable overlap conditions on the covariates. When the supports do not overlap enough, the cross-generation step produces non-comparable responses, so the test can lose its ability to detect differences. The paper needs to show how sensitive size and power are to marginal overlap and give users a way to diagnose it.

This is aimed at statisticians working on nonparametric testing or causal methods with complex conditionals. Readers who want a new tool with limiting distribution and bootstrap theory will get value from it.

It deserves peer review because the idea is specific enough to evaluate and the claims are falsifiable. I would send it out.

Referee Report

2 major / 2 minor

Summary. The paper proposes a method for testing equality of two conditional distributions by learning conditional generators from each sample and cross-generating responses at the other's observed covariates. This yields a population conditional discrepancy that characterizes equality under overlap conditions, and a sample test statistic as the supremum of an RKHS-indexed empirical process calibrated by multiplier bootstrap. An efficient algorithm uses alternating maximization and the kernel trick. The manuscript derives limiting distributions under null and alternative, proves bootstrap validity and consistency, establishes double robustness to generator estimation errors, and reports favorable simulation and real-data performance for multivariate responses and high-dimensional covariates.

Significance. If the results hold, the work is significant for offering a smoothing-free and conditional density-ratio-free procedure for testing conditional distribution equality, with the double-robustness property providing practical flexibility in generator estimation. The RKHS supremum construction and bootstrap calibration are technically appealing strengths.

major comments (2)

[Abstract / population construction] Abstract and population discrepancy section: the claim that the conditional discrepancy characterizes equality of the two conditional distributions is tied to 'suitable overlap conditions,' but the precise mathematical form of these conditions (e.g., the required common support or density lower bound between the two covariate distributions) is not stated explicitly; without this, it is impossible to verify necessity and sufficiency for the characterizing property.
[Theoretical results on limiting distribution and double robustness] Double-robustness claim (theoretical results): the abstract asserts double robustness with respect to conditional generator estimation errors, yet the specific convergence rates required of the two generators (and how their errors interact) for the limiting distribution and bootstrap validity to remain valid are not detailed; this is load-bearing for the asymptotic guarantees.

minor comments (2)

[Algorithm section] The description of the alternating maximization algorithm would benefit from explicit pseudocode or convergence criteria to aid reproducibility.
[Simulations] Simulation section: the reported settings for high-dimensional covariates should include the dimension values and sample sizes explicitly in a table for clarity.
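This page does not reproduce the paper's algorithm, so the following is only a generic illustration of what alternating maximization means, not the authors' procedure. It maximizes a bilinear form a^T M b over two unit balls by closed-form alternating updates, which recovers the largest singular value of M. The matrix M, the function name, and the stopping rule (a fixed iteration count) are assumptions chosen for the sketch.

```python
import numpy as np

def alternating_maximization(M, n_iter=200, rng=None):
    """Maximize a^T M b over unit vectors a, b by alternating closed-form updates."""
    rng = np.random.default_rng() if rng is None else rng
    b = rng.standard_normal(M.shape[1])
    b /= np.linalg.norm(b)
    for _ in range(n_iter):
        a = M @ b
        a /= np.linalg.norm(a)    # optimal a for the current b
        b = M.T @ a
        b /= np.linalg.norm(b)    # optimal b for the current a
    return float(a @ M @ b), a, b

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.standard_normal((5, 3)))   # orthonormal columns, shape (5, 3)
V, _ = np.linalg.qr(rng.standard_normal((4, 3)))   # orthonormal columns, shape (4, 3)
M = U @ np.diag([3.0, 1.0, 0.5]) @ V.T             # singular values 3, 1, 0.5 by construction

value, a, b = alternating_maximization(M, rng=rng)
print(value)
```

Each update solves one of the two sub-problems exactly, so the objective never decreases. A practical implementation would typically replace the fixed iteration count with a tolerance on the change in the objective, which is the kind of convergence criterion the referee asks for.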

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. We address the two major comments below and will revise the manuscript accordingly to improve clarity on the stated points.

Referee: [Abstract / population construction] Abstract and population discrepancy section: the claim that the conditional discrepancy characterizes equality of the two conditional distributions is tied to 'suitable overlap conditions,' but the precise mathematical form of these conditions (e.g., the required common support or density lower bound between the two covariate distributions) is not stated explicitly; without this, it is impossible to verify necessity and sufficiency for the characterizing property.

Authors: We agree that the overlap conditions require an explicit statement. The population discrepancy section will be revised to include the precise conditions: the two covariate distributions must share common support with a uniform positive lower bound on the density ratio (or equivalent overlap measure) to ensure the characterizing property holds with necessity and sufficiency. This will be added both in the main text and referenced in the abstract.
Revision: yes

Referee: [Theoretical results on limiting distribution and double robustness] Double-robustness claim (theoretical results): the abstract asserts double robustness with respect to conditional generator estimation errors, yet the specific convergence rates required of the two generators (and how their errors interact) for the limiting distribution and bootstrap validity to remain valid are not detailed; this is load-bearing for the asymptotic guarantees.

Authors: The double-robustness result is established in the theoretical section by showing that the cross-generation error terms vanish in the limit under product-rate conditions on the two generator estimators. However, the referee is correct that the abstract does not detail these rates. We will revise the abstract to briefly state the required rates (e.g., each generator error o_p(n^{-1/4}) with their product o_p(n^{-1/2})) and add a pointer to the relevant theorem for the interaction of the errors.
Revision: yes
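The arithmetic behind the example rates in the response can be written out as follows. The notation is assumed for illustration: $\widehat{G}_k$ denotes the estimated generator for sample $k$, $G_k$ its target, and $\|\cdot\|$ an unspecified error norm, none of which are defined on this page.

```latex
\[
  \|\widehat{G}_1 - G_1\| = o_p\bigl(n^{-1/4}\bigr), \quad
  \|\widehat{G}_2 - G_2\| = o_p\bigl(n^{-1/4}\bigr)
  \;\Longrightarrow\;
  \|\widehat{G}_1 - G_1\| \, \|\widehat{G}_2 - G_2\|
  = o_p\bigl(n^{-1/4}\bigr) \cdot o_p\bigl(n^{-1/4}\bigr)
  = o_p\bigl(n^{-1/2}\bigr).
\]
```

A product-rate condition of this form is weaker than requiring both individual rates: for example, errors of order $O_p(n^{-1/3})$ and $o_p(n^{-1/6})$ also have a product of order $o_p(n^{-1/2})$, so a slower generator can be offset by a faster one.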

Circularity Check

0 steps flagged

No circularity detected; derivation is self-contained.

The population conditional discrepancy is constructed directly from cross-generation of responses using the two conditional generators; its characterizing property (zero iff conditionals equal, under overlap) follows from the definition of the discrepancy measure rather than from any fitted parameter or self-citation. The sample statistic is the sup of an RKHS-indexed empirical process with multiplier bootstrap; limiting distribution, validity, consistency, and double-robustness are derived from standard empirical-process arguments once generators are treated as fixed. No load-bearing self-citation, no fitted input renamed as prediction, and no ansatz smuggled via prior work appear in the abstract or described chain. The overlap condition is an explicit assumption required for the characterization, not a hidden definitional loop.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on standard RKHS properties, multiplier bootstrap theory, and the unstated technical conditions needed for the limiting distribution and double-robustness result; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (2)

standard math: Standard results on empirical processes and multiplier bootstrap in RKHS hold for the constructed statistic.
Invoked to obtain the limiting distribution under null and alternative.
domain assumption: Suitable overlap conditions on the covariate distributions hold.
Required for the population discrepancy to characterize equality of conditional distributions.


[19] [19]

Annales de l'Institut Henri Poincare (B) Probability and Statistics , volume=

Rates of strong uniform consistency for multivariate kernel density estimators , author=. Annales de l'Institut Henri Poincare (B) Probability and Statistics , volume=. 2002 , organization=

2002

[20] [20]

The Annals of Probability , pages=

Some limit theorems for empirical processes , author=. The Annals of Probability , pages=. 1984 , publisher=

1984

[21] [21]

Concentration inequalities and asymptotic results for ratio type empirical processes , author=

[22] [22]

Journal of Theoretical Probability , volume=

Uniform and universal Glivenko-Cantelli classes , author=. Journal of Theoretical Probability , volume=. 1991 , publisher=

1991

[23] [23]

The Annals of Probability , pages=

Bootstrapping general empirical measures , author=. The Annals of Probability , pages=. 1990 , publisher=

1990

[24] [24]

Bernoulli , pages=

Empirical processes and applications: an overview , author=. Bernoulli , pages=. 1996 , publisher=

1996

[25] [25]

1996 , publisher=

Weak Convergence and Empirical Processes: With Applications to Statistics , author=. 1996 , publisher=

1996

[26] [26]

The Annals of Probability , volume=

Laws of the iterated logarithm for censored data , author=. The Annals of Probability , volume=. 1999 , publisher=

1999

[27] [27]

Lecture Notes, Columbia University , volume=

A gentle introduction to empirical process theory and applications , author=. Lecture Notes, Columbia University , volume=

[28] [28]

Conference On Learning Theory , pages=

Approximation beats concentration? An approximation view on inference with smooth radial kernels , author=. Conference On Learning Theory , pages=. 2018 , organization=

2018

[29] [29]

Probability Theory and Related Fields , volume=

Comparison and anti-concentration bounds for maxima of Gaussian random vectors , author=. Probability Theory and Related Fields , volume=. 2015 , publisher=

2015

[30] [30]

The Annals of Statistics , volume=

Anti-concentration and honest, adaptive confidence bands , author=. The Annals of Statistics , volume=. 2014 , publisher=

2014

[31] [31]

The Annals of Statistics , volume=

Gaussian approximation of suprema of empirical processes , author=. The Annals of Statistics , volume=

[32] [32]

Inventiones mathematicae , volume=

Entropy and the combinatorial dimension , author=. Inventiones mathematicae , volume=. 2003 , publisher=

2003

[33] [33]

Journal of the ACM (JACM) , volume=

Scale-sensitive dimensions, uniform convergence, and learnability , author=. Journal of the ACM (JACM) , volume=. 1997 , publisher=

1997

[34] [34]

, author=

Universal Kernels. , author=. Journal of Machine Learning Research , volume=

[35] [35]

Inventiones mathematicae , volume=

New concentration inequalities in product spaces , author=. Inventiones mathematicae , volume=. 1996 , publisher=

1996

[36] [36]

The Annals of Probability , pages=

Sharper bounds for Gaussian and empirical processes , author=. The Annals of Probability , pages=. 1994 , publisher=

1994

[37] [37]

Journal of Theoretical Probability , volume=

A note on conditional versus joint unconditional weak convergence in bootstrap consistency results , author=. Journal of Theoretical Probability , volume=. 2019 , publisher=

2019

[38] [38]

, author=

Universality, Characteristic Kernels and RKHS Embedding of Measures. , author=. Journal of Machine Learning Research , volume=

[39] [39]

The Journal of Machine Learning Research , volume=

Universal multi-task kernels , author=. The Journal of Machine Learning Research , volume=. 2008 , publisher=

2008

[40] [40]

The Journal of Machine Learning Research , volume=

A kernel two-sample test , author=. The Journal of Machine Learning Research , volume=. 2012 , publisher=

2012

[41] [41]

2008 , publisher=

Introduction to Empirical Processes and Semiparametric Inference , author=. 2008 , publisher=

2008

[42] [42]

2002 , pages =

Foundations of Modern Probability , author =. 2002 , pages =. doi:10.1007/978-1-4757-4015-8 , url =

work page doi:10.1007/978-1-4757-4015-8 2002

[43] [43]

Annales de l'IHP Probabilit

Exchangeable random measures , author=. Annales de l'IHP Probabilit

[44] [44]

arXiv preprint arXiv:1505.03906 , year=

Training generative neural networks via maximum mean discrepancy optimization , author=. arXiv preprint arXiv:1505.03906 , year=

Pith/arXiv arXiv

[45] [45]

International conference on machine learning , pages=

Generative moment matching networks , author=. International conference on machine learning , pages=. 2015 , organization=

2015

[46] [46]

On gradient regularizers for

Arbel, Michael and Sutherland, Danica J and Bi. On gradient regularizers for. Advances in neural information processing systems , volume=

[47] [47]

Mroueh, Youssef and Li, Chun-Liang and Sercu, Tom and Raj, Anant and Cheng, Yu , booktitle=. Sobolev

[48] [48]

Journal of Machine Learning Research , volume=

How well generative adversarial networks learn distributions , author=. Journal of Machine Learning Research , volume=

[49] [49]

Minimax distribution estimation in

Singh, Shashank and P. Minimax distribution estimation in. arXiv preprint arXiv:1802.08855 , year=

arXiv

[50] [50]

Approximability of Discriminators Implies Diversity in

Bai, Yu and Ma, Tengyu and Risteski, Andrej , booktitle=. Approximability of Discriminators Implies Diversity in

[51] [51]

Estimation of smooth densities in

Weed, Jonathan and Berthet, Quentin , booktitle=. Estimation of smooth densities in. 2019 , organization=

2019

[52] [52]

International Conference on Machine Learning , pages=

Sgd learns one-layer networks in wgans , author=. International Conference on Machine Learning , pages=. 2020 , organization=

2020

[53] [53]

arXiv preprint arXiv:2002.03938 , year=

Distribution approximation and statistical estimation guarantees of generative adversarial networks , author=. arXiv preprint arXiv:2002.03938 , year=

arXiv 2002

[54] [55]

Journal of statistical planning and inference , volume=

Improving predictive inference under covariate shift by weighting the log-likelihood function , author=. Journal of statistical planning and inference , volume=. 2000 , publisher=

2000

[55] [56]

, author=

Covariate shift adaptation by importance weighted cross validation. , author=. Journal of Machine Learning Research , volume=

[56] [57]

Advances in neural information processing systems , volume=

Conformal prediction under covariate shift , author=. Advances in neural information processing systems , volume=

[57] [58]

Journal of Machine Learning Research , volume=

Augmented transfer regression learning with semi-non-parametric nuisance models , author=. Journal of Machine Learning Research , volume=

[58] [59]

Journal of machine learning research , volume=

An error analysis of generative adversarial networks for learning distributions , author=. Journal of machine learning research , volume=

[59] [60]

Journal of the American Statistical Association , volume=

Bootstrap test for difference between means in nonparametric regression , author=. Journal of the American Statistical Association , volume=. 1990 , publisher=

1990

[60] [61]

Journal of the American Statistical Association , volume=

Comparison of regression curves using quasi-residuals , author=. Journal of the American Statistical Association , volume=. 1995 , publisher=

1995

[61] [62]

Journal of the American Statistical Association , volume=

Smoothing parameter selection for power optimality in testing of regression curves , author=. Journal of the American Statistical Association , volume=. 1997 , publisher=

1997

[62] [63]

Journal of the American Statistical Association , volume=

Test of significance when data are curves , author=. Journal of the American Statistical Association , volume=. 1998 , publisher=

1998

[63] [64]

The Annals of Statistics , volume=

Nonparametric comparison of regression curves: an empirical process approach , author=. The Annals of Statistics , volume=. 2003 , publisher=

2003

[64] [65]

Econometrica: Journal of the Econometric Society , pages=

A conditional Kolmogorov test , author=. Econometrica: Journal of the Econometric Society , pages=. 1997 , publisher=

1997

[65] [66]

Econometric Theory , volume=

A consistent test of conditional parametric distributions , author=. Econometric Theory , volume=. 2000 , publisher=

2000

[66] [67]

Econometric Theory , volume=

A nonparametric bootstrap test of conditional distributions , author=. Econometric Theory , volume=. 2006 , publisher=

2006

[67] [68]

Journal of Econometrics , volume=

Distribution-free specification tests of conditional models , author=. Journal of Econometrics , volume=. 2008 , publisher=

2008

[68] [69]

wiley interdisciplinary reviews: Computational statistics , volume=

Energy distance , author=. wiley interdisciplinary reviews: Computational statistics , volume=. 2016 , publisher=

2016

[69] [70]

arXiv preprint arXiv:1411.1784 , year=

Conditional generative adversarial nets , author=. arXiv preprint arXiv:1411.1784 , year=

Pith/arXiv arXiv

[70] [71]

arXiv preprint arXiv:1511.06434 , year=

Unsupervised representation learning with deep convolutional generative adversarial networks , author=. arXiv preprint arXiv:1511.06434 , year=

Pith/arXiv arXiv

[71] [72]

arXiv preprint arXiv:1312.6114 , year=

Auto-encoding variational bayes , author=. arXiv preprint arXiv:1312.6114 , year=

Pith/arXiv arXiv

[72] [73]

Advances in neural information processing systems , volume=

Denoising diffusion probabilistic models , author=. Advances in neural information processing systems , volume=

[73] [74]

Neural information processing: 20th international conference, ICONIP 2013, daegu, korea, november 3-7, 2013

Challenges in representation learning: A report on three machine learning contests , author=. Neural information processing: 20th international conference, ICONIP 2013, daegu, korea, november 3-7, 2013. Proceedings, Part III 20 , pages=. 2013 , organization=

2013

[74] [75]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Stargan v2: Diverse image synthesis for multiple domains , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

[75] [76]

arXiv preprint arXiv:1812.11806 , year=

An introduction to domain adaptation and transfer learning , author=. arXiv preprint arXiv:1812.11806 , year=

Pith/arXiv arXiv

[76] [78]

arXiv preprint arXiv:2210.08149 , year=

Distance and kernel-based measures for global and local two-sample conditional distribution testing , author=. arXiv preprint arXiv:2210.08149 , year=

arXiv

[77] [79]

arXiv preprint arXiv:2410.16636 , year=

General frameworks for conditional two-sample testing , author=. arXiv preprint arXiv:2410.16636 , year=

Pith/arXiv arXiv

[78] [80]

1994 , publisher=

Mixture density networks , author=. 1994 , publisher=

1994

[79] [81]

and Li, R

Cai, Z. and Li, R. and Zhang, Y. , title =. Journal of Machine Learning Research , year =

[80] [82]

, title =

Chatterjee, Anirban and Niu, Ziang and Bhattacharya, Bhaswar B. , title =. arXiv preprint , year =. 2407.16550 , note =

arXiv