arXiv: 2605.10206 · v1 · submitted 2026-05-11 · 🧮 math.ST · cs.LG · stat.ML · stat.TH

Extended Wasserstein-GAN Approach to Causal Distribution Learning: Density-Free Estimation and Minimax Optimality

Shu Tamano, Masaaki Imaizumi

Pith reviewed 2026-05-12 05:00 UTC · model grok-4.3

keywords: Wasserstein GAN · causal distribution estimation · minimax optimality · Besov spaces · interventional distributions · counterfactual inference · density-free estimation

The pith

GANICE estimates conditional interventional distributions by minimizing averaged Wasserstein risk and proves minimax optimality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes GANICE as a GAN-based method for distributional causal inference. It identifies the conditional interventional distribution of outcomes given each treatment and covariate combination as the precise target quantity. The approach minimizes the averaged Wasserstein risk of this distribution using an extended Wasserstein distance and a cellwise critic in the dual formulation. This construction avoids density estimation or ratio methods and yields minimax optimality over Besov spaces. A reader would care because full distributional estimates support quantile and tail risk calculations that average treatment effect methods cannot provide.

Core claim

GANICE clarifies the conditional interventional distribution for each treatment-covariate state as the causal estimation target. It estimates the conditional distribution such that its averaged Wasserstein risk is minimized. The method achieves these properties through the introduction of the extended Wasserstein distance, the incorporation of a cellwise critic in its dual, and an optimality proof based on Besov space theory.

What carries the argument

The extended Wasserstein distance with cellwise critic in its dual, which directly minimizes averaged risk for conditional interventional distributions without density estimation.
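
As a rough illustration of what an averaged Wasserstein risk measures, the sketch below computes a 1-Wasserstein distance per cell and averages the results with cell weights. It is not the paper's extended Wasserstein distance, whose definition is not reproduced on this page. The function names, the cell labels, and the restriction to one-dimensional outcomes with equal sample sizes per cell are all assumptions made for the example.

```python
import numpy as np

def w1_equal_size(x, y):
    """1-Wasserstein distance between two 1-D empirical samples of equal size.

    For equal-size samples on the real line, this equals the mean absolute
    difference between the sorted values.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes per cell")
    return float(np.mean(np.abs(x - y)))

def averaged_w1(true_by_cell, est_by_cell, weights):
    """Weighted average of per-cell W1 distances (weights are normalized)."""
    total = sum(weights.values())
    return sum(
        weights[c] / total * w1_equal_size(true_by_cell[c], est_by_cell[c])
        for c in true_by_cell
    )

# Hypothetical cells keyed by (treatment, covariate bin).
true_by_cell = {(0, "low"): [0.0, 1.0, 2.0], (1, "low"): [0.0, 1.0, 2.0]}
est_by_cell = {(0, "low"): [1.0, 2.0, 3.0], (1, "low"): [2.0, 0.0, 1.0]}
weights = {(0, "low"): 1.0, (1, "low"): 3.0}

risk = averaged_w1(true_by_cell, est_by_cell, weights)
print(risk)
```

In this toy input the first cell is shifted by one unit and the second cell matches exactly after sorting, so only the first cell contributes to the average.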

If this is right

The estimator consistently recovers full outcome distributions including quantiles and tail probabilities under interventions.
It provides theoretical minimax optimality guarantees that prior GAN-based causal methods lacked.
Experiments show consistent outperformance over existing density-reliant GAN approaches for counterfactual estimation.
The method supplies a density-free route to policy-dependent uncertainty quantification in causal settings.
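
To make the first point concrete, the following sketch shows how quantiles and tail probabilities can be read from samples of an outcome distribution, and why they can differ even when the means agree. The two normal samples stand in for draws from an estimated interventional distribution under two treatments. They are synthetic, and the function name, the quantile level, and the threshold are assumptions chosen for illustration.

```python
import numpy as np

def distribution_summaries(samples, q=0.9, threshold=2.0):
    """Mean, q-quantile, and tail probability P(Y > threshold) from samples."""
    samples = np.asarray(samples, dtype=float)
    return {
        "mean": float(samples.mean()),
        "quantile": float(np.quantile(samples, q)),
        "tail_prob": float((samples > threshold).mean()),
    }

rng = np.random.default_rng(0)
# Synthetic stand-ins for generated outcomes under control and treatment:
# same mean, different spread.
control = rng.normal(loc=0.0, scale=1.0, size=5000)
treated = rng.normal(loc=0.0, scale=2.0, size=5000)

s0 = distribution_summaries(control)
s1 = distribution_summaries(treated)
print("mean difference:", s1["mean"] - s0["mean"])
print("tail probabilities:", s0["tail_prob"], s1["tail_prob"])
```

The mean difference is close to zero here, while the tail probability above the threshold is clearly larger for the wider distribution. An average treatment effect alone would not reveal that difference.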

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The cellwise critic structure may scale to high-dimensional covariates by partitioning the covariate space more finely.
The optimality result could extend to other smoothness classes beyond Besov spaces if the proof technique generalizes.
Practitioners could integrate the full distributional output into downstream decision rules that optimize risk measures rather than means.
The framework might combine with longitudinal or time-varying treatment settings to track evolving conditional distributions.

Load-bearing premise

The conditional interventional distributions belong to Besov spaces and standard causal assumptions such as no unmeasured confounding hold for identifiability.

What would settle it

A simulation where the true distributions lie in a known Besov space but the GANICE estimator fails to attain the minimax convergence rate, or where performance collapses after introducing unmeasured confounding.
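
A simulation of this kind usually fits an empirical convergence exponent and compares it with the theoretical rate. The sketch below shows only the mechanics. It uses the raw empirical distribution of Uniform(0, 1) draws as a stand-in estimator, because its 1-Wasserstein error in one dimension is known to decay at roughly n^(-1/2). It does not implement GANICE, and the Besov minimax rate is not stated on this page, so nothing here tests the paper's claim. All names and settings are assumptions.

```python
import numpy as np

def w1_to_uniform(sample):
    """Approximate W1 between a sample's empirical law and Uniform(0, 1).

    Compares the sorted values with the midpoint quantiles (i - 0.5) / n,
    which approximates the integral of |F_n^{-1}(u) - u| over (0, 1).
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    mid = (np.arange(1, n + 1) - 0.5) / n
    return float(np.mean(np.abs(x - mid)))

def empirical_rate(sample_sizes, n_rep=50, seed=0):
    """Slope of log(mean error) against log(n), fitted by least squares."""
    rng = np.random.default_rng(seed)
    mean_err = [
        np.mean([w1_to_uniform(rng.uniform(size=n)) for _ in range(n_rep)])
        for n in sample_sizes
    ]
    slope, _intercept = np.polyfit(np.log(sample_sizes), np.log(mean_err), 1)
    return float(slope)

slope = empirical_rate([100, 400, 1600, 6400])
print(f"fitted exponent: {slope:.2f}")
```

To use the same pattern for the check described above, one would replace the stand-in with the estimator under study and compare the fitted exponent with the rate proved in the paper.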

Figures

Figures reproduced from arXiv: 2605.10206 by Masaaki Imaizumi, Shu Tamano.

**Figure 2.** Predictive interval widths across nominal coverage levels. Widths should be interpreted together …

**Figure 3.** Probability integral transform diagnostics. For calibrated predictive distributions, PIT histograms …

**Figure 4.** Jobs randomized-arm CDF diagnostics. The panels compare model-implied interventional arm …

**Figure 5.** Objective ablation on IHDP. The full method is compared with variants that remove cell …
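
For readers unfamiliar with the probability integral transform (PIT) diagnostic named in the Figure 3 caption, here is a minimal sketch of the general technique. For each observed outcome, the PIT value is the predictive CDF evaluated at that outcome, estimated here as the fraction of predictive samples at or below it. Under calibration these values are approximately uniform on [0, 1]. The data are synthetic and the function name is hypothetical. This is not the paper's evaluation code.

```python
import numpy as np

def pit_values(y_obs, pred_samples):
    """Empirical PIT: share of predictive samples <= each observed outcome.

    y_obs has shape (n,); pred_samples has shape (n, m), one row of
    predictive draws per observation.
    """
    y_obs = np.asarray(y_obs, dtype=float)
    pred_samples = np.asarray(pred_samples, dtype=float)
    return (pred_samples <= y_obs[:, None]).mean(axis=1)

rng = np.random.default_rng(0)
n, m = 2000, 500
y = rng.normal(size=n)                              # outcomes ~ N(0, 1)
calibrated = rng.normal(size=(n, m))                # predictive N(0, 1)
overconfident = rng.normal(scale=0.5, size=(n, m))  # predictive too narrow

hist_cal, _ = np.histogram(pit_values(y, calibrated), bins=10, range=(0, 1))
hist_over, _ = np.histogram(pit_values(y, overconfident), bins=10, range=(0, 1))
print("calibrated:   ", hist_cal)
print("overconfident:", hist_over)
```

The calibrated histogram is roughly flat, while the overconfident one piles up in the first and last bins, the U-shape that signals predictive distributions that are too narrow.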

Original abstract

Distributional causal inference requires estimating not only average treatment effects but also interventional outcome distributions, including quantiles, tail risks, and policy-dependent uncertainty. As a method for distributional causal inference, generative adversarial network (GAN)-based counterfactual methods are flexible tools for this task. However, these methods have several limitations. First, the objectives of certain techniques do not coincide with the statistical risk of the identifiable causal target, and therefore provide limited theoretical guarantees regarding estimable counterfactual distributions or optimality. Second, they tend to rely on unstable density-based methods, such as density ratio estimation. In this paper, we propose GANICE (GAN for Interventional Conditional Estimation) with several advantages: it (i) clarifies the conditional interventional distribution for each treatment-covariate state as the causal estimation target; (ii) estimates the conditional distribution such that its averaged Wasserstein risk is minimized; (iii) establishes minimax optimality. GANICE achieves these advantages through the introduction of the extended Wasserstein distance, the incorporation of a cellwise critic in its dual, and an optimality proof based on Besov space theory. Our experiments demonstrate that GANICE consistently outperforms existing methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

GANICE gives a Wasserstein GAN for estimating conditional interventional distributions by directly minimizing averaged Wasserstein risk, with a minimax optimality claim that turns on whether the cellwise critic preserves Besov rates.

The main takeaway is that this paper sets up a GAN called GANICE to estimate the full conditional interventional distribution for each treatment-covariate combination, with the training objective aligned to average Wasserstein risk and a proof of minimax optimality under Besov regularity via an extended Wasserstein distance and cellwise critic in the dual. That combination is the concrete new piece beyond prior GAN counterfactual work, which often optimized something else or relied on density ratios. The paper does well by making the statistical target explicit and by avoiding density estimation, which removes one source of instability in earlier methods. The experiments are presented as showing consistent gains, which is useful if the setups hold up.

The soft spot is the optimality argument. The cellwise critic must match the approximation rates of the standard Kantorovich dual over Besov balls; any extra partitioning error would loosen the upper bound relative to the lower bound, and the stress-test note flags exactly that risk. The proof draws on external Besov theory, which is independent grounding, but the extension step needs to be checked line by line. Identifiability also requires the usual no-unmeasured-confounding assumption plus the target lying in Besov spaces, both standard but limiting in practice.

This work is for people in causal statistics and ML who care about full distributional estimates rather than just means. A reader looking for a method with explicit risk alignment and rate claims will find the framing and proof strategy worth examining. It deserves a serious referee because the target and objective are cleanly stated and the technical ingredients are non-routine, even though the rate preservation step is the part that needs referee scrutiny. I would send it to review and ask the referees to focus on the cellwise critic construction and the exact rate calculation.

Referee Report

3 major / 2 minor

Summary. The paper proposes GANICE, a Wasserstein-GAN variant for distributional causal inference. It identifies the conditional interventional distribution (given treatment and covariates) as the target, estimates it by minimizing an averaged Wasserstein risk via a novel extended Wasserstein distance whose dual employs a cellwise critic, and proves minimax optimality of the resulting estimator under Besov-space regularity assumptions on the conditional distributions. The method is density-free and is shown in experiments to outperform prior GAN-based counterfactual approaches.

Significance. If the optimality result is fully rigorous, the work supplies a theoretically grounded, risk-aligned alternative to existing GAN counterfactual estimators that often optimize mismatched objectives or rely on density ratios. The explicit use of the extended Wasserstein distance and cellwise critic to achieve minimax rates over Besov balls would constitute a non-trivial technical contribution to the intersection of causal inference and distribution estimation.

major comments (3)

[optimality proof / Besov-space argument] The minimax-optimality argument (presumably in the section containing the Besov-space proof) asserts that the cellwise critic attains the same approximation rates as the standard Kantorovich dual. However, the partitioning into cells necessarily introduces an additional discretization error whose dependence on cell size and on the Besov smoothness index is not shown to be negligible relative to the lower bound; without an explicit bound on this term the upper bound may fail to match the lower bound.
[definition of extended Wasserstein distance and its dual] The definition of the extended Wasserstein distance is constructed so that its expectation equals the averaged Wasserstein risk of the conditional interventional distributions. It is not immediately clear from the dual formulation whether this equality continues to hold exactly once the critic is restricted to be cellwise; any mismatch would break the alignment between the GAN objective and the statistical risk that the paper claims.
[identifiability and causal assumptions] The identifiability step relies on standard no-unmeasured-confounding and positivity assumptions to equate the extended distance to the causal target. The manuscript should state explicitly whether these assumptions are also used to guarantee that the cellwise critic can be realized by a neural network without further approximation error that would degrade the rate.

minor comments (2)

[experiments] The experimental section would benefit from a precise description of how the averaged Wasserstein risk is estimated on held-out data (e.g., number of Monte-Carlo samples per cell, choice of ground metric).
[method] Notation for the cellwise critic (indicator functions or partition indicators) should be introduced once and used consistently to avoid ambiguity when the dual is written.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major comment below, indicating where revisions will be made to clarify and strengthen the technical arguments.

Referee: The minimax-optimality argument (presumably in the section containing the Besov-space proof) asserts that the cellwise critic attains the same approximation rates as the standard Kantorovich dual. However, the partitioning into cells necessarily introduces an additional discretization error whose dependence on cell size and on the Besov smoothness index is not shown to be negligible relative to the lower bound; without an explicit bound on this term the upper bound may fail to match the lower bound.

Authors: We agree that an explicit bound on the discretization error is required for a fully rigorous matching of upper and lower bounds. In the revised manuscript we will insert a new auxiliary lemma that quantifies the discretization error as a function of cell diameter and the Besov smoothness index. We will then select the cell size (as a function of sample size) so that this term is of strictly lower order than the minimax rate, ensuring the upper bound continues to match the lower bound.
Revision: yes
Referee: The definition of the extended Wasserstein distance is constructed so that its expectation equals the averaged Wasserstein risk of the conditional interventional distributions. It is not immediately clear from the dual formulation whether this equality continues to hold exactly once the critic is restricted to be cellwise; any mismatch would break the alignment between the GAN objective and the statistical risk that the paper claims.

Authors: The equality is preserved exactly under the cellwise restriction. Because the extended distance is an integral over the covariate space and the cells form a partition, the dual objective separates across cells; optimizing the cellwise critic on each cell recovers the same value as the unrestricted dual. We will add a short proposition in the revised version that formally verifies this equality holds with no mismatch.
Revision: yes
Referee: The identifiability step relies on standard no-unmeasured-confounding and positivity assumptions to equate the extended distance to the causal target. The manuscript should state explicitly whether these assumptions are also used to guarantee that the cellwise critic can be realized by a neural network without further approximation error that would degrade the rate.

Authors: The no-unmeasured-confounding and positivity assumptions are used only to identify the conditional interventional distribution as the target; they play no role in the neural-network approximation analysis. The approximation error of the cellwise critic by neural networks is controlled separately via standard Besov-space approximation results for neural networks. We will revise the relevant sections to separate these two arguments explicitly and to state that the NN approximation rate does not depend on the causal assumptions beyond identifiability.
Revision: partial
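
The separation argument in the second response can be written out in a simplified form. The notation below is introduced here for illustration and is not taken from the paper: the cells, the cell-level laws, and the 1-Lipschitz critic class are assumptions, and the outcome laws are assumed to have finite first moments.

```latex
% Notation (hypothetical, for illustration only):
%   C_1, ..., C_K : cells partitioning the treatment-covariate space
%   p_k           : probability of cell C_k, with p_k >= 0
%   P_k, Q_k      : target and generated outcome laws conditional on cell C_k
%   f_k           : critic used on cell C_k, 1-Lipschitz in the outcome y
\[
\sup_{f_1,\dots,f_K \in \mathrm{Lip}_1}
  \sum_{k=1}^{K} p_k \Bigl( \mathbb{E}_{P_k}[f_k(Y)] - \mathbb{E}_{Q_k}[f_k(Y)] \Bigr)
= \sum_{k=1}^{K} p_k \sup_{f_k \in \mathrm{Lip}_1}
  \Bigl( \mathbb{E}_{P_k}[f_k(Y)] - \mathbb{E}_{Q_k}[f_k(Y)] \Bigr)
= \sum_{k=1}^{K} p_k \, W_1(P_k, Q_k).
\]
```

The first equality holds because each critic appears in exactly one term and the weights are nonnegative. The second is Kantorovich-Rubinstein duality applied cell by cell. Note that the right-hand side compares cell-level laws. Whether it coincides with the paper's averaged risk depends on how the extended distance and the cells are defined in the manuscript, which this page does not reproduce.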

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external Besov theory and explicit definitions

The paper defines an extended Wasserstein distance and cellwise critic explicitly to target the averaged Wasserstein risk of conditional interventional distributions, then invokes standard Besov space approximation theory for the minimax proof. This chain does not reduce any prediction or optimality claim to a fitted parameter or self-referential definition by construction. No load-bearing step collapses to renaming a known result or to an unverified self-citation chain; the identifiability assumptions (no unmeasured confounding) are stated separately from the distance construction. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The central claim rests on standard mathematical theory for the optimality proof and introduces new components for the estimation procedure; no free parameters are mentioned.

axioms (1)

[standard math] Besov space theory applies to establish minimax optimality of the estimator.
Invoked for the optimality proof as stated in the abstract.

invented entities (2)

extended Wasserstein distance [no independent evidence]
purpose: To define the risk measure for estimating conditional interventional distributions in the GAN objective
Introduced as a key technical component to achieve density-free estimation and optimality.
cellwise critic [no independent evidence]
purpose: Incorporated in the dual formulation to handle conditional aspects of the distribution estimation
New element added to the GAN architecture for the causal task.

pith-pipeline@v0.9.0 · 5513 in / 1430 out tokens · 80514 ms · 2026-05-12T05:00:00.475582+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: extended Wasserstein distance ... diagonal admissible couplings ... cellwise outcome-Lipschitz critics ... Besov control of discontinuous critics

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Paper passage: anisotropic dyadic partition ... finite-resolution critic class F(m)1,0

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
