Targeted Regularization for Causal Effect Estimation with Exponential Dispersion Family Outcomes

Enzheng Hua; Jiahong Li; Jiecheng Guo; Jixing Xu; Peng Zhen; Zeqin Yang; Zhichao Zou

arxiv: 2502.07295 · v2 · pith:S6MTHPJWnew · submitted 2025-02-11 · 💻 cs.LG

Jiahong Li, Zeqin Yang, Jixing Xu, Enzheng Hua, Zhichao Zou, Peng Zhen, Jiecheng Guo

Pith reviewed 2026-05-25 08:23 UTC · model grok-4.3

keywords: causal effect estimation · targeted regularization · neural networks · exponential dispersion family · von Mises expansion · average dose function · double robustness · semiparametric estimation

The pith

A targeted regularization framework derived from von Mises expansions corrects first-order bias for causal effect estimation with Exponential Dispersion Family outcomes in neural networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a unified targeted regularization approach to extend desirable semiparametric properties to neural network estimators of causal effects when outcomes belong to the Exponential Dispersion Family. It begins by deriving the von Mises expansion of the average dose function of canonical functions for discrete treatments and the sieve-projected version for continuous treatments. The expansion supplies the form of a regularization term that corrects first-order bias at the distributional level. This term is added to the training objective of a neural network that simultaneously learns the outcome regression, the propensity score, and a fluctuation parameter.
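As a rough illustration of what an Exponential Dispersion Family outcome means operationally, the sketch below writes three common members in canonical form, where the density is proportional to exp((y * theta - kappa(theta)) / phi) and the mean is the derivative of the cumulant function kappa. The names `CUMULANTS`, `mean_from_theta`, and `edf_nll` are illustrative assumptions and do not come from the paper.

```python
import math

# Cumulant functions kappa(theta) for three EDF members in canonical form.
# (Illustrative names; not taken from the paper.)
CUMULANTS = {
    "gaussian": lambda th: 0.5 * th * th,              # mean = theta
    "bernoulli": lambda th: math.log1p(math.exp(th)),  # mean = sigmoid(theta)
    "poisson": lambda th: math.exp(th),                # mean = exp(theta)
}

def mean_from_theta(family, theta, h=1e-5):
    """Mean mu = kappa'(theta), approximated by a central difference."""
    kappa = CUMULANTS[family]
    return (kappa(theta + h) - kappa(theta - h)) / (2.0 * h)

def edf_nll(family, y, theta, phi=1.0):
    """Negative log-likelihood, dropping the term that does not depend on theta."""
    return (CUMULANTS[family](theta) - y * theta) / phi
```

Under this parameterization a single loss template, kappa(theta) - y * theta, covers continuous, binary, and count outcomes, which is one way to read the "unified" framing in the summary above.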

Core claim

We propose a unified targeted regularization framework for the Exponential Dispersion Family (EDF) to address this limitation. Specifically, we first derive the von Mises expansion of the average dose function of canonical functions (ADCF) for discrete treatments and of the sieve-projected ADCF for continuous treatments. Second, we use this expansion to construct a unified targeted regularization, that corrects first-order bias at the distributional level. We integrate this objective into a NN architecture that jointly estimates the outcome model, propensity score model, and fluctuation parameter end-to-end.

What carries the argument

The von Mises expansion of the average dose function of canonical functions (ADCF) or its sieve-projected counterpart, which supplies the explicit form of the targeted regularization term that corrects first-order bias.
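For readers unfamiliar with the tool, the generic first-order von Mises expansion of a smooth functional can be written as below. This is the standard textbook form, instantiated for the mean potential outcome under a discrete treatment level t; it is not the paper's ADCF-specific expansion, and the symbols \psi, \varphi, \mu, \pi, and R_2 are notation assumed here.

```latex
\psi(\bar P) - \psi(P)
  = \int \varphi(z; \bar P)\, d(\bar P - P)(z) + R_2(\bar P, P),
\qquad
\varphi(Z; P)
  = \frac{\mathbf{1}\{T = t\}}{\pi(t \mid X)}\bigl(Y - \mu(X, t)\bigr)
    + \mu(X, t) - \psi(P).
```

Because \varphi(\cdot; \bar P) has mean zero under \bar P, the first term equals -\int \varphi(z; \bar P)\, dP(z). Adding an empirical estimate of that mean to the plug-in value removes the first-order bias and leaves only the second-order remainder R_2.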

If this is right

The method applies to binary, count, and other non-continuous Exponential Dispersion Family outcomes in addition to continuous ones.
The neural network jointly optimizes the outcome model, propensity score model, and fluctuation parameter in a single end-to-end training procedure.
The resulting estimator inherits first-order bias correction at the distributional level and the associated semiparametric convergence properties.
Double robustness holds when either the outcome model or the propensity score model is correctly specified.
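The double robustness property in the last item can be checked numerically with a small simulation. The sketch below uses the standard augmented inverse-propensity-weighted (AIPW) estimator of the average treatment effect on a binary outcome, not the paper's neural estimator. The data-generating process, the constant 0.5 used as a deliberately wrong nuisance model, and all function names are assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def simulate(n, seed=0):
    """Binary outcome with confounding: x drives both treatment and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = sigmoid(x)                         # true propensity score
        t = 1 if rng.random() < e else 0
        mu0 = sigmoid(-0.5 + 1.5 * x)          # true E[Y | x, t=0]
        mu1 = sigmoid(0.5 + 1.5 * x)           # true E[Y | x, t=1]
        y = 1 if rng.random() < (mu1 if t else mu0) else 0
        rows.append((y, t, mu0, mu1, e))
    return rows

def aipw(rows, use_true_outcome, use_true_propensity):
    """AIPW estimate of the ATE; a 'wrong' nuisance is replaced by the constant 0.5."""
    total = 0.0
    for y, t, mu0, mu1, e in rows:
        m0, m1 = (mu0, mu1) if use_true_outcome else (0.5, 0.5)
        p = e if use_true_propensity else 0.5
        total += m1 - m0 + t * (y - m1) / p - (1 - t) * (y - m0) / (1.0 - p)
    return total / len(rows)

rows = simulate(100_000)
# Sample-average treatment effect computed from the true outcome model.
truth = sum(mu1 - mu0 for _, _, mu0, mu1, _ in rows) / len(rows)
outcome_only = aipw(rows, True, False)     # correct outcome model, wrong propensity
propensity_only = aipw(rows, False, True)  # wrong outcome model, correct propensity
both_wrong = aipw(rows, False, False)      # both nuisances wrong
```

In this setup `outcome_only` and `propensity_only` should both land close to `truth`, while `both_wrong` reduces to a confounded difference in means and should not.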

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same expansion-based construction could be examined for outcome families outside the Exponential Dispersion Family.
Empirical tests on count-valued data from health or social domains would reveal whether the joint estimation procedure scales to realistic sample sizes.

Load-bearing premise

The von Mises expansion of the ADCF can be turned into a regularization penalty inside the neural network loss whose joint optimization produces the claimed first-order bias correction.
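To make the premise concrete, here is a minimal sketch of what a targeted-regularization loss can look like for a binary outcome. It follows the general pattern of Shi et al. (2019) with the fluctuation applied on the canonical (logit) scale, as in logistic TMLE updates. The paper's exact loss is not given on this page, so the weight `h`, the hyperparameters `alpha` and `beta`, and every function name below are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bernoulli_nll(y, theta):
    """Canonical-form Bernoulli NLL: kappa(theta) - y * theta."""
    return math.log1p(math.exp(theta)) - y * theta

def joint_loss(batch, eps, alpha=1.0, beta=1.0):
    """Outcome NLL + alpha * propensity cross-entropy + beta * targeted term.

    Each batch row is (y, t, theta, g): outcome, binary treatment, the network's
    canonical-scale outcome prediction at the observed treatment, and its
    propensity prediction. The targeted term re-scores the outcome after
    shifting theta by eps * h, with h = t/g - (1-t)/(1-g) (hypothetical choice).
    """
    total = 0.0
    for y, t, theta, g in batch:
        h = t / g - (1 - t) / (1.0 - g)
        outcome = bernoulli_nll(y, theta)
        propensity = -(t * math.log(g) + (1 - t) * math.log(1.0 - g))
        targeted = bernoulli_nll(y, theta + eps * h)
        total += outcome + alpha * propensity + beta * targeted
    return total / len(batch)

def eps_score(batch, eps):
    """Analytic d(targeted term)/d(eps): mean of -h * (y - sigmoid(theta + eps * h))."""
    s = 0.0
    for y, t, theta, g in batch:
        h = t / g - (1 - t) / (1.0 - g)
        s += -h * (y - sigmoid(theta + eps * h))
    return s / len(batch)

# Toy batch with made-up predictions, for illustration only.
batch = [(1, 1, 0.3, 0.6), (0, 1, -0.2, 0.7), (1, 0, 0.1, 0.4), (0, 0, -0.5, 0.3)]
```

With `beta=1`, the derivative of `joint_loss` with respect to `eps` equals `eps_score`. Driving that derivative to zero enforces an empirical estimating equation of the form "weighted residuals average to zero", which is the mechanism by which a regularizer of this kind is meant to target first-order bias.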

What would settle it

A simulation in which the proposed regularization term fails to reduce first-order bias of the causal effect estimator relative to an unregularized neural network on Exponential Dispersion Family outcomes would falsify the central claim.

Figures

Figures reproduced from arXiv: 2502.07295 by Enzheng Hua, Jiahong Li, Jiecheng Guo, Jixing Xu, Peng Zhen, Zeqin Yang, Zhichao Zou.

**Figure 1.** Network architecture. The surrounding text captured with the caption says that the paper combines the estimates µ̂ and π̂ into a doubly robust estimator, and that its Section 5.1 derives the von Mises expansion of the ADCF, which enables a doubly robust estimator by removing the estimated first-order bias.

**Figure 2.** Sensitivity analysis on simulation data in the binary treatment setting. The surrounding text captured with the caption describes a training (67%), validation (23%), and test (10%) split, with the validation set used for hyperparameter selection and early stopping, and 5 replications per dataset to report the mean and standard deviation of each metric on the test set.

Original abstract

Neural Networks (NNs) for causal effect estimation have shown strong empirical performance, yet endowing them with desirable semiparametric properties -- doubly robustness and fast convergence rates -- remains challenging. A common approach to address this is targeted regularization, which modifies the objective function of NNs. However, existing work on neural causal effect estimation is largely limited to continuous outcomes, restricting its applicability to settings involving binary, count, or other skewed outcomes commonly encountered in practice. We propose a unified targeted regularization framework for the Exponential Dispersion Family (EDF) to address this limitation. Specifically, we first derive the von Mises expansion of the average dose function of canonical functions (ADCF) for discrete treatments and of the sieve-projected ADCF for continuous treatments. Second, we use this expansion to construct a unified targeted regularization, that corrects first-order bias at the distributional level. We integrate this objective into a NN architecture that jointly estimates the outcome model, propensity score model, and fluctuation parameter end-to-end. Experimental results demonstrate the effectiveness of our method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Extends targeted regularization to EDF outcomes with a von Mises expansion of the ADCF, filling a real gap but with light experimental detail.

The paper's core move is to take the targeted regularization approach that worked for continuous outcomes and make it apply to the full exponential dispersion family. They derive the von Mises expansion of the average dose function of canonical functions for discrete treatments and the sieve-projected version for continuous treatments, then turn that expansion into a single regularization term that the neural net optimizes jointly with the outcome model, propensity model, and fluctuation parameter. This produces first-order bias correction at the distributional level for binary, count, and similar outcomes. That is the actual new piece relative to the continuous-only literature it cites, and the construction follows the standard targeted learning template once the expansion exists.

The stress-test note is right that no internal inconsistency shows up in the stated steps. The approach is technically coherent on its own terms. The main soft spots are practical rather than foundational. The abstract gives only a high-level description of the experiments, so it is difficult to assess how large the gains are over baselines or how sensitive results are to the choice of fluctuation parameter during joint training. For continuous treatments the sieve projection adds another layer whose finite-sample behavior is not spelled out. These are the usual issues that show up when a method moves from derivation to implementation, not load-bearing flaws.

This work is aimed at people doing neural causal estimation who routinely encounter non-Gaussian outcomes. A reader already familiar with targeted regularization or double robustness in ML settings will see the value in the unified EDF treatment. It is grounded enough and addresses a clear limitation, so it deserves a serious referee even if revisions will likely be needed on the empirical side and on tuning guidance.

Referee Report

0 major / 2 minor

Summary. The paper proposes a unified targeted regularization framework for neural-network causal effect estimation with outcomes from the Exponential Dispersion Family (EDF). It derives the von Mises expansion of the average dose function of canonical functions (ADCF) for discrete treatments and of the sieve-projected ADCF for continuous treatments, then constructs a single regularization term from this expansion that is optimized jointly with the outcome model, propensity model, and fluctuation parameter inside an NN to achieve first-order bias correction at the distributional level.

Significance. If the derivation and the resulting semiparametric properties hold, the framework would extend targeted-learning techniques to the broad class of EDF outcomes (binary, count, skewed continuous) that are common in practice but currently underserved by existing neural causal estimators. The end-to-end joint optimization and the unified treatment of discrete/continuous cases are practical strengths.

minor comments (2)

The abstract states that the NN 'jointly estimates the outcome model, propensity score model, and fluctuation parameter end-to-end,' but the precise form of the joint loss (including how the fluctuation parameter enters the regularization term) should be written explicitly in the methods section for reproducibility.
Experimental results are mentioned but no details on the EDF link functions, simulation designs, or real-data outcomes are provided in the abstract; the main text should include a table or section summarizing these choices and the corresponding performance metrics.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript, recognition of its potential significance in extending targeted regularization to EDF outcomes, and recommendation for minor revision. The report does not list any specific major comments.

Circularity Check

0 steps flagged

No significant circularity

The derivation begins with the external von Mises expansion (a standard semiparametric tool) applied to the ADCF or sieve-projected ADCF, then constructs the regularization term from that expansion. No self-citation is load-bearing, no fitted fluctuation parameter is renamed as a prediction, and the joint NN optimization is an implementation detail rather than a definitional reduction. The central claim therefore retains independent mathematical content outside its own fitted values.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the applicability of the von Mises expansion to the ADCF for EDF outcomes and on the assumption that joint NN optimization of the resulting objective yields the desired bias correction.

free parameters (1)

fluctuation parameter
Estimated end-to-end inside the NN as part of the targeted regularization objective.
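The role of the fluctuation parameter can be illustrated in isolation. The sketch below holds the network's predictions fixed and solves for a scalar `eps` in a fluctuated Poisson model by Newton's method. In the paper the parameter is reportedly trained end-to-end with the network, so Newton's method, the toy rows, and the names `fit_epsilon` and `score` are substitutions made here for brevity.

```python
import math

def fit_epsilon(rows, steps=25):
    """Newton iterations for eps in the fluctuated Poisson model theta + eps * h.

    Each row is (y, theta, h): a count outcome, a fixed canonical-scale (log-mean)
    prediction, and a weight h standing in for the inverse-propensity term.
    """
    eps = 0.0
    for _ in range(steps):
        # Gradient and Hessian of sum(exp(theta + eps*h) - y*(theta + eps*h)) in eps.
        grad = sum(-h * (y - math.exp(th + eps * h)) for y, th, h in rows)
        hess = sum(h * h * math.exp(th + eps * h) for y, th, h in rows)
        eps -= grad / hess
    return eps

def score(rows, eps):
    """Empirical estimating equation: sum of h * (y - fluctuated mean)."""
    return sum(h * (y - math.exp(th + eps * h)) for y, th, h in rows)

# Toy data with made-up predictions and weights, for illustration only.
rows = [(3, 0.5, 1.6), (0, 0.2, -1.4), (2, 0.9, 2.0), (1, 0.1, -1.8), (4, 1.0, 1.3)]
eps_hat = fit_epsilon(rows)
```

At the fitted value the weighted residuals sum to zero, so `eps` is better read as a device for enforcing an estimating equation than as a tunable hyperparameter.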

axioms (1)

standard math: von Mises expansion of the average dose function of canonical functions (ADCF) for discrete treatments and sieve-projected ADCF for continuous treatments
Invoked to construct the first-order bias correction term.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "we first derive the von Mises expansion of the average dose function of canonical functions (ADCF) ... to construct a unified targeted regularization, that corrects first-order bias at the distributional level"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "we generalize functional targeted regularization to exponential families"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 1 internal anchor

[1] Assaad, S., Zeng, S., Tao, C., Datta, S., Mehta, N., Henao, R., Li, F., and Carin, L. Counterfactual representation learning with balancing weights. In International Conference on Artificial Intelligence and Statistics, pp. 1972–1980. PMLR, 2021.

[2] Bica, I., Jordon, J., and van der Schaar, M. Estimating the effects of continuous-valued interventions using generative adversarial networks. Advances in Neural Information Processing Systems, 33:16434–16445, 2020.

[3] Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., and Newey, W. Double/debiased/Neyman machine learning of treatment effects. American Economic Review, 107(5):261–65, May 2017. doi:10.1257/aer.p20171038. URL https://www.aeaweb.org/articles?id=10.1257/aer.p20171038

[4] Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1):C1–C68, 01 2018. ISSN 1368-4221. doi:10.1111/ectj.12097. URL https://doi.org/10.1111/ectj.12097

[5] Chiang, C.-T., Rice, J. A., and Wu, C. O. Smoothing spline estimation for varying coefficient models with repeatedly measured dependent variables. Journal of the American Statistical Association, 96(454):605–619, 2001.

[6] Fan, J. and Zhang, W. Statistical estimation in varying coefficient models. The Annals of Statistics, 27(5):1491–1518, 1999.

[7] Farrell, M. H., Liang, T., and Misra, S. Deep neural networks for estimation and inference. Econometrica, 89(1):181–213, 2021. doi:10.3982/ECTA16901. URL https://onlinelibrary.wiley.com/doi/abs/10.3982/ECTA16901

[8] Gao, Z. and Hastie, T. Estimating heterogeneous treatment effects for general responses, 2022. URL https://arxiv.org/abs/2103.04277

[9] Glass, T. A., Goodman, S. N., Hernán, M. A., and Samet, J. M. Causal inference in public health. Annual Review of Public Health, 34(1):61–75, 2013.

[10] Hassanpour, N. and Greiner, R. Counterfactual regression with importance sampling weights. In IJCAI, pp. 5880–5887. Macao, 2019.

[11] Hastie, T. and Tibshirani, R. Varying-coefficient models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 55(4):757–779, 1993.
[12] Johansson, F., Shalit, U., and Sontag, D. Learning representations for counterfactual inference. In International Conference on Machine Learning, pp. 3020–3029. PMLR, 2016.

[13] Johansson, F. D., Kallus, N., Shalit, U., and Sontag, D. Learning weighted representations for generalization across designs. arXiv preprint arXiv:1802.08598, 2018.

[14] Kazemi, A. and Ester, M. Adversarially balanced representation for continuous treatment effect estimation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pp. 13085–13093, 2024.

[15] Kennedy, E. H. Towards optimal doubly robust estimation of heterogeneous causal effects. Electronic Journal of Statistics, 17(2):3008–3049, 2023a. doi:10.1214/23-EJS2157. URL https://doi.org/10.1214/23-EJS2157

[16] Kennedy, E. H. Semiparametric doubly robust targeted double machine learning: a review, 2023b. URL https://arxiv.org/abs/2203.06469

[17] Kennedy, E. H., Balakrishnan, S., and Wasserman, L. A. Semiparametric counterfactual density estimation. Biometrika, 110(4):875–896, 03 2023. ISSN 1464-3510. doi:10.1093/biomet/asad017. URL https://doi.org/10.1093/biomet/asad017

[18] Li, S., Vlassis, N., Kawale, J., and Fu, Y. Matching via dimensionality reduction for estimation of treatment effects in digital marketing campaigns. In IJCAI, volume 16, pp. 3768–3774, 2016.

[19] Louizos, C., Shalit, U., Mooij, J. M., Sontag, D., Zemel, R., and Welling, M. Causal effect inference with deep latent-variable models. Advances in Neural Information Processing Systems, 30, 2017.

[20] Nie, L., Ye, M., Liu, Q., and Nicolae, D. Varying coefficient neural network with functional targeted regularization for estimating continuous treatment effects. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=RmB-88r9dL

[21] Nie, X. and Wager, S. Quasi-oracle estimation of heterogeneous treatment effects. Biometrika, 108(2):299–319, 09 2020. ISSN 0006-3444. doi:10.1093/biomet/asaa076. URL https://doi.org/10.1093/biomet/asaa076

[22] Sanchez, P. and Tsaftaris, S. A. Diffusion causal models for counterfactual estimation. arXiv preprint arXiv:2202.10166, 2022.
[23] Schwab, P., Linhardt, L., Bauer, S., Buhmann, J. M., and Karlen, W. Learning counterfactual representations for estimating individual dose-response curves. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pp. 5612–5619, 2020.

[24] Shalit, U., Johansson, F. D., and Sontag, D. Estimating individual treatment effect: generalization bounds and algorithms. In International Conference on Machine Learning, pp. 3076–3085. PMLR, 2017.

[25] Shi, C., Blei, D., and Veitch, V. Adapting neural networks for the estimation of treatment effects. Advances in Neural Information Processing Systems, 32, 2019.

[26] van der Laan, M. and Rose, S. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Series in Statistics. Springer New York, 2011. ISBN 9781441997821. URL https://books.google.com.hk/books?id=RGnSX5aCAgQC

[27] van der Vaart, A. W. Semiparametric statistics. In Lectures on Probability Theory and Statistics, volume 1781 of Lecture Notes in Mathematics, pp. 331–457. Springer, 2002. doi:10.1007/978-3-540-45744-8_4

[28] Wager, S. and Athey, S. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523):1228–1242, 2018.

[29] Wang, X., Lyu, S., Wu, X., Wu, T., and Chen, H. Generalization bounds for estimating causal effects of continuous treatments. Advances in Neural Information Processing Systems, 35:8605–8617, 2022.

[30] Weinstein, J. N., Collisson, E. A., Mills, G. B., Shaw, K. R., Ozenberger, B. A., Ellrott, K., Shmulevich, I., Sander, C., and Stuart, J. M. The cancer genome atlas pan-cancer analysis project. Nature Genetics, 45(10):1113–1120, 2013.

[31] Wüthrich, M. V. and Merz, M. Statistical Foundations of Actuarial Learning and its Applications. Springer Actuarial, June 2022. doi:10.1007/978-3-031-12409-9. URL https://link.springer.com/book/10.1007/978-3-031-12409-9

[32] Yoon, J., Jordon, J., and van der Schaar, M. GANITE: Estimation of individualized treatment effects using generative adversarial nets. In International Conference on Learning Representations, 2018.
