pith. sign in

arxiv: 2606.29076 · v1 · pith:MHB3TL2Nnew · submitted 2026-06-27 · 📊 stat.ME · stat.ML

Learning heterogeneous treatment effects under principal stratification

Pith reviewed 2026-06-30 08:23 UTC · model grok-4.3

classification 📊 stat.ME stat.ML
keywords principal stratificationheterogeneous treatment effectsdoubly robust estimationcausal inferencemachine learningprincipal ignorabilityodds ratio sensitivity
0
0 comments X

The pith

A doubly cross-fit doubly robust learner estimates conditional principal causal effects by resolving nested nuisances.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops methods to identify and estimate heterogeneous treatment effects within principal strata, which are subpopulations defined by potential intermediate outcomes. Existing approaches mostly studied average effects, but within-stratum variation matters for tailoring treatments to individuals. The authors combine principal ignorability with an odds ratio sensitivity model to relax the usual monotonicity assumption. They introduce a new machine learning estimator that uses double cross-fitting and sequential orthogonal learning to handle the complex nuisance parameters, and prove it achieves oracle efficiency with valid uniform confidence bands.

Core claim

We propose a novel doubly cross-fit doubly robust machine learner that resolves the nested nuisance structure inherent to principal stratification. Leveraging sequential orthogonal learning with regularized least-squares sieves, we derive L² and uniform limit theory, establish oracle efficiency, and construct uniform confidence bands for the proposed estimator. This allows estimation of conditional principal causal effects under principal ignorability and odds ratio sensitivity parameterization.

What carries the argument

The doubly cross-fit doubly robust machine learner using sequential orthogonal learning with regularized least-squares sieves to handle nested nuisance functions in principal stratification.

Load-bearing premise

The identification relies on principal ignorability together with an odds ratio sensitivity parameterization that replaces the monotonicity assumption.

What would settle it

If the proposed estimator fails to achieve the claimed oracle efficiency or the uniform confidence bands do not cover in repeated simulations under the stated assumptions, the central claims would be falsified.

Figures

Figures reproduced from arXiv: 2606.29076 by Fan Li, Jiaqi Tong.

Figure 1
Figure 1. Figure 1: Schematic of the doubly cross-fit pseudo-outcome regression procedure via se [PITH_FULL_IMAGE:figures/full_fig_p011_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Simulation results presenting the root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p027_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The estimated CSACE curves and associated 95% pointwise and uniform con [PITH_FULL_IMAGE:figures/full_fig_p030_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p073_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p073_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p074_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Simulation results presenting box plots of the Root mean integrated squared [PITH_FULL_IMAGE:figures/full_fig_p074_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Simulation results displaying box plots of the pointwise estimation errors for the [PITH_FULL_IMAGE:figures/full_fig_p075_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Simulation results presenting box plots of the Root mean integrated squared [PITH_FULL_IMAGE:figures/full_fig_p075_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Simulation results displaying the pointwise bias and empirical pointwise cover [PITH_FULL_IMAGE:figures/full_fig_p076_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p076_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p077_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p077_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p078_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p078_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p079_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p079_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p080_18.png] view at source ↗
Figure 19
Figure 19. Figure 19: Simulation results presenting the Root mean integrated squared error (MISE), [PITH_FULL_IMAGE:figures/full_fig_p080_19.png] view at source ↗
Figure 20
Figure 20. Figure 20: The estimated curves and associated 95% pointwise and uniform confidence [PITH_FULL_IMAGE:figures/full_fig_p081_20.png] view at source ↗
read the original abstract

Principal stratification provides a foundational framework for causal inference with intermediate outcomes by defining causal effects within subpopulations, yet existing work has largely focused on average effects across strata rather than treatment effect heterogeneity within strata. Such within-stratum heterogeneity informs individualized treatment decisions but the associated methods are sparse. We address this gap by studying the identification and estimation of the conditional principal causal effects under principal ignorability combined with an odds ratio sensitivity parameterization, which relaxes the monotonicity assumption. To efficiently learn these estimands, we propose a novel doubly cross-fit doubly robust machine learner that resolves the nested nuisance structure inherent to principal stratification. Leveraging sequential orthogonal learning with regularized least-squares sieves, we derive $\mathcal{L}^2$ and uniform limit theory, establish oracle efficiency, and construct uniform confidence bands for the proposed estimator. We use simulations to demonstrate the finite-sample performance of our estimator, and provide an empirical analysis of a randomized trial in acute lung injury, revealing informative patterns of treatment effect heterogeneity within the always-survivor subpopulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops identification results for conditional principal causal effects (CPCEs) under principal ignorability combined with an odds-ratio sensitivity parameterization that relaxes monotonicity. It proposes a doubly cross-fit doubly robust machine learner that employs sequential orthogonal learning on regularized least-squares sieves to handle the nested nuisance structure of principal scores and outcome regressions, derives L² and uniform limit theory, establishes oracle efficiency, and constructs uniform confidence bands. Finite-sample behavior is examined in simulations, and the method is applied to a randomized trial in acute lung injury to illustrate within-stratum heterogeneity among always-survivors.

Significance. If the limit theory and oracle-efficiency claims hold, the work would address a genuine gap by enabling estimation of heterogeneous effects inside principal strata without relying on monotonicity. The combination of sensitivity analysis, sequential orthogonalization, and uniform bands is a substantive technical contribution to causal machine learning. The empirical illustration, while illustrative, demonstrates potential practical value once the theoretical guarantees are confirmed.

major comments (2)
  1. [§4.2, Theorem 4.1] §4.2, Theorem 4.1 (oracle efficiency): the sequential influence function is stated to achieve the efficiency bound, yet the derivation does not explicitly display the remainder term arising from estimation of the odds-ratio sensitivity parameter inside the cross-fitting scheme. Without a displayed rate condition on this parameter or a separate orthogonality argument, it is unclear whether the claimed oracle property survives when the sensitivity model is only approximately correct.
  2. [§4.3] §4.3 (uniform bands): the construction of uniform confidence bands relies on the L² rate plus a maximal inequality for the sieve estimators, but the paper provides no explicit bound on the sieve approximation error for the principal-score nuisance under the odds-ratio parameterization. This rate is load-bearing for the uniform coverage claim.
minor comments (2)
  1. [§2–§3] Notation for the sensitivity parameter is introduced in §2 but not carried consistently into the influence-function expressions in §3; a single displayed equation linking the two would improve readability.
  2. [Simulation section] Simulation tables report coverage but do not tabulate the estimated sensitivity parameter or its variability; adding these columns would help readers assess sensitivity to the parameterization.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and insightful comments. We address the major comments point-by-point below, providing clarifications and indicating where revisions will be made to strengthen the manuscript.

read point-by-point responses
  1. Referee: [§4.2, Theorem 4.1] §4.2, Theorem 4.1 (oracle efficiency): the sequential influence function is stated to achieve the efficiency bound, yet the derivation does not explicitly display the remainder term arising from estimation of the odds-ratio sensitivity parameter inside the cross-fitting scheme. Without a displayed rate condition on this parameter or a separate orthogonality argument, it is unclear whether the claimed oracle property survives when the sensitivity model is only approximately correct.

    Authors: We thank the referee for this observation. In the proposed framework, the odds-ratio sensitivity parameter is a fixed, user-specified tuning parameter that defines the sensitivity model and is not estimated from the data. Consequently, there is no estimation remainder term associated with it in the cross-fitting procedure. The sequential orthogonal learning is designed to achieve oracle efficiency with respect to the estimated nuisances (principal scores and outcome regressions) under the fixed sensitivity model. When the sensitivity model is misspecified, the estimator targets the parameter under the assumed model, consistent with the goals of sensitivity analysis. We will add a clarifying remark in §4.2 to explicitly state that the sensitivity parameter is fixed and discuss the implications for the influence function. revision: partial

  2. Referee: [§4.3] §4.3 (uniform bands): the construction of uniform confidence bands relies on the L² rate plus a maximal inequality for the sieve estimators, but the paper provides no explicit bound on the sieve approximation error for the principal-score nuisance under the odds-ratio parameterization. This rate is load-bearing for the uniform coverage claim.

    Authors: We agree that an explicit bound on the sieve approximation error for the principal-score nuisance is necessary for the uniform coverage result. The principal scores are estimated under the odds-ratio parameterization using regularized least-squares sieves, and the approximation error is controlled by the standard sieve theory assumptions (e.g., the entropy conditions and smoothness of the target functions). We will include an explicit statement of the required rate condition (such as the approximation error being o_p(n^{-1/4})) in the revised version of §4.3 and the appendix to make the uniform band construction fully rigorous. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation chain derives new estimator and limit theory from stated assumptions

full rationale

The abstract and description present identification under principal ignorability plus odds-ratio sensitivity, followed by a proposed doubly cross-fit doubly robust learner using sequential orthogonal learning on sieves. The claimed L²/uniform theory, oracle efficiency, and bands are presented as derived results rather than tautological renamings or self-citations that reduce the target quantities to fitted inputs by construction. No self-definitional steps, fitted-input predictions, or load-bearing self-citation chains are exhibited in the provided material. The work is therefore scored as self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on principal ignorability for identification and the odds ratio sensitivity model to relax monotonicity; these are domain assumptions in causal inference without independent verification in the abstract.

free parameters (1)
  • odds ratio sensitivity parameter
    Introduced to parameterize departure from monotonicity; value or range not specified in abstract but required for the model.
axioms (1)
  • domain assumption principal ignorability
    Combined with sensitivity parameterization to identify conditional principal causal effects.

pith-pipeline@v0.9.1-grok · 5692 in / 1173 out tokens · 30017 ms · 2026-06-30T08:23:52.116040+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

110 extracted references · 16 canonical work pages · 4 internal anchors

  1. [1]

    Biometrika , volume=

    Quasi-oracle estimation of heterogeneous treatment effects , author=. Biometrika , volume=. 2021 , publisher=

  2. [2]

    Journal of Educational and Behavioral Statistics , volume=

    Estimation of causal effects via principal stratification when some outcomes are truncated by “death” , author=. Journal of Educational and Behavioral Statistics , volume=. 2003 , publisher=

  3. [3]

    American Journal of Epidemiology , volume=

    Eliminating ambiguous treatment effects using estimands , author=. American Journal of Epidemiology , volume=. 2023 , publisher=

  4. [4]

    Statistical Applications in Genetics and Molecular Biology , volume=

    Super learner , author=. Statistical Applications in Genetics and Molecular Biology , volume=. 2007 , publisher=

  5. [5]

    American Journal of Evaluation , volume=

    Principal stratification: A tool for understanding variation in program effects across endogenous subgroups , author=. American Journal of Evaluation , volume=. 2015 , publisher=

  6. [6]

    Econometrica: Journal of the Econometric Society , pages=

    Root-N-consistent semiparametric regression , author=. Econometrica: Journal of the Econometric Society , pages=. 1988 , publisher=

  7. [7]

    The Annals of Statistics , volume=

    Minimax rates for heterogeneous causal effect estimation , author=. The Annals of Statistics , volume=. 2024 , publisher=

  8. [8]

    Electronic Journal of Statistics , volume=

    Towards optimal doubly robust estimation of heterogeneous causal effects , author=. Electronic Journal of Statistics , volume=. 2023 , publisher=

  9. [9]

    The Annals of Statistics , volume=

    Orthogonal statistical learning , author=. The Annals of Statistics , volume=. 2023 , publisher=

  10. [10]

    Biometrika , volume=

    Estimation of local treatment effects under the binary instrumental variable model , author=. Biometrika , volume=. 2021 , publisher=

  11. [11]

    Cross-Fitting and Fast Remainder Rates for Semiparametric Estimation

    Cross-fitting and fast remainder rates for semiparametric estimation , author=. arXiv preprint arXiv:1801.09138 , year=

  12. [12]

    The international journal of biostatistics , volume=

    Super-learning of an optimal dynamic treatment rule , author=. The international journal of biostatistics , volume=. 2016 , publisher=

  13. [13]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Selective inference for effect modification via the lasso , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2022 , publisher=

  14. [14]

    2022 , eprint=

    Estimation and Inference on Heterogeneous Treatment Effects in High-Dimensional Dynamic Panels under Weak Dependence , author=. 2022 , eprint=

  15. [15]

    Journal of the American Statistical Association , volume=

    Estimation and inference of heterogeneous treatment effects using random forests , author=. Journal of the American Statistical Association , volume=. 2018 , publisher=

  16. [16]

    The International Journal of Biostatistics , volume=

    Statistical inference for variable importance , author=. The International Journal of Biostatistics , volume=. 2006 , publisher=

  17. [17]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , pages=

    A nonparametric framework for treatment effect modifier discovery in high dimensions , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , pages=. 2024 , publisher=

  18. [18]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Estimating heterogeneous treatment effects with right-censored data via causal survival forests , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2023 , publisher=

  19. [19]

    The International Journal of Biostatistics , volume=

    Doubly robust adaptive LASSO for effect modifier discovery , author=. The International Journal of Biostatistics , volume=. 2022 , publisher=

  20. [20]

    Proceedings of the National Academy of Sciences , volume=

    Recursive partitioning for heterogeneous causal effects , author=. Proceedings of the National Academy of Sciences , volume=. 2016 , publisher=

  21. [21]

    The Econometrics Journal , volume=

    Double machine learning-based programme evaluation under unconfoundedness , author=. The Econometrics Journal , volume=. 2022 , publisher=

  22. [22]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume =

    Jiang, Zhichao and Yang, Shu and Ding, Peng , title = ". Journal of the Royal Statistical Society Series B: Statistical Methodology , volume =. 2022 , month =. doi:10.1111/rssb.12538 , url =

  23. [23]

    Biometrics , volume=

    An estimator for treatment comparisons among survivors in randomized trials , author=. Biometrics , volume=. 2005 , publisher=

  24. [24]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume =

    Ding, Peng and Lu, Jiannan , title = ". Journal of the Royal Statistical Society Series B: Statistical Methodology , volume =. 2016 , month =. doi:10.1111/rssb.12191 , url =

  25. [25]

    , title =

    Tsybakov, Alexandre B. , title =. 2009 , publisher =

  26. [26]

    Biometrika , volume=

    Decomposition, identification and multiply robust estimation of natural mediation effects with multiple mediators , author=. Biometrika , volume=. 2022 , publisher=

  27. [27]

    arXiv preprint arXiv:2411.03489 , year=

    A Bayesian nonparametric approach to mediation and spillover effects with multiple mediators in cluster-randomized trials , author=. arXiv preprint arXiv:2411.03489 , year=

  28. [28]

    Annals of statistics , volume=

    Semiparametric theory for causal mediation analysis: efficiency bounds, multiple robustness, and sensitivity analysis , author=. Annals of statistics , volume=. 2012 , publisher=

  29. [29]

    Probabilistic and causal inference: the works of Judea Pearl , pages=

    Direct and indirect effects , author=. Probabilistic and causal inference: the works of Judea Pearl , pages=

  30. [30]

    arXiv preprint arXiv:2404.18256 , year=

    Semiparametric causal mediation analysis in cluster-randomized experiments , author=. arXiv preprint arXiv:2404.18256 , year=

  31. [31]

    Identification, inference and sensitivity analysis for causal mediation effects , author=

  32. [32]

    Journal of the American Statistical Association , pages=

    Model-robust and efficient covariate adjustment for cluster-randomized experiments , author=. Journal of the American Statistical Association , pages=. 2024 , publisher=

  33. [33]

    2006 , publisher=

    Semiparametric theory and missing data , author=. 2006 , publisher=

  34. [34]

    arXiv preprint arXiv:2203.06469 , year=

    Semiparametric doubly robust targeted double machine learning: a review , author=. arXiv preprint arXiv:2203.06469 , year=

  35. [35]

    American journal of epidemiology , volume=

    Assessing natural direct and indirect effects through multiple pathways , author=. American journal of epidemiology , volume=. 2014 , publisher=

  36. [36]

    Journal of Business & Economic Statistics , volume=

    Estimating density ratio of marginals to joint: Applications to causal inference , author=. Journal of Business & Economic Statistics , volume=. 2023 , publisher=

  37. [37]

    The Journal of Machine Learning Research , volume=

    A least-squares approach to direct importance estimation , author=. The Journal of Machine Learning Research , volume=. 2009 , publisher=

  38. [38]

    The Econometrics Journal , volume=

    Debiased machine learning of conditional average treatment effects and other causal functions , author=. The Econometrics Journal , volume=. 2021 , publisher=

  39. [39]

    Journal of econometrics , volume=

    Convergence rates and asymptotic normality for series estimators , author=. Journal of econometrics , volume=. 1997 , publisher=

  40. [40]

    Unpublished manuscript , volume=

    Nonparametric conditional density estimation , author=. Unpublished manuscript , volume=. 2004 , publisher=

  41. [41]

    Journal of Econometrics , volume=

    Some new asymptotic theory for least squares series: Pointwise and uniform results , author=. Journal of Econometrics , volume=. 2015 , publisher=

  42. [42]

    2006 , publisher=

    An introduction to copulas , author=. 2006 , publisher=

  43. [43]

    Copula Theory and Its Applications: Proceedings of the Workshop Held in Warsaw, 25-26 September 2009 , pages=

    Copula estimation , author=. Copula Theory and Its Applications: Proceedings of the Workshop Held in Warsaw, 25-26 September 2009 , pages=. 2010 , organization=

  44. [44]

    Biometrika , volume=

    A semiparametric estimation procedure of dependence parameters in multivariate families of distributions , author=. Biometrika , volume=. 1995 , publisher=

  45. [45]

    Journal of the American Statistical Association , volume=

    Causal inference for social network data , author=. Journal of the American Statistical Association , volume=. 2024 , publisher=

  46. [46]

    Biometrika , volume=

    Semiparametric counterfactual density estimation , author=. Biometrika , volume=. 2023 , publisher=

  47. [47]

    arXiv preprint arXiv:2309.12425 , year=

    Principal stratification with continuous post-treatment variables: Nonparametric identification and semiparametric estimation , author=. arXiv preprint arXiv:2309.12425 , year=

  48. [48]

    Handbook of econometrics , volume=

    Large sample sieve estimation of semi-nonparametric models , author=. Handbook of econometrics , volume=. 2007 , publisher=

  49. [49]

    The annals of applied statistics , volume=

    A Bayesian machine learning approach for estimating heterogeneous survivor causal effects: applications to a critical care trial , author=. The annals of applied statistics , volume=

  50. [50]

    Epidemiology , volume=

    Marginal structural models and causal inference in epidemiology , author=. Epidemiology , volume=. 2000 , publisher=

  51. [51]

    Journal of the American Statistical Association , volume=

    Independence weights for causal inference with continuous treatments , author=. Journal of the American Statistical Association , volume=. 2024 , publisher=

  52. [52]

    Journal of the American Statistical Association , volume=

    Model-robust and efficient covariate adjustment for cluster-randomized experiments , author=. Journal of the American Statistical Association , volume=. 2024 , publisher=

  53. [53]

    Biometrika , volume=

    Characterization of parameters with a mixed bias property , author=. Biometrika , volume=. 2021 , publisher=

  54. [54]

    Generalized random forests , author=

  55. [55]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Non-parametric methods for doubly robust estimation of continuous treatment effects , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2017 , publisher=

  56. [56]

    arXiv preprint arXiv:2208.00872 , year=

    Towards R-learner of conditional average treatment effects with a continuous treatment: T-identification, estimation, and inference , author=. arXiv preprint arXiv:2208.00872 , year=

  57. [57]

    arXiv preprint arXiv:2504.03063 , year=

    Nonparametric Estimation of Local Treatment Effects with Continuous Instruments , author=. arXiv preprint arXiv:2504.03063 , year=

  58. [58]

    arXiv preprint arXiv:2501.06969 , year=

    Doubly Robust Inference on Causal Derivative Effects for Continuous Treatments , author=. arXiv preprint arXiv:2501.06969 , year=

  59. [59]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Robust causal inference with continuous instruments using the local instrumental variable curve , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2019 , publisher=

  60. [60]

    1993 , publisher=

    Efficient and adaptive estimation for semiparametric models , author=. 1993 , publisher=

  61. [61]

    Boosting with early stopping: Convergence and consistency , author=

  62. [62]

    Econometrica , volume=

    Deep neural networks for estimation and inference , author=. Econometrica , volume=. 2021 , publisher=

  63. [63]

    Adaptive Concentration of Regression Trees, with Application to Random Forests

    Adaptive concentration of regression trees, with application to random forests , author=. arXiv preprint arXiv:1503.06388 , year=

  64. [64]

    Quantifying Individual Risk for Binary Outcomes

    Quantifying Individual Risk for Binary Outcome , author=. arXiv preprint arXiv:2402.10537 , year=

  65. [65]

    2009 , publisher=

    The elements of statistical learning: data mining, inference, and prediction , author=. 2009 , publisher=

  66. [66]

    IEEE Transactions on Information Theory , volume=

    Improved rates and asymptotic normality for nonparametric neural network estimators , author=. IEEE Transactions on Information Theory , volume=. 1999 , publisher=

  67. [67]

    IEEE Transactions on Information theory , volume=

    Universal approximation bounds for superpositions of a sigmoidal function , author=. IEEE Transactions on Information theory , volume=. 2002 , publisher=

  68. [68]

    Journal of Approximation Theory , volume=

    Random approximants and neural networks , author=. Journal of Approximation Theory , volume=. 1996 , publisher=

  69. [69]

    Approximation by combinations of ReLU and squared ReLU ridge functions with ell\^

    Klusowski, Jason M and Barron, Andrew R , journal=. Approximation by combinations of ReLU and squared ReLU ridge functions with ell\^. 2018 , publisher=

  70. [70]

    Risk Bounds for High-dimensional Ridge Function Combinations Including Neural Networks

    Risk bounds for high-dimensional ridge function combinations including neural networks , author=. arXiv preprint arXiv:1607.01434 , year=

  71. [71]

    Journal of the Royal Statistical Society Series A: Statistics in Society , volume=

    Matching methods for truncation by death problems , author=. Journal of the Royal Statistical Society Series A: Statistics in Society , volume=. 2023 , publisher=

  72. [72]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Causal inference from 2K factorial designs by using potential outcomes , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2015 , publisher=

  73. [73]

    Biometrics , volume=

    Principal stratification in causal inference , author=. Biometrics , volume=. 2002 , publisher=

  74. [74]

    Imbens and Joshua D

    Guido W. Imbens and Joshua D. Angrist , journal =. Identification and Estimation of Local Average Treatment Effects , urldate =

  75. [75]

    Statistical Science , pages=

    Causal inference through potential outcomes and principal stratification: application to studies with ``censoring'' due to death , author=. Statistical Science , pages=. 2006 , publisher=

  76. [76]

    Statistica Sinica , year=

    Semiparametric principal stratification analysis beyond monotonicity , author=. Statistica Sinica , year=

  77. [77]

    Journal of Machine Learning Research , volume=

    High-dimensional L2-boosting: Rate of Convergence , author=. Journal of Machine Learning Research , volume=

  78. [78]

    Rama Cont and Jean-Philippe Bouchaud

    Chernozhukov, Victor and Chetverikov, Denis and Demirer, Mert and Duflo, Esther and Hansen, Christian and Newey, Whitney and Robins, James , title = ". The Econometrics Journal , volume =. 2018 , abstract = ". doi:10.1111/ectj.12097 , url =

  79. [79]

    Numerische Mathematik , volume=

    Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation , author=. Numerische Mathematik , volume=. 1978 , publisher=

  80. [80]

    Statistical Science , volume=

    Flexible smoothing with B-splines and penalties , author=. Statistical Science , volume=. 1996 , publisher=

Showing first 80 references.