arXiv: 2604.10412 · v1 · submitted 2026-04-12 · stat.ML · cs.LG · stat.ME


Orthogonal machine learning for conditional odds and risk ratios

Jiacheng Ge, Iván Díaz


Pith reviewed 2026-05-10 16:39 UTC · model grok-4.3


keywords: conditional odds ratio · conditional risk ratio · orthogonal machine learning · doubly robust estimation · R-learner · DR-learner · treatment effect heterogeneity · nonparametric causal inference


The pith

Orthogonal risk functions for conditional odds and risk ratios produce pseudo-outcomes with second-order remainder properties like those for average treatment effects.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops estimators for conditional odds ratios and risk ratios that measure how treatment effects vary across groups. It generalizes orthogonal machine learning techniques previously used for average treatment effects to these ratio measures by deriving new risk functions. The resulting pseudo-outcomes have errors that depend on products rather than sums of nuisance estimation errors, allowing flexible machine learning for those nuisances. Simulations across hundreds of data scenarios show the new estimators reduce bias and mean squared error compared with standard alternatives in complex settings. An analysis of national health survey data demonstrates that the approach reveals treatment heterogeneity missed by traditional regression.
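The "products rather than sums" behaviour can be checked by hand in the familiar ATE case. The sketch below is not the paper's OR or RR construction (those pseudo-outcomes are not reproduced on this page); it uses the standard doubly robust (AIPW) pseudo-outcome for the conditional ATE with a binary treatment and binary outcome. The function names and the argument names `pi_hat`, `mu0_hat`, `mu1_hat` are illustrative assumptions. At a fixed covariate value, the conditional mean of the pseudo-outcome is computed exactly by enumeration and compared with a closed-form product of nuisance errors.

```python
from itertools import product


def aipw_pseudo_outcome(a, y, pi_hat, mu0_hat, mu1_hat):
    """Standard doubly robust pseudo-outcome for the conditional ATE (ATE case only)."""
    return (mu1_hat - mu0_hat
            + a * (y - mu1_hat) / pi_hat
            - (1 - a) * (y - mu0_hat) / (1 - pi_hat))


def conditional_mean_bias(pi, mu0, mu1, pi_hat, mu0_hat, mu1_hat):
    """Exact bias of the pseudo-outcome's conditional mean at one covariate value.

    pi, mu0, mu1 are the true propensity score and outcome regressions at that
    value; the *_hat arguments are (possibly wrong) estimates of them.
    """
    mean = 0.0
    for a, y in product((0, 1), repeat=2):
        p_a = pi if a == 1 else 1 - pi          # P(A = a | x)
        mu = mu1 if a == 1 else mu0             # P(Y = 1 | A = a, x)
        p_y = mu if y == 1 else 1 - mu          # P(Y = y | A = a, x)
        mean += p_a * p_y * aipw_pseudo_outcome(a, y, pi_hat, mu0_hat, mu1_hat)
    return mean - (mu1 - mu0)


def product_form(pi, mu0, mu1, pi_hat, mu0_hat, mu1_hat):
    """Closed form of the same bias: propensity error times weighted outcome errors."""
    return (pi - pi_hat) * ((mu1 - mu1_hat) / pi_hat
                            + (mu0 - mu0_hat) / (1 - pi_hat))
```

In this ATE sketch the bias vanishes whenever either the propensity score or both outcome regressions are correct, which is the double robustness that the paper reports extending to the OR and RR scales.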

Core claim

We derive orthogonal risk functions for the OR and RR and show that the associated pseudo-outcomes satisfy second-order conditional-mean remainder properties analogous to the ATE case. We also evaluate estimators for the conditional ATE, OR, and RR in a comprehensive nonparametric Monte Carlo simulation study to compare them with common alternatives under hundreds of different data-generating distributions. Our numerical studies provide empirical guidance for choosing an estimator, showing that nonparametric estimators significantly reduce bias and mean squared error in more complex settings.

What carries the argument

Orthogonal risk functions for the odds ratio and risk ratio that generate pseudo-outcomes satisfying second-order conditional-mean remainder properties.

If this is right

The estimators achieve lower bias and mean squared error than parametric models when data patterns are complex.
Treatment targeting improves by uncovering heterogeneity obscured by standard regression.
The approach extends the benefits of DR-learner and R-learner methods from ATE to OR and RR parameters.
Real-world applications such as NHANES data analysis yield improved decision rules for interventions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar orthogonalization may apply to other nonlinear causal effect measures beyond OR and RR.
Practitioners gain most when they pair these methods with flexible, high-quality nuisance estimators.
The methods support more reliable subgroup-specific treatment recommendations in precision health settings.
The simulation design provides a reusable template for evaluating estimators of other conditional causal parameters.

Load-bearing premise

Nuisance functions such as propensity scores and outcome regressions can be estimated fast enough that the products of their errors, which drive the remainder, vanish asymptotically at the required rate.
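As a rough illustration of what this premise asks for, the worked bound below assumes root-mean-square error norms and applies the Cauchy–Schwarz inequality. The symbols \hat\pi and \hat\mu (estimated propensity score and outcome regression) are notation introduced here for illustration and are not taken from the paper.

```latex
\Big| E\big[(\hat\pi - \pi)(\hat\mu - \mu)\big] \Big|
  \;\le\; \|\hat\pi - \pi\|_{2}\,\|\hat\mu - \mu\|_{2}
  \;=\; o_p(n^{-1/4}) \cdot o_p(n^{-1/4})
  \;=\; o_p(n^{-1/2}).
% Example: errors of order n^{-1/3} each give a product of order n^{-2/3},
% which is o(n^{-1/2}); errors of order n^{-1/5} each give only n^{-2/5},
% which is not o(n^{-1/2}).
```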

What would settle it

A Monte Carlo experiment in which the proposed OR and RR estimators fail to show lower mean squared error than non-orthogonal alternatives despite accurate nuisance function fits in complex data-generating processes.

Figures

Figures reproduced from arXiv: 2604.10412 by Iván Díaz, Jiacheng Ge.

**Figure 2.** Conditional OR: interaction order 3, sample size 200

**Figure 3.** Conditional OR: interaction order 1, sample size 2000

**Figure 4.** Conditional OR: interaction order 1, sample size 200

**Figure 5.** NHANES application for the effect of physical activity on sleep trouble. The left panel shows

**Figure 6.** Conditional OR: interaction order 3, sample size 1000

**Figure 7.** Conditional OR: interaction order 3, sample size 500

**Figure 8.** Conditional OR: interaction order 2, sample size 2000

**Figure 9.** Conditional OR: interaction order 2, sample size 1000

**Figure 10.** Conditional OR: interaction order 2, sample size 500

**Figure 11.** Conditional OR: interaction order 2, sample size 200

**Figure 12.** Conditional OR: interaction order 1, sample size 1000

**Figure 13.** Conditional OR: interaction order 1, sample size 500

**Figure 14.** Conditional RR: interaction order 3, sample size 2000

**Figure 15.** Conditional RR: interaction order 3, sample size 1000

**Figure 16.** Conditional RR: interaction order 3, sample size 500

**Figure 17.** Conditional RR: interaction order 3, sample size 200

**Figure 18.** Conditional RR: interaction order 2, sample size 2000

**Figure 19.** Conditional RR: interaction order 2, sample size 1000

**Figure 20.** Conditional RR: interaction order 2, sample size 500

**Figure 21.** Conditional RR: interaction order 2, sample size 200

**Figure 22.** Conditional RR: interaction order 1, sample size 2000

**Figure 23.** Conditional RR: interaction order 1, sample size 1000

**Figure 24.** Conditional RR: interaction order 1, sample size 500

**Figure 25.** Conditional RR: interaction order 1, sample size 200

**Figure 26.** Conditional ATE: interaction order 3, sample size 2000

**Figure 27.** Conditional ATE: interaction order 3, sample size 1000

**Figure 28.** Conditional ATE: interaction order 3, sample size 500

**Figure 29.** Conditional ATE: interaction order 3, sample size 200

**Figure 30.** Conditional ATE: interaction order 2, sample size 2000

**Figure 31.** Conditional ATE: interaction order 2, sample size 1000

**Figure 32.** Conditional ATE: interaction order 2, sample size 500

**Figure 33.** Conditional ATE: interaction order 2, sample size 200

**Figure 34.** Conditional ATE: interaction order 1, sample size 2000

**Figure 35.** Conditional ATE: interaction order 1, sample size 1000

**Figure 36.** Conditional ATE: interaction order 1, sample size 500

**Figure 37.** Conditional ATE: interaction order 1, sample size 200

Original abstract

Conditional effects are commonly used measures for understanding how treatment effects vary across different groups, and are often used to target treatments/interventions to groups who benefit most. In this work we review existing methods and propose novel ones, focusing on the odds ratio (OR) and the risk ratio (RR). While estimation of the conditional average treatment effect (ATE) has been widely studied, estimators for the OR and RR lag behind, and cutting edge estimators such as those based on doubly robust transformations or orthogonal risk functions have not been generalized to these parameters. We propose such a generalization here, focusing on the DR-learner and the R-learner. We derive orthogonal risk functions for the OR and RR and show that the associated pseudo-outcomes satisfy second-order conditional-mean remainder properties analogous to the ATE case. We also evaluate estimators for the conditional ATE, OR, and RR in a comprehensive nonparametric Monte Carlo simulation study to compare them with common alternatives under hundreds of different data-generating distributions. Our numerical studies provide empirical guidance for choosing an estimator. For instance, they show that while parametric models are useful in very simple settings, the proposed nonparametric estimators significantly reduce bias and mean squared error in the more complex settings expected in the real world. We illustrate the methods in the analysis of physical activity and sleep trouble in U.S. adults using data from the National Health and Nutrition Examination Survey (NHANES). The results demonstrate that our estimators uncover substantial treatment effect heterogeneity that is obscured by traditional regression approaches and lead to improved treatment decision rules, highlighting the importance of data-adaptive methods for advancing precision health research.
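For readers less used to the ratio scales, the three conditional estimands named in the abstract can be written in terms of the two outcome regressions, here called `mu0` and `mu1` for P(Y = 1 | A = a, X = x) with a = 0, 1. The sketch below is a simple plug-in computation under those textbook definitions, with a hypothetical function name; it is not one of the paper's orthogonal estimators.

```python
def conditional_effects(mu0, mu1):
    """Plug-in conditional ATE, risk ratio, and odds ratio at one covariate value.

    mu0 and mu1 are outcome probabilities under control and treatment; both must
    lie strictly between 0 and 1 so that the ratios are well defined.
    """
    if not (0.0 < mu0 < 1.0 and 0.0 < mu1 < 1.0):
        raise ValueError("mu0 and mu1 must lie strictly between 0 and 1")
    odds0 = mu0 / (1.0 - mu0)
    odds1 = mu1 / (1.0 - mu1)
    return {
        "ate": mu1 - mu0,        # risk difference
        "rr": mu1 / mu0,         # risk ratio
        "or": odds1 / odds0,     # odds ratio
    }


# Two hypothetical subgroups with different baseline risk.
low_baseline = conditional_effects(mu0=0.2, mu1=0.5)
high_baseline = conditional_effects(mu0=0.5, mu1=0.8)
```

In this toy example both subgroups share a risk difference of 0.3 and an odds ratio of 4, yet their risk ratios are 2.5 and 1.6, so which groups look like they "benefit most" can depend on the scale chosen.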

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper derives orthogonal risks and pseudo-outcomes for conditional OR and RR that keep the second-order remainder property, backed by large simulations and an NHANES example, but the nuisance rate requirements are not fully stress-tested.


The core advance is the extension of the DR-learner and R-learner to conditional odds ratios and risk ratios. They derive the corresponding orthogonal risk functions and show that the pseudo-outcomes have conditional-mean remainders that are products of nuisance errors, just like the ATE case. That is the non-routine part and it is done cleanly enough to be usable. The Monte Carlo study across hundreds of distributions is a real strength; it gives concrete guidance on when the nonparametric versions beat parametric models in bias and MSE, and the NHANES analysis shows the methods can surface heterogeneity that standard regression hides. Those pieces are practical and worth having.

The soft spot is the rate condition. The second-order property only delivers sqrt(n) consistency when the propensity and outcome regressions converge faster than n^{-1/4}. The abstract and stress-test note suggest the simulations did not push the boundary cases where conditional probabilities approach 0 or 1 and standard nonparametric estimators often slow down. If the full derivations spell out the exact smoothness or dimension requirements and the sims actually cover slow-nuisance regimes, that concern shrinks; otherwise it stays material.

The work is aimed at statisticians and epidemiologists who need heterogeneous effects on the OR or RR scale rather than ATE. A reader already using orthogonal methods for ATE will pick up the extension quickly and get usable code-like guidance from the simulations. It deserves peer review because the generalization is substantive, the empirical comparison is broad, and the gap it fills is real, even if a referee will want tighter verification of the remainder rates and boundary behavior.

Referee Report

2 major / 2 minor

Summary. The paper derives orthogonal risk functions and associated pseudo-outcomes for the conditional odds ratio (OR) and risk ratio (RR) that exhibit second-order conditional-mean remainder properties analogous to those for the conditional average treatment effect. It proposes generalizations of the DR-learner and R-learner to these functionals, evaluates the resulting estimators against common alternatives in a large-scale nonparametric Monte Carlo study across hundreds of data-generating distributions, and applies the methods to NHANES data on physical activity and sleep trouble to demonstrate improved detection of treatment effect heterogeneity.

Significance. If the second-order remainder properties hold under standard nonparametric nuisance estimators, the work extends doubly robust orthogonal learning to two important conditional effect measures for binary outcomes, enabling more robust estimation of heterogeneous treatment effects in settings where ATE alone is insufficient. The comprehensive simulation design across diverse distributions is a clear strength and supplies practical guidance on estimator choice; the real-data illustration shows the methods can uncover heterogeneity missed by parametric regression.

major comments (2)

[§3.3, Eq. (12)] §3.3, Eq. (12) and surrounding derivation: the claimed second-order remainder for the RR pseudo-outcome is stated as the product of propensity and outcome-regression errors, but the text does not derive or cite the precise rate conditions (e.g., each nuisance converging faster than n^{-1/4}) under which this product is o_p(n^{-1/2}) when the conditional probabilities approach the boundary; without these conditions the advertised sqrt(n) consistency is not guaranteed for standard estimators such as random forests or kernels.
[§5.1] §5.1, simulation design: none of the 200+ data-generating processes include regimes in which the propensity score approaches 0 or 1 while the outcome regression remains bounded away from 0/1; this omission leaves untested the finite-sample behavior of the second-order property precisely where the product remainder is most likely to degrade.

minor comments (2)

[§3 and §4] Notation for the OR and RR pseudo-outcomes is introduced in §3 but reused with slight variations in §4; a single consolidated definition table would improve readability.
[§6] The NHANES application in §6 reports point estimates and confidence intervals but does not include a sensitivity analysis to the choice of nuisance estimators or bandwidths.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments, which have identified important points for clarification and strengthening. We address each major comment below and will incorporate revisions accordingly.


Referee: [§3.3, Eq. (12)] §3.3, Eq. (12) and surrounding derivation: the claimed second-order remainder for the RR pseudo-outcome is stated as the product of propensity and outcome-regression errors, but the text does not derive or cite the precise rate conditions (e.g., each nuisance converging faster than n^{-1/4}) under which this product is o_p(n^{-1/2}) when the conditional probabilities approach the boundary; without these conditions the advertised sqrt(n) consistency is not guaranteed for standard estimators such as random forests or kernels.

Authors: We appreciate the referee highlighting the need for explicit rate conditions. The derivation in §3.3 shows that the remainder for the RR pseudo-outcome is the product of the propensity-score and outcome-regression errors. Under the maintained assumption that the conditional probabilities are bounded away from 0 and 1, standard nonparametric rates (each nuisance estimator o_p(n^{-1/4})) make the product o_p(n^{-1/2}), as is standard in the orthogonal-learning literature. Near boundaries, additional regularity or trimming is indeed required, analogous to conditions in doubly robust estimation for binary outcomes. We will revise the text to state these conditions explicitly and cite the relevant rate results for product remainders. revision: yes
Referee: [§5.1] §5.1, simulation design: none of the 200+ data-generating processes include regimes in which the propensity score approaches 0 or 1 while the outcome regression remains bounded away from 0/1; this omission leaves untested the finite-sample behavior of the second-order property precisely where the product remainder is most likely to degrade.

Authors: The referee is correct that our simulation design did not include propensity scores approaching the boundaries. The 200+ DGPs were chosen to span a wide range of complexities in the conditional effects and nuisance functions, but we agree that boundary regimes are a natural stress test for the product remainder. In the revision we will add a targeted set of simulations with propensity scores near 0 and 1 (while keeping outcome regressions bounded away from 0/1) to evaluate finite-sample behavior in these cases. revision: yes
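To make the boundary concern concrete, the sketch below uses the ATE-case remainder term (propensity error times outcome-regression error, divided by the estimated propensity) as a stand-in; the corresponding OR and RR expressions are not given on this page. The error sizes, the trimming threshold `eps`, and both function names are hypothetical choices for illustration only.

```python
def clip_propensity(pi_hat, eps=0.01):
    """Trim an estimated propensity score into [eps, 1 - eps].

    eps is a hypothetical trimming threshold, not a value from the paper.
    """
    if not 0.0 < eps < 0.5:
        raise ValueError("eps must lie strictly between 0 and 0.5")
    return min(max(pi_hat, eps), 1.0 - eps)


def weighted_product_term(pi, pi_hat, mu1_error):
    """ATE-case remainder term (pi - pi_hat) * (mu1 - mu1_hat) / pi_hat."""
    return (pi - pi_hat) * mu1_error / pi_hat


# Hold the absolute nuisance errors fixed (0.01 for the propensity score,
# 0.05 for the outcome regression) and move the true propensity toward 0.
terms = [weighted_product_term(pi, pi - 0.01, 0.05) for pi in (0.5, 0.1, 0.02)]
```

With the errors held fixed, the term grows as the propensity score approaches 0 because of the inverse weight, which is one reason regimes near the boundary are a natural stress test and why trimming is often considered there.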

Circularity Check

0 steps flagged

No significant circularity in derivation of OR/RR orthogonal risks


The paper derives orthogonal risk functions and associated pseudo-outcomes for conditional OR and RR as a direct mathematical generalization of the ATE framework, establishing second-order remainder properties through algebraic manipulation of the relevant expressions. This process does not reduce any claimed result to its inputs by construction, nor does it rely on fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations. The derivations stand as independent content, with the simulation study and NHANES application providing separate empirical support rather than circular reinforcement.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based on the abstract alone, the paper relies on standard causal inference assumptions but introduces no explicit new free parameters or invented entities; full details on any implicit fitting choices are unavailable.

axioms (1)

domain assumption Standard causal assumptions including consistency, no unmeasured confounding, and positivity.
Required for defining and identifying conditional OR and RR in observational data.
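As a brief reminder of how these three assumptions are typically used, the textbook identification argument for a binary treatment A, binary outcome Y, covariates X, and potential outcomes Y(a) is sketched below. The notation is introduced here for illustration and may differ from the paper's.

```latex
% Positivity, 0 < P(A = 1 \mid X = x) < 1, makes the conditioning events well defined.
P\{Y(a) = 1 \mid X = x\}
  = P\{Y(a) = 1 \mid A = a, X = x\}   % no unmeasured confounding
  = P(Y = 1 \mid A = a, X = x),       % consistency
  \qquad a \in \{0, 1\}.
```

The conditional OR and RR are then functions of these two observable regressions.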

