Non-overlap Average Treatment Effect Bounds

Alec McClean; Herbert P. Susmann; Iván Díaz

arxiv: 2509.20206 · v2 · pith:BEZ7DC4Bnew · submitted 2025-09-24 · 📊 stat.ME

Pith reviewed 2026-05-18 13:47 UTC · model grok-4.3

classification 📊 stat.ME

keywords average treatment effect · partial identification · overlap · causal inference · targeted minimum loss estimation · multiplier bootstrap

The pith

For bounded outcomes, partial identification bounds on the ATE can be derived without any overlap assumption.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that overlap is not required to bound the average treatment effect when the outcome is known to lie in a fixed interval. The width of the resulting bounds is proportional to the fraction of the population where overlap fails, so the bounds remain informative whenever violations are modest. Smooth approximations to these bounds are introduced so that a targeted minimum loss estimator can be applied and shown to be root-n consistent under nonparametric conditions. A multiplier bootstrap then delivers confidence sets that stay valid uniformly across all sizes of the non-overlap subpopulation and all smoothing choices. The approach therefore avoids both trimming the population and changing the target parameter when overlap is only partially violated.

Core claim

When the outcome is bounded, non-overlap bounds give partial identification of the ATE whose width equals the measure of the non-overlap set times the length of the outcome interval. Smooth versions of these bounds admit a targeted minimum loss-based estimator that is asymptotically normal, and a multiplier bootstrap constructs uniformly valid confidence intervals over all overlap regimes and smoothing parameters.
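To illustrate what smoothing an indicator means here, the following is a minimal sketch using a generic logistic surrogate for the hard indicator 1{x ≤ c}. It is not the paper's sl(x, c, γ) from display (4), whose exact form and Property 1 are not reproduced on this page; the function name `smooth_leq` and the evaluation grid are assumptions made for illustration, and the γ values only reuse the numbers quoted in the Figure 1 caption.

```python
import numpy as np

def smooth_leq(x, c, gamma):
    """Logistic surrogate for the indicator 1{x <= c}.

    Hypothetical stand-in, NOT the paper's definition. Smaller gamma gives
    a sharper transition around the threshold c.
    """
    x = np.asarray(x, dtype=float)
    # 1 / (1 + exp((x - c) / gamma)), written with tanh for numerical stability
    return 0.5 * (1.0 - np.tanh((x - c) / (2.0 * gamma)))

# Compare the surrogate with the hard indicator on an illustrative grid.
x = np.linspace(-3.0, 3.0, 601)
c = 0.0
hard = (x <= c).astype(float)

gammas = (1.0, 0.5, 0.25, 0.1)
gaps = [float(np.mean(np.abs(smooth_leq(x, c, g) - hard))) for g in gammas]

for g, gap in zip(gammas, gaps):
    print(f"gamma={g}: mean absolute gap to indicator = {gap:.4f}")
```

The mean gap shrinks as γ decreases, which is the trade-off the smoothing parameter controls: a smaller γ tracks the non-smooth bound more closely but makes the functional less regular. A surrogate used for valid bounds would presumably need to sit on a fixed side of the indicator; this sketch does not enforce that.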

What carries the argument

Non-overlap bounds: partial identification intervals for the ATE whose width equals the length of the outcome interval times the size of the subpopulation where the propensity score is zero or one.
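A sketch of why the width takes this form, written in notation that is assumed here rather than taken from the paper: covariates $X$, binary treatment $A$, outcome $Y \in [a, b]$, propensity score $\pi(X) = P(A = 1 \mid X)$, outcome regressions $\mu_t(X) = E[Y \mid A = t, X]$, and no unmeasured confounding given $X$. This is a reconstruction consistent with the summary above, not the paper's own display.

```latex
\begin{aligned}
\psi &= E[Y(1) - Y(0)] \\
     &= E\big[\{\mu_1(X) - \mu_0(X)\}\,\mathbb{1}\{0 < \pi(X) < 1\}\big] \\
     &\quad + E\big[\{E[Y(1) \mid X] - \mu_0(X)\}\,\mathbb{1}\{\pi(X) = 0\}\big]
            + E\big[\{\mu_1(X) - E[Y(0) \mid X]\}\,\mathbb{1}\{\pi(X) = 1\}\big], \\[4pt]
\psi_L &= E\big[\{\mu_1(X) - \mu_0(X)\}\,\mathbb{1}\{0 < \pi(X) < 1\}\big]
          + E\big[\{a - \mu_0(X)\}\,\mathbb{1}\{\pi(X) = 0\}\big]
          + E\big[\{\mu_1(X) - b\}\,\mathbb{1}\{\pi(X) = 1\}\big], \\
\psi_U &= E\big[\{\mu_1(X) - \mu_0(X)\}\,\mathbb{1}\{0 < \pi(X) < 1\}\big]
          + E\big[\{b - \mu_0(X)\}\,\mathbb{1}\{\pi(X) = 0\}\big]
          + E\big[\{\mu_1(X) - a\}\,\mathbb{1}\{\pi(X) = 1\}\big], \\[4pt]
\psi_U - \psi_L &= (b - a)\,P\{\pi(X) \in \{0, 1\}\}.
\end{aligned}
```

The only unidentified pieces are $E[Y(1) \mid X]$ where nobody is treated and $E[Y(0) \mid X]$ where everybody is treated; each lies in $[a, b]$, so replacing it by an endpoint gives $\psi_L \le \psi \le \psi_U$. For a binary outcome with 5% of the population in the non-overlap region, for example, the width is $1 \times 0.05 = 0.05$.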

If this is right

The bounds remain informative without discarding subjects from the non-overlap region.
Smooth approximations permit root-n consistent estimation under weak conditions.
The multiplier bootstrap supplies valid intervals uniformly over overlap regimes.
Researchers can report the tightest valid interval by varying the smoothing parameter.
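The last two points can be sketched with a generic Gaussian multiplier bootstrap over a finite grid. This is a textbook-style illustration under stated assumptions, not the paper's procedure: `phi` is a hypothetical n-by-K matrix of estimated influence-function values (one column per bound and grid point), the helper names are invented, and the demo data are synthetic noise rather than output from any estimator.

```python
import numpy as np

def sup_t_critical_value(phi, alpha=0.05, n_boot=2000, seed=0):
    """Simultaneous critical value over the K columns of phi.

    phi: (n, K) array of estimated influence-function values (hypothetical
    input). Returns the (1 - alpha) quantile of the bootstrapped maximum
    absolute studentized statistic.
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    n = phi.shape[0]
    phi = phi - phi.mean(axis=0)          # center each column
    sigma = phi.std(axis=0, ddof=1)       # per-column scale
    xi = rng.standard_normal((n_boot, n))  # Gaussian multipliers
    t_stats = (xi @ phi) / (np.sqrt(n) * sigma)  # shape (n_boot, K)
    sup_stats = np.abs(t_stats).max(axis=1)
    return float(np.quantile(sup_stats, 1.0 - alpha))

def tightest_interval(lower_est, upper_est, se_lower, se_upper, crit):
    """Intersect the simultaneous intervals across grid points."""
    lo = np.max(np.asarray(lower_est) - crit * np.asarray(se_lower))
    hi = np.min(np.asarray(upper_est) + crit * np.asarray(se_upper))
    return float(lo), float(hi)

# Synthetic demo: 500 observations, 5 grid points, independent noise columns.
demo_rng = np.random.default_rng(1)
phi_demo = demo_rng.standard_normal((500, 5))
crit = sup_t_critical_value(phi_demo)
print("simultaneous critical value:", round(crit, 3))
```

Because the critical value is calibrated to the maximum over all columns, every interval on the grid holds simultaneously, and taking the largest lower limit and smallest upper limit does not cost coverage. The paper claims uniformity over all smoothing parameters and non-overlap sizes; a finite grid is the simplification made here.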

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bounding strategy may extend directly to other functionals such as the ATT.
In observational studies with limited overlap, reporting these bounds could replace or supplement trimming.
The uniform validity result suggests similar bootstrap constructions for other non-smooth causal functionals.

Load-bearing premise

The outcome variable lies in a known bounded interval.

What would settle it

A data example or simulation in which the true ATE lies strictly outside the computed non-overlap bounds, with known outcome range and known non-overlap proportion, would contradict the partial identification claim.
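A population-level version of such a check can be sketched as follows, assuming a discrete covariate with three strata, a binary outcome on [0, 1], and no unmeasured confounding. All numbers and the function name `non_overlap_bounds` are hypothetical; unidentified conditional means are passed as NaN to mimic what the observed data can and cannot reveal.

```python
import numpy as np

def non_overlap_bounds(p_x, pi, mu1, mu0, a, b):
    """Population bounds on the ATE for a discrete covariate.

    p_x: stratum probabilities; pi: propensity score per stratum;
    mu1, mu0: identified outcome regressions (NaN where unidentified);
    [a, b]: known outcome range.
    """
    p_x, pi, mu1, mu0 = (np.asarray(v, dtype=float) for v in (p_x, pi, mu1, mu0))
    never = pi == 0.0    # nobody treated: E[Y(1) | X] unidentified
    always = pi == 1.0   # everybody treated: E[Y(0) | X] unidentified
    lo1, hi1 = np.where(never, a, mu1), np.where(never, b, mu1)
    lo0, hi0 = np.where(always, a, mu0), np.where(always, b, mu0)
    lower = float(np.sum(p_x * (lo1 - hi0)))
    upper = float(np.sum(p_x * (hi1 - lo0)))
    return lower, upper

p_x = [0.90, 0.05, 0.05]   # 10% of the population lacks overlap
pi = [0.5, 0.0, 1.0]
a, b = 0.0, 1.0

# Truth, known only because this is a constructed example.
mu1_true = np.array([0.6, 0.7, 0.8])
mu0_true = np.array([0.4, 0.3, 0.5])
true_ate = float(np.sum(np.array(p_x) * (mu1_true - mu0_true)))

# What the observed data identify.
mu1_obs = [0.6, np.nan, 0.8]
mu0_obs = [0.4, 0.3, np.nan]

lower, upper = non_overlap_bounds(p_x, pi, mu1_obs, mu0_obs, a, b)
print("bounds:", lower, upper, "true ATE:", true_ate, "width:", upper - lower)
```

In this constructed case the true ATE falls inside the interval and the width matches (b − a) times the 10% non-overlap mass up to floating-point error. A counterexample to the paper's claim would be a configuration satisfying the same assumptions where the true ATE falls outside.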

Figures

Figures reproduced from arXiv: 2509.20206 by Alec McClean, Herbert P. Susmann, Iván Díaz.

**Figure 1.** Example smooth approximations sl(x, c, γ) (A) and sg(x, c, γ) (B) as defined in (4) with smoothness γ ∈ {1, 0.5, 0.25, 0.1}. The next result shows that if the smooth approximation functions satisfy Property 1 then they yield smooth bounds on the ATE. Proposition 2 (Smooth non-overlap bounds). Under the conditions of Proposition 1, suppose sl(x, c, γ) and sg(x, c, γ) satisfy Property 1. Then, E[…

**Figure 2.** Uniform 95% non-overlap bounds (for γ = 0.01) on the average treatment effect (ATE) of right heart catheterization on survival. The points illustrate the lower and upper bounds with respect to a logarithmic grid of propensity score thresholds. The lines between points are solely to guide the eye. The horizontal dotted lines indicate the tightest valid 95% uncertainty interval that may be formed from the no…

**Figure 3.** Uniform 95% non-overlap bounds (for γ = 0.001) on the average treatment effect (ATE) of right heart catheterization on survival. The points illustrate the lower and upper bounds with respect to a logarithmic grid of propensity score thresholds. The lines between points are solely to guide the eye. The horizontal dotted lines indicate the tightest valid 95% uncertainty interval that may be formed from the n…

Original abstract

The average treatment effect (ATE), the mean difference in potential outcomes under treatment and control, is a canonical causal effect. Overlap, which says that all subjects have non-zero probability of either treatment status, is necessary to identify and estimate the ATE. When overlap fails, the standard solution is to change the estimand, and target a trimmed effect in a subpopulation satisfying overlap. When the outcome is bounded, we demonstrate that this compromise is unnecessary. We derive non-overlap bounds: partial identification bounds on the ATE that do not require overlap. The bounds have width proportional to the size of the non-overlap subpopulation, making them informative in common scenarios when overlap violations are limited. Since the bounds are non-smooth functionals, we derive smooth approximations amenable to semiparametric efficiency theory and propose a Targeted Minimum Loss-Based estimator that is $\sqrt{n}$-consistent and asymptotically normal under nonparametric conditions. A multiplier bootstrap procedure yields uniformly valid confidence sets across all non-overlap subpopulation sizes and smoothing parameters, allowing researchers to report the tightest valid interval. Formally, we compare non-overlap confidence intervals to confidence intervals based on point estimation across multiple overlap regimes. We illustrate the method via simulation studies and real-world data applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives tight non-overlap ATE bounds for bounded outcomes plus a working TMLE and uniform bootstrap.

This paper's main contribution is a set of non-overlap bounds on the ATE that rely on outcome boundedness rather than trimming the population. The bounds are informative precisely when overlap violations are modest, which covers a lot of real applications. The width comes out exactly as (b-a) times the non-overlap probability, which is the narrowest interval the data and boundedness assumption can support. They then smooth the functional, apply semiparametric efficiency theory, and deliver a TMLE that is root-n consistent and asymptotically normal, with a multiplier bootstrap that maintains uniform coverage over smoothing choices and overlap sizes. That combination is the practical part. The argument tracks directly from replacing unidentified counterfactuals with the known bounds in the non-overlap region, and the stress-test confirms no internal contradiction or missing regularity condition. They also run the formal comparison to point estimators across overlap regimes and include simulations plus real-data examples, which helps show when the bounds stay useful. A soft spot is the reliance on a known, reasonably tight interval for the outcome; if the interval is wide or only loosely justified the resulting bounds can still be too broad to change practice. In cases with large non-overlap the intervals may not be actionable anyway, and some readers will still prefer trimming depending on the question. This is aimed at causal researchers who hit modest overlap failures with bounded outcomes and want to keep the full population. It deserves a serious referee because the core derivation is clean and the estimator development looks careful.

Referee Report

2 major / 2 minor

Summary. The manuscript derives partial identification bounds for the average treatment effect (ATE) that avoid the overlap assumption when the outcome is known to lie in a fixed interval [a, b]. These non-overlap bounds have width exactly (b − a) times the probability of the non-overlap subpopulation. To permit estimation and inference, the authors introduce smooth approximations to the indicator functions defining the bounds, construct a Targeted Minimum Loss-Based Estimator (TMLE) that is √n-consistent and asymptotically normal under nonparametric conditions, and develop a multiplier bootstrap that delivers uniformly valid confidence sets across all non-overlap probabilities and smoothing parameters. The method is compared with point-estimation approaches under varying overlap regimes and is illustrated with simulations and real-data examples.

Significance. If the central derivation and asymptotic results hold, the paper supplies a practically useful alternative to trimming or redefining the target population when overlap fails. The bounds remain informative whenever the non-overlap mass is small, and the accompanying semiparametric estimator and uniform bootstrap allow researchers to report the tightest valid interval without sacrificing √n rates. The explicit link between bound width and non-overlap probability, together with the machine-checkable sharpness argument noted in the skeptic review, constitutes a clear methodological advance for partial identification in causal inference.

major comments (2)

[§3.1] §3.1, display (3): the claim that the non-overlap bounds are sharp requires an explicit construction showing that the lower and upper bounds are attained by some joint distribution of the observed data and potential outcomes that is compatible with the observed marginals; without this construction the partial-identification statement remains informal.
[§5.2] §5.2, Theorem 2: the uniform validity of the multiplier bootstrap is stated to hold for all smoothing parameters λ_n and all non-overlap probabilities π_n; the proof must verify that the remainder term is o_p(1) uniformly when π_n → 0 at arbitrary rates, because the influence function degenerates in that limit.

minor comments (2)

[§4] The notation for the smoothed indicator functions (e.g., the logistic or kernel approximation) should be introduced once in §4 and then used consistently; currently the same symbol appears with different definitions in the text and the appendix.
[Table 2] Table 2 reports coverage probabilities but omits the average length of the confidence intervals; adding this column would allow direct comparison of informativeness across methods.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review, which includes a positive recommendation for minor revision. We address each major comment in turn below, providing the strongest honest responses consistent with the manuscript. Revisions will be incorporated as indicated.

Referee: [§3.1] §3.1, display (3): the claim that the non-overlap bounds are sharp requires an explicit construction showing that the lower and upper bounds are attained by some joint distribution of the observed data and potential outcomes that is compatible with the observed marginals; without this construction the partial-identification statement remains informal.

Authors: We agree that an explicit construction of extremal distributions strengthens the sharpness claim and removes any informality. In the revised manuscript we add a self-contained construction: for the lower bound we set the potential outcomes in the non-overlap region to their minimal feasible values consistent with the observed marginals of Y and the treatment mechanism; symmetrically for the upper bound. This joint distribution matches the observed data law exactly and attains the stated bounds, confirming sharpness under the maintained bounded-outcome assumption.

Revision: yes

Referee: [§5.2] §5.2, Theorem 2: the uniform validity of the multiplier bootstrap is stated to hold for all smoothing parameters λ_n and all non-overlap probabilities π_n; the proof must verify that the remainder term is o_p(1) uniformly when π_n → 0 at arbitrary rates, because the influence function degenerates in that limit.

Authors: The manuscript already states uniform validity over all λ_n and π_n (including sequences where π_n → 0). To address the referee’s concern about possible degeneration of the influence function, the revised appendix supplies an additional uniform bound on the remainder term that holds for arbitrary rates of π_n → 0. The argument proceeds by splitting the remainder into a term controlled by the smoothing bias (which vanishes uniformly in λ_n) and a term controlled by the empirical process, whose envelope remains integrable even as the influence function norm approaches zero; the multiplier bootstrap then inherits the same uniform o_p(1) property.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper's central derivation replaces unidentified counterfactual expectations in non-overlap regions with the extremal values permitted by the known outcome bounds [a, b], yielding an ATE interval whose width equals (b - a) times the probability of the non-overlap subpopulation. This follows directly from the definition of the ATE under the boundedness assumption and does not reduce to any fitted parameter, self-referential definition, or load-bearing self-citation. The subsequent smooth approximations, TMLE, and multiplier bootstrap are constructed to target this independently derived functional under standard semiparametric conditions; the core bounds remain logically prior to and independent of the estimation steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that outcomes are bounded, which enables the partial identification result without overlap.

axioms (1)

Domain assumption: the outcome is bounded. Invoked to derive non-overlap bounds on the ATE.

pith-pipeline@v0.9.0 · 5748 in / 1181 out tokens · 52446 ms · 2026-05-18T13:47:52.735523+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages

[1]

M., Hall, W., Huang, W.-M., Wellner, J

Begun, J. M., Hall, W., Huang, W.-M., Wellner, J. A., et al. (1983). Information and asymptotic efficiency in parametric-nonparametric models. The Annals of Statistics , 11(2):432--452

work page 1983
[2]

and Keele, L

Ben-Michael, E. and Keele, L. (2023). Using balancing weights to target the treatment effect on the treated when overlap is poor. Epidemiology , 34(5)

work page 2023
[3]

J., Klaassen, C

Bickel, P. J., Klaassen, C. A., Ritov, Y., and Wellner, J. A. (1997). Efficient and Adaptive Estimation for Semiparametric Models . Springer-Verlag

work page 1997
[4]

and Kennedy, E

Bonvini, M. and Kennedy, E. H. (2022). Sensitivity analysis via the proportion of unmeasured confounding. Journal of the American Statistical Association , 117(539):1540--1550

work page 2022
[5]

H., Balakrishnan, S., and Wasserman, L

Branson, Z., Kennedy, E. H., Balakrishnan, S., and Wasserman, L. (2024). Causal effect estimation after propensity score trimming with continuous treatments

work page 2024
[6]

Busso, M., DiNardo, J., and McCrary, J. (2014). New evidence on the finite sample properties of propensity score reweighting and matching estimators. The Review of Economics and Statistics , 96(5):885--897

work page 2014
[7]

Chen, Q., Syrgkanis, V., and Austern, M. (2022). Debiased machine learning without sample-splitting for stable estimators. Advances in Neural Information Processing Systems , 35:3096--3109

work page 2022
[8]

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., et al. (2016). Double machine learning for treatment and causal parameters. arXiv preprint arXiv:1608.00060

work page arXiv 2016
[9]

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal , 21(1):C1--C68

work page 2018
[10]

V., Thomas, C., Harrell, Frank E., J., Wagner, D., Desbiens, N., Goldman, L., Wu, A

Connors, Alfred F., J., Speroff, T., Dawson, N. V., Thomas, C., Harrell, Frank E., J., Wagner, D., Desbiens, N., Goldman, L., Wu, A. W., Califf, R. M., Fulkerson, William J., J., Vidaillet, H., Broste, S., Bellamy, P., Lynn, J., and Knaus, W. A. (1996). The effectiveness of right heart catheterization in the initial care of critically iii patients. JAMA ,...

work page 1996
[11]

C., Lilienfeld, A

Cornfield, J., Haenszel, W., Hammond, E. C., Lilienfeld, A. M., Shimkin, M. B., and Wynder, E. L. (1959). Smoking and lung cancer: Recent evidence and a discussion of some questions. JNCI: Journal of the National Cancer Institute , 22(1):173--203

work page 1959
[12]

K., Hotz, V

Crump, R. K., Hotz, V. J., Imbens, G. W., and Mitnik, O. A. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika , 96(1):187--199

work page 2009
[13]

E., Gruber, S., Lee, H., Dahabreh, I

Dang, L. E., Gruber, S., Lee, H., Dahabreh, I. J., Stuart, E. A., Williamson, B. D., Wyss, R., Díaz, I., Ghosh, D., Kıcıman, E., and et al. (2023). A causal roadmap for generating high-quality real-world evidence. Journal of Clinical and Translational Science , 7(1):e212

work page 2023
[14]

and van der Laan, M

D \' az, I. and van der Laan, M. J. (2013). Assessing the causal effect of policies: an example using stochastic interventions. The international journal of biostatistics , 9(2):161--174

work page 2013
[15]

L., and Schenck, E

D \' az, I., Williams, N., Hoffman, K. L., and Schenck, E. J. (2023). Nonparametric causal effects based on longitudinal modified treatment policies. Journal of the American Statistical Association , 118(542):846--857

work page 2023
[16]

Fernholz, L. T. (1983). von Mises Calculus For Statistical Functionals . Springer New York, New York, NY

work page 1983
[17]

and Stuart, E

Greifer, N. and Stuart, E. A. (2023). Choosing the causal estimand for propensity score analysis of observational studies

work page 2023
[18]

and van der Laan, M

Gruber, S. and van der Laan, M. J. (2010). A targeted maximum likelihood estimator of a causal effect on a bounded continuous outcome. The International Journal of Biostatistics , 6(1)

work page 2010
[19]

z ak, A., and Walk, H

Gy\" o rfi, L., Kohler, M., Krzy\. z ak, A., and Walk, H. (2002). A Distribution-Free Theory of Nanparametric Regression . Springer-Verlag, New York

work page 2002
[20]

Hern \'a n, M. A. and Robins, J. M. (2020). Causal Inference: What If . Chapman & Hall/CRC, Boca Raton

work page 2020
[21]

and Imbens, G

Hirano, K. and Imbens, G. W. (2001). Estimation of causal effects using propensity score weighting: An application to data on right heart catheterization. Health Services and Outcomes Research Methodology , 2(3):259--278

work page 2001
[22]

Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. The Review of Economics and Statistics , 86(1):4--29

work page 2004
[23]

Imbens, G. W. and Manski, C. F. (2004). Confidence intervals for partially identified parameters. Econometrica , 72(6):1845--1857

work page 2004
[24]

and Schafer, J

Kang, J. and Schafer, J. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data (with discussion). Statistical Science , 22:523--39

work page 2007
[25]

Karavani, E., Bak, P., and Shimoni, Y. (2019). A discriminative approach for finding and characterizing positivity violations using decision trees

work page 2019
[26]

Kennedy, E. H. (2019). Nonparametric causal effects based on incremental propensity score interventions. Journal of the American Statistical Association , 114(526):645--656

work page 2019
[27]

Kennedy, E. H. (2024). Semiparametric doubly robust targeted double machine learning: A review. In Handbook of Statistical Methods for Precision Medicine , chapter 10, pages 207--236. Chapman and Hall/CRC, 1 edition

work page 2024
[28]

H., Balakrishnan, S., and Wasserman, L

Kennedy, E. H., Balakrishnan, S., and Wasserman, L. A. (2023). Semiparametric counterfactual density estimation. Biometrika , 110(4):875--896

work page 2023
[29]

K., Lessler, J., and Stuart, E

Lee, B. K., Lessler, J., and Stuart, E. A. (2011). Weight trimming and propensity score weighting. PLOS ONE , 6(3):1--6

work page 2011
[30]

and Weidner, M

Lee, S. and Weidner, M. (2021a). ATbounds: Bounding Treatment Effects by Limited Information Pooling . R package version 0.1.0

work page
[31]

and Weidner, M

Lee, S. and Weidner, M. (2021b). Bounding treatment effects by pooling limited information across observations

work page
[32]

W., Bonvini, M., Zeng, Z., Keele, L., and Kennedy, E

Levis, A. W., Bonvini, M., Zeng, Z., Keele, L., and Kennedy, E. H. (2025). Covariate-assisted bounds on causal effects with instrumental variables. Journal of the Royal Statistical Society Series B: Statistical Methodology , page qkaf028

work page 2025
[33]

L., and Zaslavsky, A

Li, F., Morgan, K. L., and Zaslavsky, A. M. (2018a). Balancing covariates via propensity score weighting. Journal of the American Statistical Association , 113(521):390--400

work page
[34]

E., and Li, F

Li, F., Thomas, L. E., and Li, F. (2018b). Addressing extreme propensity scores via the overlap weights. American Journal of Epidemiology , 188(1):250--257

work page
[35]

Y., Psaty, B

Lin, D. Y., Psaty, B. M., and Kronmal, R. A. (1998). Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics , 54(3):948--963

work page 1998
[36]

Ma, X., Sasaki, Y., and Wang, Y. (2024). testing limited overlap. Econometric Theory , page 1–34

work page 2024
[37]

Manski, C. F. (1990). Nonparametric bounds on treatment effects. The American Economic Review , 80(2):319--323

work page 1990
[38]

Manski, C. F. (1997). Monotone treatment response. Econometrica , 65(6):1311--1334

work page 1997
[39]

Manski, C. F. and Pepper, J. V. (2000). Monotone instrumental variables: With an application to the returns to schooling. Econometrica , 68(4):997--1010

work page 2000
[40]

and Díaz, I

McClean, A. and Díaz, I. (2025). Propensity score weighting across counterfactual worlds: longitudinal effects under positivity violations

work page 2025
[41]

Murphy, D. J. and Cluff, L. E. (1990). The SUPPORT study. Journal of Clinical Epidemiology , 43:V--X

work page 1990
[42]

H., Huang, M.-Y., Smid, M., and Scharfstein, D

Nabi, R., Bonvini, M., Kennedy, E. H., Huang, M.-Y., Smid, M., and Scharfstein, D. O. (2024). Semiparametric sensitivity analysis: unmeasured confounding in observational studies. Biometrics , 80(4):ujae106

work page 2024
[43]

L., Porter, K

Petersen, M. L., Porter, K. E., Gruber, S., Wang, Y., and van der Laan, M. J. (2012). Diagnosing and responding to violations in the positivity assumption. Statistical Methods in Medical Research , 21(1):31--54. PMID: 21030422

work page 2012
[44]

Petersen, M. L. and van der Laan, M. J. (2014). Causal models and learning from data: Integrating causal modeling and statistical estimation. Epidemiology , 25(3)

work page 2014
[45]

G., Gilbert, P

Richardson, A., Hudgens, M. G., Gilbert, P. B., and Fine, J. P. (2014). Nonparametric Bounds and Sensitivity Analysis of Treatment Effects . Statistical Science , 29(4):596 -- 618

work page 2014
[46]

Rizk, J. G. (2025). When and why to use overlap weighting: Clarifying its role, assumptions, and estimand in real-world studies. Journal of Clinical Epidemiology , page 111942

work page 2025
[47]

Robins, J. (1989). The analysis of randomized and non-randomized aids treatment trials using a new approach in causal inference in longitudinal studies. In Sechrest, L., Freeman, H., and Mulley, A., editors, Health Service Methodology: A Focus on AIDS , pages 113--159. U.S. Public Health Service, National Center for Health SErvices Research, Washington D.C

work page 1989
[48]

Robins, J., Li, L., Tchetgen Tchetgen , E., and van der Vaart, A. W. (2009). Quadratic semiparametric von mises calculus. Metrika , 69(2-3):227--247

work page 2009
[49]

inverse probability

Robins, J., Sued, M., Lei-Gomez, Q., and Rotnitzky, A. (2007). Comment: Performance of double-robust estimators when" inverse probability" weights are highly variable. Statistical Science , 22(4):544--559

work page 2007
[50]

Robins, J. M. (1986). A new approach to causal inference in mortality studies with sustained exposure periods - application to control of the healthy worker survivor effect. Mathematical Modelling , 7:1393--1512

work page 1986
[51]

Robins, J. M. and Rotnitzky, A. (1995). Semiparametric efficiency in multivariate regression models with missing data. Journal of the American Statistical Association , 90(429):122--129

work page 1995
[52]

Rosenbaum, P. R. (2012). Optimal matching of an optimally chosen subset in observational studies. Journal of Computational and Graphical Statistics , 21(1):57--71

work page 2012
[53]

Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika , 70(1):41--55

work page 1983
[54]

Rothe, C. (2017). robust confidence intervals for average treatment effects under limited overlap. Econometrica , 85(2):645--660

work page 2017
[55]

Schlesselman, J. J. (1978). Assessing effects of confounding variables. American Journal of Epidemiology , 108(1):3--8

work page 1978
[56]

Schomaker, M., McIlleron, H., Denti, P., and Díaz, I. (2024). Causal inference for continuous multiple time point interventions. Statistics in Medicine , 43(28):5380--5400

work page 2024
[57]

J., Avorn, J., and Glynn, R

Stürmer, T., Rothman, K. J., Avorn, J., and Glynn, R. J. (2010). Treatment effects in the presence of unmeasured confounding: Dealing with observations in the tails of the propensity score distribution—a simulation study. American Journal of Epidemiology , 172(7):843--854

work page 2010
[58]

L., Wyss, R., Ellis, A

Stürmer, T., Webster-Clark, M., Lund, J. L., Wyss, R., Ellis, A. R., Lunt, M., Rothman, K. J., and Glynn, R. J. (2021). Propensity score weighting and trimming strategies for reducing variance and bias of treatment effect estimates: A simulation study. American Journal of Epidemiology , 190(8):1659--1670

work page 2021
[59]

and Small, D

Traskin, M. and Small, D. S. (2011). Defining the study population for an observational study to ensure sufficient overlap: A tree approach. Statistics in Biosciences , 3(1):94--118

work page 2011
[60]

Tsiatis, A. A. (2006). Semiparametric Theory & Missing Data . Springer

work page 2006
[61]

van der Laan, M. J. and Luedtke, A. R. (2014). Targeted learning of an optimal dynamic treatment, and statistical inference for its mean outcome. Working Paper 317, U.C. Berkeley Division of Biostatistics Working Paper Series

work page 2014
[62]

van der Laan , M. J. and Robins, J. M. (2003). Unified Methods for Censored Longitudinal Data and Causality . Springer, New York

work page 2003
[63]

van der Laan , M. J. and Rose, S. (2011). Targeted Learning: Causal Inference for Observational and Experimental Data . Springer, New York

work page 2011
[64]

van der Laan , M. J. and Rubin, D. (2006). Targeted maximum likelihood learning. The International Journal of Biostatistics , 2(1)

work page 2006
[65]

van der Vaart, A. W. (1998). Asymptotic Statistics . Cambridge University Press

work page 1998
[66]

van der Vaart , A. W. and Wellner, J. A. (1996). Weak C onvergence and E mprical P rocesses . Springer-Verlag New York

work page 1996
[67]

VanderWeele, T. J. and Arah, O. A. (2011). Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders. Epidemiology , 22(1):42--52

work page 2011
[68]

and Zubizarreta, J

Visconti, G. and Zubizarreta, J. R. (2018). Handling limited overlap in observational studies with cardinality matching. Observational Studies , 4(1):217--249

work page 2018
[69]

von Mises , R. (1947). On the asymptotic distribution of differentiable statistical functions. The annals of mathematical statistics , 18(3):309--348

work page 1947
[70]

Whitehouse, J., Austern, M., and Syrgkanis, V. (2025). Inference on optimal policy values and other irregular functionals via smoothing

work page 2025
[71]

Wolf, G., Shabat, G., and Shteingart, H. (2021). Positivity validation detection and explainability via zero fraction multi-hypothesis testing and asymmetrically pruned decision trees

work page 2021
[72]

and van der Laan, M

Zheng, W. and van der Laan, M. J. (2011). Cross-validated targeted minimum-loss-based estimation. In Targeted Learning , pages 459--474. Springer

work page 2011
[73]

A., and Thomas, L

Zhou, Y., Matsouaka, R. A., and Thomas, L. (2020). Propensity score weighting under limited overlap and model misspecification. Statistical Methods in Medical Research , 29(12):3721--3756. PMID: 32693715

work page 2020
[74]

A., Chubak, J., Roy, J., and Mitra, N

Zhu, Y., Hubbard, R. A., Chubak, J., Roy, J., and Mitra, N. (2021). Core concepts in pharmacoepidemiology: Violations of the positivity assumption in the causal analysis of observational data: Consequences and statistical approaches. Pharmacoepidemiology and Drug Safety , 30(11):1471--1485

work page 2021
[75]

N., Edwards, J

Zivich, P. N., Edwards, J. K., Lofgren, E. T., Cole, S. R., Shook-Sa, B. E., and Lessler, J. (2024a). Transportability without positivity: A synthesis of statistical and simulation modeling. Epidemiology , 35(1)

work page
[76]

N., Edwards, J

Zivich, P. N., Edwards, J. K., Shook-Sa, B. E., Lofgren, E. T., Lessler, J., and Cole, S. R. (2024b). Synthesis estimators for transportability with positivity violations by a continuous covariate. Journal of the Royal Statistical Society Series A: Statistics in Society , 188(1):158--180

work page

[1] [1]

M., Hall, W., Huang, W.-M., Wellner, J

Begun, J. M., Hall, W., Huang, W.-M., Wellner, J. A., et al. (1983). Information and asymptotic efficiency in parametric-nonparametric models. The Annals of Statistics , 11(2):432--452

work page 1983

[2] [2]

and Keele, L

Ben-Michael, E. and Keele, L. (2023). Using balancing weights to target the treatment effect on the treated when overlap is poor. Epidemiology , 34(5)

work page 2023

[3] [3]

J., Klaassen, C

Bickel, P. J., Klaassen, C. A., Ritov, Y., and Wellner, J. A. (1997). Efficient and Adaptive Estimation for Semiparametric Models . Springer-Verlag

work page 1997

[4] [4]

and Kennedy, E

Bonvini, M. and Kennedy, E. H. (2022). Sensitivity analysis via the proportion of unmeasured confounding. Journal of the American Statistical Association , 117(539):1540--1550

work page 2022

[5] [5]

H., Balakrishnan, S., and Wasserman, L

Branson, Z., Kennedy, E. H., Balakrishnan, S., and Wasserman, L. (2024). Causal effect estimation after propensity score trimming with continuous treatments

work page 2024

[6] [6]

Busso, M., DiNardo, J., and McCrary, J. (2014). New evidence on the finite sample properties of propensity score reweighting and matching estimators. The Review of Economics and Statistics , 96(5):885--897

work page 2014

[7]

Chen, Q., Syrgkanis, V., and Austern, M. (2022). Debiased machine learning without sample-splitting for stable estimators. Advances in Neural Information Processing Systems, 35:3096–3109.

[8]

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., et al. (2016). Double machine learning for treatment and causal parameters. arXiv preprint arXiv:1608.00060.

[9]

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1):C1–C68.

[10]

Connors, A. F., Jr., Speroff, T., Dawson, N. V., Thomas, C., Harrell, F. E., Jr., Wagner, D., Desbiens, N., Goldman, L., Wu, A. W., Califf, R. M., Fulkerson, W. J., Jr., Vidaillet, H., Broste, S., Bellamy, P., Lynn, J., and Knaus, W. A. (1996). The effectiveness of right heart catheterization in the initial care of critically ill patients. JAMA, ...

[11]

Cornfield, J., Haenszel, W., Hammond, E. C., Lilienfeld, A. M., Shimkin, M. B., and Wynder, E. L. (1959). Smoking and lung cancer: Recent evidence and a discussion of some questions. JNCI: Journal of the National Cancer Institute, 22(1):173–203.

[12]

Crump, R. K., Hotz, V. J., Imbens, G. W., and Mitnik, O. A. (2009). Dealing with limited overlap in estimation of average treatment effects. Biometrika, 96(1):187–199.

[13]

Dang, L. E., Gruber, S., Lee, H., Dahabreh, I. J., Stuart, E. A., Williamson, B. D., Wyss, R., Díaz, I., Ghosh, D., Kıcıman, E., et al. (2023). A causal roadmap for generating high-quality real-world evidence. Journal of Clinical and Translational Science, 7(1):e212.

[14]

Díaz, I. and van der Laan, M. J. (2013). Assessing the causal effect of policies: an example using stochastic interventions. The International Journal of Biostatistics, 9(2):161–174.

[15]

Díaz, I., Williams, N., Hoffman, K. L., and Schenck, E. J. (2023). Nonparametric causal effects based on longitudinal modified treatment policies. Journal of the American Statistical Association, 118(542):846–857.

[16]

Fernholz, L. T. (1983). von Mises Calculus For Statistical Functionals. Springer New York, New York, NY.

[17]

Greifer, N. and Stuart, E. A. (2023). Choosing the causal estimand for propensity score analysis of observational studies.

[18]

Gruber, S. and van der Laan, M. J. (2010). A targeted maximum likelihood estimator of a causal effect on a bounded continuous outcome. The International Journal of Biostatistics, 6(1).

[19]

Györfi, L., Kohler, M., Krzyżak, A., and Walk, H. (2002). A Distribution-Free Theory of Nonparametric Regression. Springer-Verlag, New York.

[20]

Hernán, M. A. and Robins, J. M. (2020). Causal Inference: What If. Chapman & Hall/CRC, Boca Raton.

[21]

Hirano, K. and Imbens, G. W. (2001). Estimation of causal effects using propensity score weighting: An application to data on right heart catheterization. Health Services and Outcomes Research Methodology, 2(3):259–278.

[22]

Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. The Review of Economics and Statistics, 86(1):4–29.

[23]

Imbens, G. W. and Manski, C. F. (2004). Confidence intervals for partially identified parameters. Econometrica, 72(6):1845–1857.

[24]

Kang, J. and Schafer, J. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data (with discussion). Statistical Science, 22:523–39.

[25]

Karavani, E., Bak, P., and Shimoni, Y. (2019). A discriminative approach for finding and characterizing positivity violations using decision trees.

[26]

Kennedy, E. H. (2019). Nonparametric causal effects based on incremental propensity score interventions. Journal of the American Statistical Association, 114(526):645–656.

[27]

Kennedy, E. H. (2024). Semiparametric doubly robust targeted double machine learning: A review. In Handbook of Statistical Methods for Precision Medicine, chapter 10, pages 207–236. Chapman and Hall/CRC, 1st edition.

[28]

Kennedy, E. H., Balakrishnan, S., and Wasserman, L. A. (2023). Semiparametric counterfactual density estimation. Biometrika, 110(4):875–896.

[29]

Lee, B. K., Lessler, J., and Stuart, E. A. (2011). Weight trimming and propensity score weighting. PLOS ONE, 6(3):1–6.

[30]

Lee, S. and Weidner, M. (2021a). ATbounds: Bounding Treatment Effects by Limited Information Pooling. R package version 0.1.0.

[31]

Lee, S. and Weidner, M. (2021b). Bounding treatment effects by pooling limited information across observations.

[32]

Levis, A. W., Bonvini, M., Zeng, Z., Keele, L., and Kennedy, E. H. (2025). Covariate-assisted bounds on causal effects with instrumental variables. Journal of the Royal Statistical Society Series B: Statistical Methodology, page qkaf028.

[33]

Li, F., Morgan, K. L., and Zaslavsky, A. M. (2018a). Balancing covariates via propensity score weighting. Journal of the American Statistical Association, 113(521):390–400.

[34]

Li, F., Thomas, L. E., and Li, F. (2018b). Addressing extreme propensity scores via the overlap weights. American Journal of Epidemiology, 188(1):250–257.

[35]

Lin, D. Y., Psaty, B. M., and Kronmal, R. A. (1998). Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics, 54(3):948–963.

[36]

Ma, X., Sasaki, Y., and Wang, Y. (2024). Testing limited overlap. Econometric Theory, pages 1–34.

[37]

Manski, C. F. (1990). Nonparametric bounds on treatment effects. The American Economic Review, 80(2):319–323.

[38]

Manski, C. F. (1997). Monotone treatment response. Econometrica, 65(6):1311–1334.

[39]

Manski, C. F. and Pepper, J. V. (2000). Monotone instrumental variables: With an application to the returns to schooling. Econometrica, 68(4):997–1010.

[40]

McClean, A. and Díaz, I. (2025). Propensity score weighting across counterfactual worlds: longitudinal effects under positivity violations.

[41]

Murphy, D. J. and Cluff, L. E. (1990). The SUPPORT study. Journal of Clinical Epidemiology, 43:V–X.

[42]

Nabi, R., Bonvini, M., Kennedy, E. H., Huang, M.-Y., Smid, M., and Scharfstein, D. O. (2024). Semiparametric sensitivity analysis: unmeasured confounding in observational studies. Biometrics, 80(4):ujae106.

[43]

Petersen, M. L., Porter, K. E., Gruber, S., Wang, Y., and van der Laan, M. J. (2012). Diagnosing and responding to violations in the positivity assumption. Statistical Methods in Medical Research, 21(1):31–54. PMID: 21030422.

[44]

Petersen, M. L. and van der Laan, M. J. (2014). Causal models and learning from data: Integrating causal modeling and statistical estimation. Epidemiology, 25(3).

[45]

Richardson, A., Hudgens, M. G., Gilbert, P. B., and Fine, J. P. (2014). Nonparametric Bounds and Sensitivity Analysis of Treatment Effects. Statistical Science, 29(4):596–618.

[46]

Rizk, J. G. (2025). When and why to use overlap weighting: Clarifying its role, assumptions, and estimand in real-world studies. Journal of Clinical Epidemiology, page 111942.

[47]

Robins, J. (1989). The analysis of randomized and non-randomized AIDS treatment trials using a new approach in causal inference in longitudinal studies. In Sechrest, L., Freeman, H., and Mulley, A., editors, Health Service Methodology: A Focus on AIDS, pages 113–159. U.S. Public Health Service, National Center for Health Services Research, Washington, D.C.

[48]

Robins, J., Li, L., Tchetgen Tchetgen, E., and van der Vaart, A. W. (2009). Quadratic semiparametric von Mises calculus. Metrika, 69(2-3):227–247.

[49]

Robins, J., Sued, M., Lei-Gomez, Q., and Rotnitzky, A. (2007). Comment: Performance of double-robust estimators when "inverse probability" weights are highly variable. Statistical Science, 22(4):544–559.

[50]

Robins, J. M. (1986). A new approach to causal inference in mortality studies with sustained exposure periods - application to control of the healthy worker survivor effect. Mathematical Modelling, 7:1393–1512.

[51]

Robins, J. M. and Rotnitzky, A. (1995). Semiparametric efficiency in multivariate regression models with missing data. Journal of the American Statistical Association, 90(429):122–129.

[52]

Rosenbaum, P. R. (2012). Optimal matching of an optimally chosen subset in observational studies. Journal of Computational and Graphical Statistics, 21(1):57–71.

[53]

Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55.

[54]

Rothe, C. (2017). Robust confidence intervals for average treatment effects under limited overlap. Econometrica, 85(2):645–660.

[55]

Schlesselman, J. J. (1978). Assessing effects of confounding variables. American Journal of Epidemiology, 108(1):3–8.

[56]

Schomaker, M., McIlleron, H., Denti, P., and Díaz, I. (2024). Causal inference for continuous multiple time point interventions. Statistics in Medicine, 43(28):5380–5400.

[57]

Stürmer, T., Rothman, K. J., Avorn, J., and Glynn, R. J. (2010). Treatment effects in the presence of unmeasured confounding: Dealing with observations in the tails of the propensity score distribution—a simulation study. American Journal of Epidemiology , 172(7):843--854


[58]

Stürmer, T., Webster-Clark, M., Lund, J. L., Wyss, R., Ellis, A. R., Lunt, M., Rothman, K. J., and Glynn, R. J. (2021). Propensity score weighting and trimming strategies for reducing variance and bias of treatment effect estimates: A simulation study. American Journal of Epidemiology, 190(8):1659–1670.

[59]

Traskin, M. and Small, D. S. (2011). Defining the study population for an observational study to ensure sufficient overlap: A tree approach. Statistics in Biosciences, 3(1):94–118.

[60]

Tsiatis, A. A. (2006). Semiparametric Theory & Missing Data. Springer.

[61]

van der Laan, M. J. and Luedtke, A. R. (2014). Targeted learning of an optimal dynamic treatment, and statistical inference for its mean outcome. Working Paper 317, U.C. Berkeley Division of Biostatistics Working Paper Series.

[62]

van der Laan, M. J. and Robins, J. M. (2003). Unified Methods for Censored Longitudinal Data and Causality. Springer, New York.

[63]

van der Laan, M. J. and Rose, S. (2011). Targeted Learning: Causal Inference for Observational and Experimental Data. Springer, New York.

[64]

van der Laan, M. J. and Rubin, D. (2006). Targeted maximum likelihood learning. The International Journal of Biostatistics, 2(1).

[65]

van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press.

[66]

van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer-Verlag New York.

[67]

VanderWeele, T. J. and Arah, O. A. (2011). Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders. Epidemiology, 22(1):42–52.

[68]

Visconti, G. and Zubizarreta, J. R. (2018). Handling limited overlap in observational studies with cardinality matching. Observational Studies, 4(1):217–249.

[69]

von Mises, R. (1947). On the asymptotic distribution of differentiable statistical functions. The Annals of Mathematical Statistics, 18(3):309–348.

[70]

Whitehouse, J., Austern, M., and Syrgkanis, V. (2025). Inference on optimal policy values and other irregular functionals via smoothing.

[71]

Wolf, G., Shabat, G., and Shteingart, H. (2021). Positivity validation detection and explainability via zero fraction multi-hypothesis testing and asymmetrically pruned decision trees.

[72]

Zheng, W. and van der Laan, M. J. (2011). Cross-validated targeted minimum-loss-based estimation. In Targeted Learning, pages 459–474. Springer.

[73]

Zhou, Y., Matsouaka, R. A., and Thomas, L. (2020). Propensity score weighting under limited overlap and model misspecification. Statistical Methods in Medical Research, 29(12):3721–3756. PMID: 32693715.

[74]

Zhu, Y., Hubbard, R. A., Chubak, J., Roy, J., and Mitra, N. (2021). Core concepts in pharmacoepidemiology: Violations of the positivity assumption in the causal analysis of observational data: Consequences and statistical approaches. Pharmacoepidemiology and Drug Safety, 30(11):1471–1485.

[75]

Zivich, P. N., Edwards, J. K., Lofgren, E. T., Cole, S. R., Shook-Sa, B. E., and Lessler, J. (2024a). Transportability without positivity: A synthesis of statistical and simulation modeling. Epidemiology, 35(1).

[76]

Zivich, P. N., Edwards, J. K., Shook-Sa, B. E., Lofgren, E. T., Lessler, J., and Cole, S. R. (2024b). Synthesis estimators for transportability with positivity violations by a continuous covariate. Journal of the Royal Statistical Society Series A: Statistics in Society, 188(1):158–180.