Sample size and power calculations for causal inference of observational studies
Pith reviewed 2026-05-23 04:54 UTC · model grok-4.3
The pith
To calculate the minimal sample size for an observational causal study, it suffices to know two parameters quantifying the confounder-treatment and confounder-outcome associations in addition to standard randomized trial inputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By analyzing the variance of an inverse probability weighting estimator of the average treatment effect, we decompose the power calculation into three components: propensity score distribution, potential outcome distribution, and their correlation. We show that to determine the minimal sample size of an observational study, in addition to the standard inputs in the power calculation of randomized trials, it is sufficient to have two parameters, which quantify the strength of the confounder-treatment and the confounder-outcome association, respectively. For the former, we propose using the Bhattacharyya coefficient, which measures the covariate overlap and, together with the treatment proportion, leads to a uniquely identifiable and easily computable propensity score distribution.
What carries the argument
Variance decomposition of the inverse probability weighting estimator for average treatment effect, with propensity score distribution identified from Bhattacharyya coefficient plus treatment proportion and outcome correlation bounded by R-squared of outcome on covariates.
If this is right
- Minimal sample size follows from standard randomized trial inputs plus the two parameters.
- Propensity score distribution is uniquely identifiable from the Bhattacharyya coefficient and treatment proportion.
- The sensitivity parameter for the outcome association is bounded by the R-squared statistic without needing full covariate distributional assumptions.
- The procedure applies under a parametric propensity score model and semiparametric restricted mean outcome model.
- An R package and online calculator implement the formulas.
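The "standard randomized trial inputs" in the first bullet can be made concrete with the textbook two-sample z-test formula. The sketch below covers that randomized baseline only. It is not the paper's observational formula and not the PSpower API; the function name and argument names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def rct_total_sample_size(delta, sigma, treat_prop=0.5, alpha=0.05, power=0.8):
    """Total n for a two-sided z-test of a difference in means.

    Textbook randomized-trial formula (hypothetical helper, not from the paper):
        n = (z_{1-alpha/2} + z_{power})^2 * sigma^2 / (r * (1 - r) * delta^2)
    where r is the treatment proportion, sigma the common outcome SD,
    and delta the effect size to detect.
    """
    if not 0.0 < treat_prop < 1.0:
        raise ValueError("treat_prop must lie strictly between 0 and 1")
    if delta == 0:
        raise ValueError("delta must be non-zero")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)
    n = (z_alpha + z_power) ** 2 * sigma ** 2 / (
        treat_prop * (1.0 - treat_prop) * delta ** 2
    )
    return ceil(n)


n_balanced = rct_total_sample_size(delta=0.5, sigma=1.0)
n_unbalanced = rct_total_sample_size(delta=0.5, sigma=1.0, treat_prop=0.3)
```

With a standardized effect of 0.5, equal allocation, a two-sided 5% level, and 80% power, the balanced call returns 126 in total (63 per arm). According to the summary above, the observational-study sample size starts from these same inputs and adds the two confounding parameters.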
Where Pith is reading between the lines
- Pilot data could be used to estimate the Bhattacharyya coefficient for study planning.
- The two-parameter approach may combine with existing sensitivity analysis techniques in causal inference.
- Similar variance decompositions could be derived for estimators other than inverse probability weighting.
- Empirical checks comparing predicted versus observed power in completed studies would test practical accuracy.
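The pilot-data idea in the first bullet can be sketched as a plug-in estimate. Assuming fitted propensity scores are available for each pilot unit, the Bhattacharyya coefficient between the treated and control covariate distributions equals the mean of sqrt(e(1-e)) divided by sqrt(pi(1-pi)), where pi is the mean propensity score. The helper names and the simulated logistic model below are illustrative assumptions, not taken from the paper or its software.

```python
import math
import random


def bhattacharyya_from_ps(ps):
    """Plug-in overlap coefficient from propensity scores e(X_i).

    Hypothetical helper: averages sqrt(e * (1 - e)) over units and
    normalizes by sqrt(pi * (1 - pi)), with pi the mean propensity score.
    """
    ps = list(ps)
    if not ps or any(p <= 0.0 or p >= 1.0 for p in ps):
        raise ValueError("propensity scores must lie strictly in (0, 1)")
    pi_hat = sum(ps) / len(ps)
    mean_root = sum(math.sqrt(p * (1.0 - p)) for p in ps) / len(ps)
    return mean_root / math.sqrt(pi_hat * (1.0 - pi_hat))


def logistic_ps(x, slope):
    """Propensity score under an assumed one-covariate logistic model."""
    return 1.0 / (1.0 + math.exp(-slope * x))


rng = random.Random(0)
pilot_x = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# A weak confounder-treatment association gives overlap close to 1;
# a strong one pushes the coefficient down.
bc_weak = bhattacharyya_from_ps(logistic_ps(v, 0.5) for v in pilot_x)
bc_strong = bhattacharyya_from_ps(logistic_ps(v, 3.0) for v in pilot_x)
```

A constant propensity score (as in a randomized trial) gives a coefficient of exactly 1 up to rounding, which is the sense in which the coefficient measures departure from randomization.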
Load-bearing premise
The parametric propensity score model is correctly specified and the outcome association can be bounded using only the R-squared statistic from regressing outcome on covariates.
What would settle it
In a dataset with known confounder strengths, compute the actual variance of the IPW estimator directly and compare it to the variance predicted by the formula that uses only the two proposed parameters; a systematic mismatch would falsify the claim that these two suffice.
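The "compute the actual variance directly" half of this check can be sketched by simulation. The example assumes a Horvitz-Thompson form of the IPW estimator with the true propensity score, a single normal confounder, and a constant treatment effect; all of these are illustrative choices, and the paper's two-parameter formula is not reproduced here. A real check would set the formula's prediction beside `mc_var`.

```python
import math
import random
import statistics


def simulate_ipw(n, reps, slope=0.5, beta=1.0, tau=1.0, seed=0):
    """Monte Carlo variance of an IPW estimate of the ATE.

    Returns (mc_var, oracle_var, mean_estimate). oracle_var uses both
    potential outcomes, so it is only available inside a simulation:
        Var = (E[Y1^2 / e + Y0^2 / (1 - e)] - tau^2) / n
    """
    rng = random.Random(seed)
    estimates = []
    second_moment_sum = 0.0
    for _ in range(reps):
        total = 0.0
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)                  # single confounder
            e = 1.0 / (1.0 + math.exp(-slope * x))   # true propensity score
            y0 = beta * x + rng.gauss(0.0, 1.0)      # outcome depends on x
            y1 = y0 + tau                            # constant effect tau
            treated = rng.random() < e
            # Horvitz-Thompson contribution of this unit
            total += y1 / e if treated else -y0 / (1.0 - e)
            second_moment_sum += y1 * y1 / e + y0 * y0 / (1.0 - e)
        estimates.append(total / n)
    mc_var = statistics.variance(estimates)
    oracle_var = (second_moment_sum / (n * reps) - tau * tau) / n
    return mc_var, oracle_var, statistics.mean(estimates)


mc_var, oracle_var, mean_est = simulate_ipw(n=200, reps=2000)
```

Increasing `slope` (stronger confounder-treatment association) or `beta` (stronger confounder-outcome association) inflates both variances, which is the behavior the two proposed parameters are meant to capture.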
Original abstract
This paper investigates the theoretical foundation and develops analytical formulas for sample size and power calculations for causal inference with observational data. By analyzing the variance of an inverse probability weighting estimator of the average treatment effect, we decompose the power calculation into three components: propensity score distribution, potential outcome distribution, and their correlation. We show that to determine the minimal sample size of an observational study, in addition to the standard inputs in the power calculation of randomized trials, it is sufficient to have two parameters, which quantify the strength of the confounder-treatment and the confounder-outcome association, respectively. For the former, we propose using the Bhattacharyya coefficient, which measures the covariate overlap and, together with the treatment proportion, leads to a uniquely identifiable and easily computable propensity score distribution. For the latter, we propose a sensitivity parameter bounded by the R-squared statistic of the regression of the outcome on covariates. Our procedure relies on a parametric propensity score model and a semiparametric restricted mean outcome model, but does not require distributional assumptions on the multivariate covariates. We develop an associated R package PSpower and an online calculator.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops analytical formulas for sample size and power calculations for estimating the ATE via IPW in observational studies. It decomposes the IPW variance into propensity-score distribution, potential-outcome distribution, and their correlation components. The central claim is that, beyond the usual RCT inputs, only two additional parameters suffice: the Bhattacharyya coefficient (measuring confounder-treatment association and, with treatment proportion, yielding a uniquely identifiable PS distribution under a parametric PS model) and a sensitivity parameter for confounder-outcome association bounded by the R² from regressing the outcome on covariates. The procedure uses a parametric PS model and semiparametric restricted-mean outcome model without distributional assumptions on the multivariate covariates; an R package and online calculator are provided.
Significance. If the identifiability and bounding arguments hold, the work supplies a practical, low-input framework for power analysis in observational causal inference, which is a frequent practical need. The explicit variance decomposition and software release are strengths that would aid reproducibility and adoption. The avoidance of full covariate-distribution assumptions is a positive feature relative to simulation-based alternatives.
major comments (2)
- [Abstract and PS-distribution derivation] Abstract (final paragraph) and the section deriving the PS distribution: the claim that the Bhattacharyya coefficient plus treatment proportion 'leads to a uniquely identifiable' PS distribution under the parametric PS model is load-bearing for the two-parameter sufficiency result. Because the BC equals a single functional E[sqrt(e(X)(1-e(X)))] (normalized by π), uniqueness of the full law of the weights 1/e(X) and 1/(1-e(X)) requires an explicit statement of the low-dimensional parametric family imposed on the distribution of e(X) itself; the manuscript's statement that no assumptions are made on the multivariate law of X leaves open whether this family is an additional modeling choice or is derived.
- [Variance decomposition] Variance-decomposition section (around the IPW variance formula): the correlation term between the PS weights and the potential outcomes must be shown to be either bounded or eliminated by the two parameters without introducing further user inputs; otherwise the reduction to exactly two extra parameters does not follow from the decomposition alone.
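The single functional named in the first major comment can be written out as follows. This is a sketch assuming the usual definition of the Bhattacharyya coefficient between the treated and control covariate densities; the symbols f, f_1, and f_0 are notation introduced here, not taken from the manuscript.

```latex
\begin{align*}
f_1(x) &= \frac{e(x)\, f(x)}{\pi}, \qquad
f_0(x) = \frac{\{1 - e(x)\}\, f(x)}{1 - \pi}, \qquad
\pi = E[e(X)], \\
\mathrm{BC} &= \int \sqrt{f_1(x)\, f_0(x)}\, dx
  = \frac{E\!\left[\sqrt{e(X)\,\{1 - e(X)\}}\right]}{\sqrt{\pi\,(1 - \pi)}}.
\end{align*}
```

Under this definition the normalizing constant is \sqrt{\pi(1-\pi)}, and the Cauchy-Schwarz inequality gives BC \le 1 with equality when e(X) is constant. Because BC and \pi are two scalar summaries of the law of e(X), they can pin down that law only within a family indexed by at most two parameters, which is the point the comment raises.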
minor comments (2)
- [Abstract] The abstract states reliance on 'a parametric propensity score model' but does not name the family (e.g., logistic or beta); adding this detail would improve immediate readability.
- [Software and examples] Figure captions or the software section could include a small numerical example showing how the two parameters translate into a concrete minimal n, to illustrate the formulas.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive major comments. Both points identify areas where additional explicit statements and derivations would strengthen the manuscript. We agree that clarifications are warranted and will revise accordingly.
Point-by-point responses
- Referee: [Abstract and PS-distribution derivation] Abstract (final paragraph) and the section deriving the PS distribution: the claim that the Bhattacharyya coefficient plus treatment proportion 'leads to a uniquely identifiable' PS distribution under the parametric PS model is load-bearing for the two-parameter sufficiency result. Because the BC equals a single functional E[sqrt(e(X)(1-e(X)))] (normalized by π), uniqueness of the full law of the weights 1/e(X) and 1/(1-e(X)) requires an explicit statement of the low-dimensional parametric family imposed on the distribution of e(X) itself; the manuscript's statement that no assumptions are made on the multivariate law of X leaves open whether this family is an additional modeling choice or is derived.
Authors: We agree that the uniqueness result requires an explicit statement of the parametric family on the distribution of e(X). The manuscript already states reliance on a parametric propensity score model; under this model the BC together with the treatment proportion π uniquely determines the parameters of the induced distribution of e(X) (and hence the law of the IPW weights). To remove any ambiguity, we will revise the relevant section and abstract to name the specific low-dimensional parametric family for e(X) (derived directly from the parametric PS model) and to clarify that no further distributional assumptions on the multivariate law of X are introduced beyond those already declared. revision: yes
- Referee: [Variance decomposition] Variance-decomposition section (around the IPW variance formula): the correlation term between the PS weights and the potential outcomes must be shown to be either bounded or eliminated by the two parameters without introducing further user inputs; otherwise the reduction to exactly two extra parameters does not follow from the decomposition alone.
Authors: The referee correctly notes that the correlation term must be controlled by the two parameters. The R²-bounded sensitivity parameter for the confounder-outcome association is intended to bound the feasible range of this correlation (via its effect on the covariance between weights and potential outcomes) without additional user-specified inputs. We will add an explicit bounding argument or derivation in the variance-decomposition section demonstrating that the correlation is indeed governed by these two quantities alone, thereby confirming that exactly two extra parameters suffice. revision: yes
Circularity Check
No significant circularity; parameters are external inputs under explicit parametric assumption
full rationale
The paper's derivation begins from the standard IPW variance formula for the ATE and decomposes it into propensity-score, outcome, and correlation components. It explicitly states reliance on a parametric propensity score model plus the Bhattacharyya coefficient (plus treatment proportion) to obtain an identifiable PS distribution, and on a semiparametric restricted-mean outcome model with an R-squared-bounded sensitivity parameter. Both quantities are introduced as user-supplied external inputs for the sample-size formula rather than quantities fitted from the same data used to estimate the treatment effect. No load-bearing step reduces the target sample size to a fitted constant or to a self-citation chain by construction; the central claim therefore remains independent of its own outputs.
Axiom & Free-Parameter Ledger
free parameters (2)
- Bhattacharyya coefficient
- confounder-outcome sensitivity parameter
axioms (2)
- domain assumption: parametric propensity score model
- domain assumption: semiparametric restricted mean outcome model
Forward citations
Cited by 2 Pith papers
- Estimator-Aligned Prospective Sample Size Determination for Designs Using Inverse Probability of Treatment Weighting
A GEE-based stacked M-estimation framework merges propensity score and marginal structural models to directly compute the large-sample variance of the IPTW estimator from pilot data for prospective sample size plannin...
- Externally Controlled Trials: A Review of Design and Borrowing Through a Causal Lens
A review organizes externally controlled trial methodology through causal estimands and identifiability assumptions for single-arm and hybrid designs with borrowing strategies.