Asymptotic inference with flexible covariate adjustment under rerandomization and stratified rerandomization
Pith reviewed 2026-05-24 00:36 UTC · model grok-4.3
The pith
Rerandomization leaves the asymptotic linearity and influence function of any M-estimator unchanged from simple randomization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. Asymptotic normality can be achieved if rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. The paper also studies the asymptotic theory for efficient estimators based on data-adaptive machine learners and proves their efficiency optimality under rerandomization and stratified rerandomization.
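For orientation, the non-Gaussian limit can be made concrete using a result from outside this page: Li, Ding and Rubin (2018, PNAS) describe the limit of the unadjusted difference-in-means under Mahalanobis-distance rerandomization. The symbols below ($V$, $R^2$, $K$, $a$, $L_{K,a}$) follow that earlier paper. Whether the reviewed paper's limit for general M-estimators takes exactly this form is an assumption made here for illustration.

```latex
% Known form for the unadjusted estimator under rerandomization with
% acceptance rule M <= a on K covariates (Li, Ding and Rubin, 2018):
\sqrt{n}\,(\hat\tau - \tau) \,\big|\, M \le a
  \;\overset{\cdot}{\sim}\;
  \sqrt{V}\left( \sqrt{1 - R^2}\;\varepsilon + \sqrt{R^2}\; L_{K,a} \right),
\qquad
\varepsilon \sim N(0,1), \quad
L_{K,a} \sim D_1 \mid D^\top D \le a, \quad
D \sim N(0, I_K),
% with \varepsilon independent of L_{K,a}; V is the asymptotic variance under
% simple randomization and R^2 is the squared multiple correlation between the
% estimator and the covariate mean differences used for rerandomization.
```

Read this way, the truncated component $L_{K,a}$ is the source of non-normality, and it drops out when $R^2 = 0$. That is one way to interpret the statement that adjusting for the rerandomization variables restores a Gaussian limit.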
What carries the argument
The invariance of the influence function for M-estimators under rerandomization, which shows that the first-order asymptotic behavior is unaffected by the rerandomization procedure.
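The invariance statement can be written out in generic textbook notation, which is assumed here and not taken from the paper: $O_i$ is the observed data for unit $i$, $m(O;\theta)$ is the estimating function of the M-estimator, and $\psi$ is the influence function.

```latex
% M-estimator: \hat\theta solves the estimating equation
\sum_{i=1}^{n} m(O_i; \hat\theta) = 0 .

% Asymptotic linearity with influence function \psi:
\sqrt{n}\,(\hat\theta - \theta_0)
  = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \psi(O_i) + o_p(1),
\qquad
\psi(O) = -\left\{ E\!\left[ \frac{\partial}{\partial\theta}\, m(O;\theta) \Big|_{\theta=\theta_0} \right] \right\}^{-1} m(O;\theta_0),
\qquad
E[\psi(O)] = 0 .
```

Under this reading, the claim is that the same $\psi$ appears in the expansion under both designs. What rerandomization changes is the joint distribution of the treatment assignments inside $(O_1,\dots,O_n)$, so the leading sum need not converge to $N(0, E[\psi^2])$.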
Load-bearing premise
The estimators must satisfy the standard regularity conditions for asymptotic linearity under simple randomization.
What would settle it
A numerical experiment that computes the empirical influence function for an M-estimator under both simple randomization and rerandomization and finds they differ would falsify the invariance claim.
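A simpler, related check can be sketched in code. It does not compute an empirical influence function. Instead it compares the sampling variance of two estimators under simple randomization and under rerandomization on a single covariate. Everything in it is an illustrative assumption: the data-generating model, the acceptance threshold, and the function names `rerandomize`, `ancova`, and `simulate` are hypothetical and not taken from the paper.

```python
import random
from statistics import fmean


def sample_var(v):
    """Unbiased sample variance."""
    m = fmean(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)


def complete_randomization(n, rng):
    """Assign exactly n // 2 units to treatment, uniformly at random."""
    a = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(a)
    return a


def mean_diff(v, a):
    """Treated-minus-control difference in means of v."""
    treated = [vi for vi, ai in zip(v, a) if ai == 1]
    control = [vi for vi, ai in zip(v, a) if ai == 0]
    return fmean(treated) - fmean(control)


def rerandomize(x, rng, threshold):
    """Redraw until the squared standardized mean difference of x is <= threshold."""
    n = len(x)
    n1 = n // 2
    scale = sample_var(x) * (1 / n1 + 1 / (n - n1))
    while True:
        a = complete_randomization(n, rng)
        if mean_diff(x, a) ** 2 / scale <= threshold:
            return a


def ancova(y, a, x):
    """Coefficient on a from least squares of y on (1, a, x)."""
    ab, xb, yb = fmean(a), fmean(x), fmean(y)
    saa = sum((ai - ab) ** 2 for ai in a)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sax = sum((ai - ab) * (xi - xb) for ai, xi in zip(a, x))
    say = sum((ai - ab) * (yi - yb) for ai, yi in zip(a, y))
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    return (say * sxx - sax * sxy) / (saa * sxx - sax ** 2)


def simulate(reps=1000, n=100, tau=1.0, threshold=0.1, seed=0):
    """Monte Carlo variance of each estimator under each design (assumed toy model)."""
    rng = random.Random(seed)
    draws = {(d, e): [] for d in ("simple", "rerandomized")
             for e in ("unadjusted", "adjusted")}
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        noise = [rng.gauss(0, 1) for _ in range(n)]
        for design in ("simple", "rerandomized"):
            if design == "simple":
                a = complete_randomization(n, rng)
            else:
                a = rerandomize(x, rng, threshold)
            # Assumed outcome model: constant effect tau, linear in x.
            y = [tau * ai + 2 * xi + ei for ai, xi, ei in zip(a, x, noise)]
            draws[(design, "unadjusted")].append(mean_diff(y, a))
            draws[(design, "adjusted")].append(ancova(y, a, x))
    return {key: sample_var(vals) for key, vals in draws.items()}


if __name__ == "__main__":
    for key, value in simulate().items():
        print(key, round(value, 4))
```

In this toy setup, the unadjusted estimator should be noticeably less variable under rerandomization. The estimator that adjusts for the rerandomization covariate should have a similar spread under both designs, which is consistent with the claims summarized above.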
Original abstract
Rerandomization is an effective treatment allocation procedure to control for baseline covariate imbalance. For estimating the average treatment effect, rerandomization has been previously shown to improve the precision of the unadjusted and the linearly-adjusted estimators over simple randomization without compromising consistency. However, it remains unclear whether such results apply more generally to the class of M-estimators, including the g-computation formula with generalized linear regression and doubly-robust methods, and more broadly, to efficient estimators with data-adaptive machine learners. In this paper, we develop the asymptotic theory for a more general class of covariate-adjusted estimators under rerandomization and its stratified extension. We prove that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. We further explain, drawing examples from several common M-estimators, that asymptotic normality can be achieved if rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. Finally, we study the asymptotic theory for efficient estimators based on data-adaptive machine learners, and prove their efficiency optimality under rerandomization and stratified rerandomization. Our results are demonstrated via simulations and re-analyses of a cluster-randomized experiment that used stratified rerandomization.
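As a concrete instance of the doubly-robust M-estimators mentioned in the abstract, the following is a minimal sketch of an augmented inverse-probability-weighted (AIPW) estimator of the average treatment effect in a trial with a known assignment probability. The linear working models, the single covariate, and the names `fit_line`, `aipw_ate`, and `simulate_trial` are assumptions chosen for illustration, not the paper's implementation.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Least-squares intercept and slope of y on a single covariate x."""
    xb, yb = fmean(x), fmean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return yb - slope * xb, slope


def aipw_ate(y, a, x, pi=0.5):
    """AIPW estimate of the average treatment effect with known assignment probability pi."""
    # Arm-specific working outcome models; linear purely for illustration.
    i1, s1 = fit_line([xi for xi, ai in zip(x, a) if ai == 1],
                      [yi for yi, ai in zip(y, a) if ai == 1])
    i0, s0 = fit_line([xi for xi, ai in zip(x, a) if ai == 0],
                      [yi for yi, ai in zip(y, a) if ai == 0])
    terms = []
    for yi, ai, xi in zip(y, a, x):
        m1 = i1 + s1 * xi  # predicted outcome under treatment
        m0 = i0 + s0 * xi  # predicted outcome under control
        # Model-based contrast plus inverse-probability-weighted residual corrections.
        terms.append(m1 - m0
                     + ai * (yi - m1) / pi
                     - (1 - ai) * (yi - m0) / (1 - pi))
    return fmean(terms)


def simulate_trial(n, tau, rng):
    """Toy trial (assumed model): one covariate, 1:1 allocation, constant effect tau."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    a = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(a)
    y = [tau * ai + 2 * xi + rng.gauss(0, 1) for ai, xi in zip(a, x)]
    return y, a, x


if __name__ == "__main__":
    y, a, x = simulate_trial(2000, 1.0, random.Random(0))
    print("AIPW estimate of the treatment effect:", aipw_ate(y, a, x))
```

Because the assignment probability is known by design, the estimator stays consistent even if the working outcome models are wrong. A better outcome model only improves precision.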
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops asymptotic theory for a broad class of M-estimators (including g-computation, doubly robust, and data-adaptive machine-learning estimators) with covariate adjustment under rerandomization and stratified rerandomization. It claims that asymptotic linearity and the influence function are identical to those under simple randomization (provided standard regularity conditions hold), that the limiting distribution may be non-Gaussian, that normality is recovered by appropriate adjustment for the rerandomization variables, and that data-adaptive efficient estimators remain asymptotically optimal under these designs. Results are supported by simulations and a re-analysis of a cluster-randomized trial.
Significance. If the derivations hold, the work is significant because the invariance of the influence function allows reuse of standard asymptotic expansions and variance estimators (with design-based corrections) across randomization schemes, while the efficiency-optimality result justifies flexible adjustment in balanced experiments. This extends prior rerandomization theory beyond linear adjustment and provides a foundation for modern causal estimators in designed experiments.
minor comments (2)
- [Abstract] The statement that 'asymptotic normality can be achieved if rerandomization variables are appropriately adjusted' would benefit from a one-sentence pointer to the specific adjustment (e.g., the form of the additional term in the estimating equation) so readers can immediately locate the construction.
- [Theory section (likely §3 or §4)] The paper should explicitly list the regularity conditions (e.g., differentiability of the estimating function, moments, and rates for the machine-learning estimators) in a dedicated subsection of the theory section rather than leaving them implicit.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our manuscript, recognition of its significance, and recommendation for minor revision. We are pleased that the invariance of the influence function and the efficiency results under rerandomization are viewed as useful extensions of prior work.
- The referee report contains no major comments (its major-comments section is empty), so there is nothing to address point by point and no revision to indicate on that basis.
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper derives asymptotic linearity, influence functions, and efficiency results for M-estimators under rerandomization from standard regularity conditions on the estimators (asymptotic linearity under simple randomization) and properties of the randomization scheme. No load-bearing step reduces by the paper's own equations to a fitted parameter, self-definition, or self-citation chain; the influence-function invariance is shown to hold identically rather than by construction from data-dependent fits. The extension to data-adaptive learners and stratified rerandomization likewise rests on external regularity assumptions rather than internal re-use of the target result. This is the expected outcome for a purely theoretical asymptotic analysis.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Standard regularity conditions under which M-estimators are asymptotically linear, with well-defined influence functions, under simple randomization.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
unclear: Relation between the paper passage and the cited Recognition theorem.
the asymptotic distribution ... may lead to a non-Gaussian asymptotic distribution
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- Asymptotic theory of rerandomization for survival analysis
  Rerandomization yields tight limiting processes with lower pointwise asymptotic variances for Kaplan-Meier and IPCW Kaplan-Meier survival estimators, while the variance of debiased ML estimators remains invariant due ...
- Langevin-Gradient Rerandomization
  LGR samples balanced treatment assignments in high-dimensional experiments via continuous relaxation and SGLD, retaining valid inference through randomization tests while being orders of magnitude faster than prior methods.