
arxiv: 2406.02834 · v3 · pith:BKCK4DEYnew · submitted 2024-06-05 · 📊 stat.ME

Asymptotic inference with flexible covariate adjustment under rerandomization and stratified rerandomization

Pith reviewed 2026-05-24 00:36 UTC · model grok-4.3

classification 📊 stat.ME
keywords: rerandomization, asymptotic, stratified, estimators, under, results, class, covariate

The pith

Rerandomization leaves the asymptotic linearity and influence function of any M-estimator unchanged from simple randomization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops asymptotic theory showing that for any M-estimator, including those with data-adaptive machine learners, the influence function and asymptotic linearity are the same under rerandomization as under simple randomization. This holds even though rerandomization can produce a non-Gaussian limiting distribution. A reader would care because it justifies using flexible covariate-adjusted estimators in rerandomized experiments without altering their first-order properties. The theory extends to stratified rerandomization and establishes efficiency optimality for adaptive estimators.

Core claim

The paper claims that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. Asymptotic normality can be achieved if rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. The paper also studies the asymptotic theory for efficient estimators based on data-adaptive machine learners and proves their efficiency optimality under rerandomization and stratified rerandomization.
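For readers who want the claim in symbols, the following is a sketch in standard textbook notation. The symbols $O_i$ (observed data for unit $i$), $\psi$ (estimating function), $\theta_0$ (true parameter) and $\mathrm{IF}$ (influence function) are assumptions made for illustration and need not match the paper's own notation.

```latex
% M-estimator: \hat\theta solves the estimating equation
\sum_{i=1}^{n} \psi(O_i; \hat\theta) = 0 .

% Asymptotic linearity with influence function IF:
\sqrt{n}\,(\hat\theta - \theta_0)
  = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \mathrm{IF}(O_i) + o_p(1),
\qquad
\mathrm{IF}(O)
  = -\left\{ E\!\left[ \left. \frac{\partial \psi(O;\theta)}{\partial \theta}
      \right|_{\theta=\theta_0} \right] \right\}^{-1} \psi(O;\theta_0) .
```

Read this way, the claim is that the same expansion, with the same $\mathrm{IF}$, holds under both designs. What can differ is the limit law of the leading sum, because rerandomization makes the treatment assignments dependent on one another.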

What carries the argument

The invariance of the influence function for M-estimators under rerandomization: the first-order asymptotic behavior of the estimator is unaffected by the rerandomization procedure.

Load-bearing premise

The estimators must satisfy the standard regularity conditions for asymptotic linearity under simple randomization.

What would settle it

A numerical experiment that computes the empirical influence function for an M-estimator under both simple randomization and rerandomization and finds they differ would falsify the invariance claim.
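A numerical check in this spirit can be sketched as follows. This is not the paper's simulation design: the data-generating model, the sample size, the Mahalanobis balance criterion, the acceptance probability, and the function names `imbalance`, `assign`, `estimates` and `simulate` are all assumptions chosen for illustration. Instead of computing an empirical influence function directly, the sketch compares the Monte Carlo spread of two simple M-estimators (the unadjusted difference in means and an ANCOVA coefficient that adjusts for the rerandomization covariates) under simple randomization and under rerandomization.

```python
import numpy as np

def imbalance(a, x, s_inv):
    """Mahalanobis distance between treated and control covariate means."""
    n1, n0 = a.sum(), (1 - a).sum()
    diff = x[a == 1].mean(axis=0) - x[a == 0].mean(axis=0)
    return float(diff @ s_inv @ diff) / (1.0 / n1 + 1.0 / n0)

def assign(x, rng, threshold=None):
    """Bernoulli(1/2) assignment; redraw until balanced if a threshold is given."""
    s_inv = np.linalg.inv(np.cov(x, rowvar=False))
    while True:
        a = rng.binomial(1, 0.5, size=x.shape[0])
        if threshold is None or imbalance(a, x, s_inv) <= threshold:
            return a

def estimates(y, a, x):
    """Return (unadjusted difference in means, ANCOVA coefficient on a)."""
    unadjusted = y[a == 1].mean() - y[a == 0].mean()
    features = np.column_stack([np.ones_like(y), a, x - x.mean(axis=0)])
    coef, *_ = np.linalg.lstsq(features, y, rcond=None)
    return unadjusted, coef[1]

def simulate(reps=1000, n=200, accept_prob=0.1, seed=0):
    """Monte Carlo standard deviations of both estimators under each design."""
    rng = np.random.default_rng(seed)
    # With 2 covariates the imbalance is approximately chi-square with 2 df,
    # whose quantile function has the closed form -2 * log(1 - p).
    threshold = -2.0 * np.log(1.0 - accept_prob)
    out = {"simple": [], "rerandomized": []}
    for _ in range(reps):
        x = rng.normal(size=(n, 2))      # assumed covariate model
        noise = rng.normal(size=n)
        for name, thr in (("simple", None), ("rerandomized", threshold)):
            a = assign(x, rng, thr)
            y = x.sum(axis=1) + 1.0 * a + noise  # assumed outcome model, effect = 1
            out[name].append(estimates(y, a, x))
    return {k: np.std(np.array(v), axis=0) for k, v in out.items()}

sd = simulate()
for name, (sd_unadj, sd_adj) in sd.items():
    print(f"{name:>13}: sd(unadjusted)={sd_unadj:.3f}  sd(ANCOVA)={sd_adj:.3f}")
```

If the summarized claims hold, one would expect the unadjusted estimator to be noticeably less variable under rerandomization, while the estimator that adjusts for the rerandomization covariates should show roughly the same spread under both designs. A large gap in the adjusted estimator's spread would be a warning sign, although a sampling-distribution comparison of this kind is only indirect evidence about the influence function itself.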

Original abstract

Rerandomization is an effective treatment allocation procedure to control for baseline covariate imbalance. For estimating the average treatment effect, rerandomization has been previously shown to improve the precision of the unadjusted and the linearly-adjusted estimators over simple randomization without compromising consistency. However, it remains unclear whether such results apply more generally to the class of M-estimators, including the g-computation formula with generalized linear regression and doubly-robust methods, and more broadly, to efficient estimators with data-adaptive machine learners. In this paper, we develop the asymptotic theory for a more general class of covariate-adjusted estimators under rerandomization and its stratified extension. We prove that the asymptotic linearity and the influence function remain identical for any M-estimator under simple randomization and rerandomization, but rerandomization may lead to a non-Gaussian asymptotic distribution. We further explain, drawing examples from several common M-estimators, that asymptotic normality can be achieved if rerandomization variables are appropriately adjusted for in the final estimator. These results are extended to stratified rerandomization. Finally, we study the asymptotic theory for efficient estimators based on data-adaptive machine learners, and prove their efficiency optimality under rerandomization and stratified rerandomization. Our results are demonstrated via simulations and re-analyses of a cluster-randomized experiment that used stratified rerandomization.
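To make the allocation procedure concrete, here is a minimal sketch of stratified rerandomization. It assumes a balanced permutation of treatment labels within each stratum and a Mahalanobis-distance acceptance rule on the overall covariate means; the paper's exact within-stratum scheme and balance criterion may differ, and the function name `stratified_rerandomize`, its parameters, and the example data are hypothetical.

```python
import numpy as np

def stratified_rerandomize(x, strata, rng, threshold, max_draws=10_000):
    """Draw within-stratum balanced assignments until covariates are balanced.

    x: (n, k) covariate matrix with k >= 2; strata: length-n stratum labels.
    Returns the accepted 0/1 assignment and the number of draws used.
    """
    x = np.asarray(x, dtype=float)
    strata = np.asarray(strata)
    s_inv = np.linalg.inv(np.cov(x, rowvar=False))
    for draws in range(1, max_draws + 1):
        a = np.zeros(len(strata), dtype=int)
        for s in np.unique(strata):
            idx = np.flatnonzero(strata == s)
            # Alternating 0/1 labels, shuffled: about half treated per stratum.
            a[idx] = rng.permutation(np.arange(len(idx)) % 2)
        n1, n0 = a.sum(), (1 - a).sum()
        diff = x[a == 1].mean(axis=0) - x[a == 0].mean(axis=0)
        distance = float(diff @ s_inv @ diff) / (1.0 / n1 + 1.0 / n0)
        if distance <= threshold:  # accept only sufficiently balanced draws
            return a, draws
    raise RuntimeError("no acceptable assignment found; loosen the threshold")

rng = np.random.default_rng(1)
x = rng.normal(size=(120, 2))        # hypothetical baseline covariates
strata = np.repeat([0, 1, 2], 40)    # three hypothetical strata of 40 units
a, draws = stratified_rerandomize(x, strata, rng, threshold=0.5)
print("draws used:", draws)
print("treated per stratum:", [int(a[strata == s].sum()) for s in range(3)])
```

A smaller `threshold` enforces tighter balance at the cost of more redraws. The `max_draws` guard is a practical safeguard added for this sketch, not part of any procedure described in the abstract.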

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper develops asymptotic theory for a broad class of M-estimators (including g-computation, doubly robust, and data-adaptive machine-learning estimators) with covariate adjustment under rerandomization and stratified rerandomization. It claims that asymptotic linearity and the influence function are identical to those under simple randomization (provided standard regularity conditions hold), that the limiting distribution may be non-Gaussian, that normality is recovered by appropriate adjustment for the rerandomization variables, and that data-adaptive efficient estimators remain asymptotically optimal under these designs. Results are supported by simulations and a re-analysis of a cluster-randomized trial.

Significance. If the derivations hold, the work is significant because the invariance of the influence function allows reuse of standard asymptotic expansions and variance estimators (with design-based corrections) across randomization schemes, while the efficiency-optimality result justifies flexible adjustment in balanced experiments. This extends prior rerandomization theory beyond linear adjustment and provides a foundation for modern causal estimators in designed experiments.

minor comments (2)
  1. [Abstract] The statement that 'asymptotic normality can be achieved if rerandomization variables are appropriately adjusted' would benefit from a one-sentence pointer to the specific adjustment (e.g., the form of the additional term in the estimating equation) so readers can immediately locate the construction.
  2. [Theory section (likely §3 or §4)] The paper should explicitly list the regularity conditions (e.g., differentiability of the estimating function, moments, and rates for the machine-learning estimators) in a dedicated subsection of the theory section rather than leaving them implicit.

Simulated Author's Rebuttal

0 responses · 1 unresolved

We thank the referee for their positive summary of our manuscript, recognition of its significance, and recommendation for minor revision. We are pleased that the invariance of the influence function and the efficiency results under rerandomization are viewed as useful extensions of prior work.

standing simulated objections not resolved
  • The referee report contains no major comments (its MAJOR COMMENTS section is empty), so we cannot respond point by point or indicate whether revisions are needed.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper derives asymptotic linearity, influence functions, and efficiency results for M-estimators under rerandomization from standard regularity conditions on the estimators (asymptotic linearity under simple randomization) and properties of the randomization scheme. No load-bearing step reduces by the paper's own equations to a fitted parameter, self-definition, or self-citation chain; the influence-function invariance is shown to hold identically rather than by construction from data-dependent fits. The extension to data-adaptive learners and stratified rerandomization likewise rests on external regularity assumptions rather than internal re-use of the target result. This is the expected outcome for a purely theoretical asymptotic analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review based solely on abstract; no explicit free parameters, invented entities, or non-standard axioms are described. The work implicitly relies on standard regularity conditions for M-estimators.

axioms (1)
  • domain assumption: Standard regularity conditions for M-estimators to possess asymptotic linearity and influence functions under simple randomization.
    Invoked to assert that the influence function remains identical under rerandomization.

pith-pipeline@v0.9.0 · 5768 in / 1415 out tokens · 23662 ms · 2026-05-24T00:36:33.689794+00:00 · methodology



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Asymptotic theory of rerandomization for survival analysis

    stat.ME 2026-04 unverdicted novelty 7.0

    Rerandomization yields tight limiting processes with lower pointwise asymptotic variances for Kaplan-Meier and IPCW Kaplan-Meier survival estimators, while the variance of debiased ML estimators remains invariant due ...

  2. Langevin-Gradient Rerandomization

    stat.ME 2026-04 unverdicted novelty 7.0

    LGR samples balanced treatment assignments in high-dimensional experiments via continuous relaxation and SGLD, retaining valid inference through randomization tests while being orders of magnitude faster than prior methods.
