Sharp variance estimator and causal bootstrap in stratified randomized experiments

Hanzhong Liu; Haoyang Yu; Ke Zhu

arXiv: 2401.16667 · v3 · pith:EQXABPBWnew · submitted 2024-01-30 · 🧮 math.ST · stat.AP · stat.ME · stat.TH

Pith reviewed 2026-05-24 04:36 UTC · model grok-4.3

keywords: stratified randomized experiments · causal bootstrap · sharp variance estimator · finite-population inference · treatment effect estimation · randomization-based inference · difference-in-means estimator

The pith

The rank-preserving causal bootstrap achieves a second-order refinement over normal approximation for the sampling distribution of the weighted difference-in-means estimator in stratified randomized experiments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that standard Neyman variance estimators and normal approximations can be overly conservative or inaccurate for treatment effect estimation in stratified randomized experiments, especially with small samples or skewed outcomes. It proposes a sharp variance estimator along with two randomization-based causal bootstrap procedures that generate replicates via imputation models. One procedure uses rank-preserving imputation and is shown to deliver second-order accuracy improvements. The methods rely only on the randomness from treatment assignment rather than hypothetical super-population sampling. Numerical studies and real-data examples indicate better finite-sample performance than conventional approaches.

Core claim

In stratified randomized experiments the weighted difference-in-means estimator has a finite-population randomization distribution that can be more accurately approximated by a sharp variance estimator and by rank-preserving causal bootstrap replicates than by the usual Neyman variance and normal approximation; the rank-preserving bootstrap attains a second-order refinement, while the constant-treatment-effect version extends the approach to paired experiments.
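
For orientation, the baseline that the core claim compares against can be sketched in a few lines. This is a minimal illustration, assuming stratum weights proportional to stratum size and the textbook Neyman-type variance formula; the function name `weighted_diff_in_means` and the data layout are hypothetical and are not taken from the paper or from the CausalBootstrap package.

```python
from statistics import mean, variance

def weighted_diff_in_means(strata):
    """strata: list of (treated_outcomes, control_outcomes), one pair per stratum.

    Returns the weighted difference-in-means and a Neyman-type variance estimate.
    Assumes weights n_m / n and at least two units per arm in every stratum.
    """
    n = sum(len(t) + len(c) for t, c in strata)
    tau_hat = 0.0
    var_hat = 0.0
    for treated, control in strata:
        w = (len(treated) + len(control)) / n  # stratum weight n_m / n
        tau_hat += w * (mean(treated) - mean(control))
        # Conservative within-stratum variance: s1^2 / n1 + s0^2 / n0
        var_hat += w ** 2 * (variance(treated) / len(treated)
                             + variance(control) / len(control))
    return tau_hat, var_hat

tau_hat, var_hat = weighted_diff_in_means([
    ([5.0, 7.0], [1.0, 3.0]),
    ([10.0, 12.0, 14.0], [8.0, 9.0, 10.0]),
])
print(tau_hat, var_hat)
```

The sample variances require two or more units per arm, so this sketch does not cover paired designs, which is one reason the paired case needs separate treatment.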

What carries the argument

Rank-preserving imputation model for bootstrap replicates, which generates pseudo-populations by preserving the observed ranks of potential outcomes under the finite-population randomization distribution.
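
One way to picture a rank-preserving imputation within a single stratum is the sketch below. It assumes that each unit's missing potential outcome is filled in with the value at the same empirical quantile level in the opposite arm; the paper's exact construction may differ, and the names `matched_value` and `rank_preserving_impute` are hypothetical.

```python
from bisect import bisect_right

def matched_value(y, own_sorted, other_sorted):
    """Value in the other arm at the same empirical quantile level as y."""
    rank = bisect_right(own_sorted, y)  # count of own-arm outcomes <= y
    # ceil(rank * n_other / n_own), done in integers to avoid float rounding
    k = -(-rank * len(other_sorted) // len(own_sorted))
    return other_sorted[k - 1]

def rank_preserving_impute(treated, control):
    """Return (Y(1), Y(0)) pairs for one stratum; one entry of each pair is imputed."""
    t_sorted, c_sorted = sorted(treated), sorted(control)
    # Treated units: Y(1) observed, Y(0) imputed from the control arm.
    units = [(y, matched_value(y, t_sorted, c_sorted)) for y in treated]
    # Control units: Y(0) observed, Y(1) imputed from the treated arm.
    units += [(matched_value(y, c_sorted, t_sorted), y) for y in control]
    return units

pseudo_population = rank_preserving_impute([3.0, 9.0, 5.0], [1.0, 2.0, 4.0, 8.0])
print(pseudo_population)
```

In the resulting pseudo-population, sorting units by Y(1) also sorts them by Y(0), which is the sense in which ranks are preserved in this sketch.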

If this is right

The sharp variance estimator reduces over-conservatism relative to the Neyman estimator when treatment effects are heterogeneous.
The rank-preserving bootstrap supplies higher-order corrections to normal-based confidence intervals without invoking super-population sampling.
The constant-treatment-effect bootstrap applies directly to paired randomized experiments.
Both bootstrap procedures remain valid under the randomization distribution alone.
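
The constant-treatment-effect bootstrap mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: it imputes missing potential outcomes by shifting observed outcomes by the overall point estimate, then re-draws treatment assignments within each stratum while keeping the number of treated units fixed. The names `estimate` and `cte_bootstrap` are hypothetical, and the paper may bootstrap a studentized statistic rather than the raw estimator.

```python
import random
from statistics import mean

def estimate(strata):
    """Weighted difference-in-means with weights n_m / n (assumed weighting)."""
    n = sum(len(t) + len(c) for t, c in strata)
    return sum((len(t) + len(c)) / n * (mean(t) - mean(c)) for t, c in strata)

def cte_bootstrap(strata, n_boot=1000, seed=0):
    rng = random.Random(seed)
    tau_hat = estimate(strata)
    # Impute both potential outcomes for every unit under a constant effect tau_hat.
    science = []
    for treated, control in strata:
        units = [(y, y - tau_hat) for y in treated] + [(y + tau_hat, y) for y in control]
        science.append((units, len(treated)))
    reps = []
    for _ in range(n_boot):
        draw = []
        for units, n1 in science:
            # Re-randomize within the stratum, keeping the treated count fixed.
            chosen = set(rng.sample(range(len(units)), n1))
            t = [units[i][0] for i in range(len(units)) if i in chosen]
            c = [units[i][1] for i in range(len(units)) if i not in chosen]
            draw.append((t, c))
        reps.append(estimate(draw))
    return tau_hat, reps

# Paired design: one treated and one control unit per stratum.
pairs = [([5.0], [3.0]), ([8.0], [7.0]), ([2.0], [4.0])]
tau_hat, reps = cte_bootstrap(pairs, n_boot=2000)
print(tau_hat, mean(reps))
```

Because the imputation needs no within-arm variance, the same code runs on pairs, which matches the claim that this version extends to paired experiments.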

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach could be extended to cluster-randomized or multi-arm designs by adapting the imputation step to respect the corresponding randomization scheme.
If the second-order refinement holds, the bootstrap may yield shorter intervals than normal approximation while maintaining coverage, which would be useful in regulatory or medical settings with limited sample sizes.
Comparison with permutation-based methods for the same design would clarify whether the imputation step adds value beyond simple resampling of assignments.

Load-bearing premise

The rank-preserving and constant-treatment-effect imputation models correctly preserve the key features of the joint distribution of potential outcomes under the finite-population randomization distribution.

What would settle it

A Monte Carlo experiment in which the empirical coverage of the rank-preserving bootstrap intervals falls below the nominal level when the rank-ordering of potential outcomes is deliberately altered while keeping marginal distributions fixed.
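
A coverage experiment of this kind could be organized roughly as below. The harness is a sketch: `coverage` takes a full table of potential outcomes (known only in simulation) and any interval method. The Neyman normal-approximation interval is used purely as a stand-in, since the rank-preserving bootstrap interval is not reproduced here; an implementation of it would be passed as `interval_fn` instead. All names are hypothetical.

```python
import random
from statistics import mean, variance

def neyman_interval(strata, z=1.96):
    """Normal-approximation interval with a Neyman-type variance (stand-in method)."""
    n = sum(len(t) + len(c) for t, c in strata)
    tau = sum((len(t) + len(c)) / n * (mean(t) - mean(c)) for t, c in strata)
    var = sum(((len(t) + len(c)) / n) ** 2
              * (variance(t) / len(t) + variance(c) / len(c)) for t, c in strata)
    half = z * var ** 0.5
    return tau - half, tau + half

def coverage(science, n_treated, interval_fn, n_sim=2000, seed=0):
    """science: per stratum, a list of (Y(1), Y(0)); n_treated: treated count per stratum."""
    rng = random.Random(seed)
    n = sum(len(units) for units in science)
    true_ate = sum(y1 - y0 for units in science for y1, y0 in units) / n
    hits = 0
    for _ in range(n_sim):
        strata = []
        for units, n1 in zip(science, n_treated):
            chosen = set(rng.sample(range(len(units)), n1))
            t = [units[i][0] for i in range(len(units)) if i in chosen]
            c = [units[i][1] for i in range(len(units)) if i not in chosen]
            strata.append((t, c))
        lo, hi = interval_fn(strata)
        hits += lo <= true_ate <= hi
    return hits / n_sim

# Same marginals, different rank-ordering of (Y(1), Y(0)) within one stratum.
y1 = [1.0, 2.0, 4.0, 9.0, 20.0, 45.0]
y0 = [0.5, 1.0, 1.5, 3.0, 6.0, 12.0]
comonotone = [list(zip(sorted(y1), sorted(y0)))]
antitone = [list(zip(sorted(y1), sorted(y0, reverse=True)))]
for name, science in [("comonotone", comonotone), ("antitone", antitone)]:
    print(name, coverage(science, [3], neyman_interval))
```

Both configurations share the same average treatment effect because the marginals are fixed; only the coupling changes, which is the manipulation described above.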

Figures

Figures reproduced from arXiv: 2401.16667 by Hanzhong Liu, Haoyang Yu, Ke Zhu.

**Figure 2.** Density plot and Q-Q plot for the outcomes from the public health field.

Original abstract

Randomized experiments are the gold standard for estimating treatment effects, and randomization serves as a reasoned basis for inference. In widely used stratified randomized experiments, randomization-based finite-population asymptotic theory enables valid inference for the average treatment effect, relying on normal approximation and a Neyman-type conservative variance estimator. However, when the sample size is small or the outcomes are skewed, the Neyman-type variance estimator may become overly conservative, and the normal approximation can fail. To address these issues, we propose a sharp variance estimator and two causal bootstrap methods to more accurately approximate the sampling distribution of the weighted difference-in-means estimator in stratified randomized experiments. The first causal bootstrap procedure is based on rank-preserving imputation and we prove its second-order refinement over normal approximation. The second causal bootstrap procedure is based on constant-treatment-effect imputation and is further applicable in paired experiments. In contrast to traditional bootstrap methods, where randomness originates from hypothetical super-population sampling, our analysis for the proposed causal bootstrap is randomization-based, relying solely on the randomness of treatment assignment in randomized experiments. Numerical studies and two real data applications demonstrate advantages of our proposed methods in finite samples. The R package CausalBootstrap implementing our method is publicly available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper delivers a sharp variance estimator plus two randomization-based causal bootstraps (rank-preserving with second-order refinement, constant-treatment-effect) that target finite-sample issues in stratified experiments better than the standard Neyman variance estimator with normal approximation.

The main contribution is the sharp variance estimator and the two causal bootstrap procedures, both justified directly from the randomization distribution rather than super-population sampling. The rank-preserving version comes with a proof of second-order refinement over normal approximation for the weighted difference-in-means, which is the clearest new theoretical piece. The constant-treatment-effect version extends to paired experiments. They also ship an R package and show numerical gains plus two real-data examples, which is useful for people who actually run these designs with moderate n or skewed outcomes. The work stays within the finite-population randomization framework, so the claims line up with the stated assumptions. The imputation steps for the bootstraps are the main modeling choice; if those preserve the relevant features of the potential-outcome joint distribution, the refinement result should hold, and the stress-test note finds no internal contradiction. Nothing in the abstract or stress-test suggests the derivations are circular or rely on post-hoc fitting that undermines the randomization basis. This is aimed at statisticians and applied researchers who analyze stratified or paired experiments and want tighter inference when normal approximation or Neyman variance is too conservative. It is solid enough to send to referees; the combination of explicit proof, package, and empirical checks gives it a clear path through review even if the refinement needs some tightening in the full write-up.

Referee Report

1 major / 2 minor

Summary. The paper proposes a sharp variance estimator and two randomization-based causal bootstrap procedures (rank-preserving imputation and constant-treatment-effect imputation) for the weighted difference-in-means estimator in stratified randomized experiments. It claims to prove a second-order refinement of the rank-preserving bootstrap over normal approximation under the finite-population randomization distribution, extends the second method to paired experiments, and reports improved finite-sample performance via simulations and two real-data applications, with an accompanying R package CausalBootstrap.

Significance. If the second-order refinement holds under the stated conditions, the work supplies a theoretically grounded improvement to randomization inference for small or skewed strata, where Neyman-type variance estimators are known to be conservative. The explicit randomization-based (rather than super-population) framing and public code are concrete strengths that facilitate verification and adoption.

major comments (1)

[Theorem on second-order refinement (likely §3 or §4)] The central claim is the second-order refinement for the rank-preserving bootstrap. The manuscript should state the precise regularity conditions (moment bounds, stratum-size growth rates, and boundedness of potential outcomes) under which the Edgeworth expansion or equivalent argument establishes the refinement; without these, it is unclear whether the result applies to the skewed-outcome regimes highlighted in the introduction.

minor comments (2)

[Introduction and methods] Notation for the weighted difference-in-means estimator and the stratum-specific weights should be introduced once with a single consistent symbol rather than re-defined across sections.
[Simulation section] The numerical studies would benefit from an explicit table reporting coverage rates and interval lengths for all competing methods (normal, sharp variance, both bootstraps) across the same simulation configurations.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading, positive recommendation, and constructive comment. We address the point below and will incorporate the suggested clarification in the revision.

Referee: [Theorem on second-order refinement (likely §3 or §4)] The central claim is the second-order refinement for the rank-preserving bootstrap. The manuscript should state the precise regularity conditions (moment bounds, stratum-size growth rates, and boundedness of potential outcomes) under which the Edgeworth expansion or equivalent argument establishes the refinement; without these, it is unclear whether the result applies to the skewed-outcome regimes highlighted in the introduction.

Authors: We agree that the regularity conditions for the second-order refinement (Theorem 3.1 or equivalent) should be stated explicitly. In the revised manuscript we will insert a dedicated remark immediately after the theorem that lists the precise assumptions: (i) uniform boundedness of all potential outcomes (or, alternatively, existence of moments of order 4+δ for δ>0), (ii) stratum-size growth conditions requiring that the smallest stratum size grows to infinity at a rate sufficient to make the Edgeworth remainder o(n^{-1/2}), and (iii) standard technical conditions on the stratum weights and the non-degeneracy of the finite-population variance. These conditions are already implicit in the proof strategy and are satisfied by the skewed-outcome simulation designs in Section 5; spelling them out will directly address applicability concerns without changing any results or proofs.

revision: yes
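
For readers unfamiliar with the term, the kind of statement at issue can be written schematically as follows. This is a generic illustration of what a second-order refinement usually means, not the paper's theorem: the symbols T_n (the standardized or studentized estimator), T_n^* (its bootstrap analogue), and P^* (the bootstrap distribution given the data) are placeholders introduced here, and the mode of convergence and exact conditions are those the referee asks the authors to spell out.

```latex
% Schematic only: T_n, T_n^*, P^* are placeholder symbols, not the paper's notation.
% Normal approximation: error typically of order n^{-1/2}.
\sup_{t}\,\bigl| P(T_n \le t) - \Phi(t) \bigr| = O\bigl(n^{-1/2}\bigr),
\qquad
% Second-order refinement: the bootstrap error is of smaller order.
\sup_{t}\,\bigl| P^{*}(T_n^{*} \le t) - P(T_n \le t) \bigr| = o\bigl(n^{-1/2}\bigr).
```

The o(n^{-1/2}) rate on the right is the Edgeworth remainder referred to in condition (ii) of the response.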

Circularity Check

0 steps flagged

Derivation is self-contained under randomization distribution

The paper's central claim is a mathematical proof of second-order refinement for the rank-preserving causal bootstrap under the finite-population randomization distribution, using rank-preserving and constant-treatment-effect imputation models as explicit assumptions. These are not derived from or equivalent to quantities fitted from the observed data by the paper's own equations. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the abstract or described methods. The analysis relies on randomization theory rather than super-population sampling, making the derivation independent of its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based on abstract only; no free parameters, invented entities, or non-standard axioms are mentioned beyond the domain assumption of randomization-based inference.

axioms (1)

domain assumption: Randomization of treatment assignment within strata provides the sole basis for inference.
Explicitly stated as the foundation for all proposed methods and analysis.
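
The source of randomness named in this axiom is easy to make concrete. The sketch below draws one stratified assignment: within each stratum a fixed number of units is treated, chosen uniformly at random and independently across strata. The function name `stratified_assignment` and its arguments are hypothetical.

```python
import random

def stratified_assignment(stratum_sizes, n_treated, seed=None):
    """Return one 0/1 assignment vector per stratum with fixed treated counts."""
    rng = random.Random(seed)
    assignment = []
    for size, n1 in zip(stratum_sizes, n_treated):
        z = [1] * n1 + [0] * (size - n1)  # exactly n1 treated units in this stratum
        rng.shuffle(z)                    # uniform over all such assignments
        assignment.append(z)
    return assignment

z = stratified_assignment([4, 6, 2], [2, 3, 1], seed=42)
print(z)
```

Under randomization-based inference, repeated calls to such a function are the only thing that varies; the potential outcomes themselves are treated as fixed.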

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

The first causal bootstrap procedure is based on rank-preserving imputation and we prove its second-order refinement over normal approximation.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

sharp upper bound for $S_{[m]Y(1),Y(0)}$ ... $S^{U}_{[m]Y(1),Y(0)} = \frac{n_{[m]}}{n_{[m]}-1}\left\{\int G_{[m]}^{-1}(u)\,F_{[m]}^{-1}(u)\,du - \bar{Y}_{[m]}(1)\,\bar{Y}_{[m]}(0)\right\}$
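
The quoted expression can be evaluated numerically once the two quantile functions are specified. The sketch below assumes they are the empirical quantile functions of the observed treated and control outcomes in stratum m, that the integral runs over (0, 1), and that the means are the observed arm means; whether the paper's estimator uses exactly these plug-ins is an assumption. The function names are hypothetical.

```python
from fractions import Fraction
from statistics import mean

def quantile_product_integral(a, b):
    """Exact integral over (0, 1) of the product of two empirical quantile step functions."""
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    total, u, i, j = 0.0, Fraction(0), 0, 0
    while i < na and j < nb:
        # Both step functions are constant until the next breakpoint of either one.
        next_u = min(Fraction(i + 1, na), Fraction(j + 1, nb))
        total += float(next_u - u) * a[i] * b[j]
        step_a = next_u == Fraction(i + 1, na)
        step_b = next_u == Fraction(j + 1, nb)
        i += step_a
        j += step_b
        u = next_u
    return total

def sharp_cov_upper(treated, control):
    """Plug-in value of n/(n-1) * (integral of quantile product - product of means)."""
    n = len(treated) + len(control)
    integral = quantile_product_integral(treated, control)
    return n / (n - 1) * (integral - mean(treated) * mean(control))

print(sharp_cov_upper([1.0, 3.0], [2.0, 4.0, 6.0]))
```

The integrand is symmetric in the two quantile functions, so the sketch does not depend on which arm is labelled F and which G. Fractions are used for the breakpoints so that coinciding jumps of the two step functions are detected exactly.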

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
