arxiv: 2604.18323 · v1 · submitted 2026-04-20 · 📊 stat.ME

Which Small-Sample Correction Should Be Used When Analyzing Stepped-Wedge Designs with Time-Varying Treatment Effects?

Yongdong Ouyang, Monica Taljaard, James P. Hughes, Fan Li

Pith reviewed 2026-05-10 03:53 UTC · model grok-4.3

keywords stepped-wedge design · robust variance estimator · small-sample correction · time-varying treatment effect · cluster randomized trial · exposure-time indicator model · Mancl-DeRouen estimator · Morel-Bokossa-Neerchal estimator

The pith

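When random effects are misspecified in stepped-wedge trials with time-varying effects, the Mancl-DeRouen estimator (with a t-distribution on clusters-minus-two degrees of freedom) gives the most consistent coverage for continuous outcomes, while the Morel-Bokossa-Neerchal estimator is the only consistently reliable choice for binary outcomes.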

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Stepped-wedge cluster randomized trials often use models that assume treatment effects begin right at crossover and stay constant. Exposure-time indicator models relax this by letting effects differ according to time since exposure, which allows separate estimation of the time-averaged treatment effect and the long-term effect. These models still depend on random-effects assumptions that are commonly wrong in practice, and model-based standard errors then produce undercoverage. Simulations compare four robust variance estimators across continuous and binary outcomes and show that certain small-sample corrections recover proper coverage while others remain unstable, especially for the long-term effect.
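
For orientation, one common way to write an exposure-time indicator model and its two estimands is sketched below. The notation is assumed here for illustration and is not taken from the paper: $i$ indexes clusters, $j$ periods, $k$ individuals, $E_{ij}$ is the number of periods cluster $i$ has been exposed by period $j$ (0 under control), $S$ is the longest observed exposure time, and the single random intercept $\alpha_i$ stands in for whatever random-effects structure is actually fitted. The long-term effect is written as the effect at the longest exposure time, which is one common convention.

```latex
Y_{ijk} = \mu + \beta_j + \sum_{s=1}^{S} \delta_s \,\mathbb{1}(E_{ij} = s) + \alpha_i + \epsilon_{ijk},
\qquad
\mathrm{TATE} = \frac{1}{S}\sum_{s=1}^{S} \delta_s,
\qquad
\mathrm{LTE} = \delta_S .
```

Under this notation the immediate-treatment model is the special case $\delta_1 = \dots = \delta_S$, in which the two estimands coincide.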

Core claim

Exposure-time indicator models target the time-averaged treatment effect and long-term effect in stepped-wedge designs when effects vary with exposure duration. Under misspecified random-effects structures, model-based standard errors undercover, but robust variance estimators improve performance. For continuous outcomes the Mancl-DeRouen estimator paired with a t-distribution whose degrees of freedom equal the number of clusters minus two yields the most consistent coverage; for binary outcomes the Morel-Bokossa-Neerchal estimator is the only consistently reliable choice. Both model-based and robust approaches remain unstable when targeting the long-term effect.
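
Both estimands are linear combinations of the exposure-time coefficients, so their standard errors follow from whichever covariance matrix (model-based or robust) is plugged in. The sketch below illustrates this with made-up numbers; `linear_contrast`, `delta_hat`, and `cov_hat` are hypothetical names and values, and the long-term effect is assumed to be the coefficient at the longest exposure time.

```python
import math

def linear_contrast(coefs, cov, weights):
    """Return (estimate, standard error) of sum_k weights[k] * coefs[k]."""
    est = sum(w * b for w, b in zip(weights, coefs))
    # Variance of a linear combination: w' V w.
    var = sum(
        weights[a] * cov[a][b] * weights[b]
        for a in range(len(weights))
        for b in range(len(weights))
    )
    return est, math.sqrt(var)

# Hypothetical exposure-time effects delta_1..delta_4 and their covariance.
delta_hat = [0.10, 0.20, 0.30, 0.40]
cov_hat = [
    [0.010, 0.004, 0.003, 0.002],
    [0.004, 0.012, 0.005, 0.003],
    [0.003, 0.005, 0.016, 0.006],
    [0.002, 0.003, 0.006, 0.030],
]
S = len(delta_hat)

# TATE: equal weights 1/S on every exposure time.
tate, tate_se = linear_contrast(delta_hat, cov_hat, [1.0 / S] * S)
# LTE (assumed here to be delta_S): all weight on the last coefficient.
lte, lte_se = linear_contrast(delta_hat, cov_hat, [0.0] * (S - 1) + [1.0])

print(f"TATE = {tate:.3f} (SE {tate_se:.3f}); LTE = {lte:.3f} (SE {lte_se:.3f})")
```

Because the LTE puts all its weight on a single coefficient, its standard error is driven entirely by how precisely that one coefficient is estimated, whereas the TATE averages over all of them.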

What carries the argument

Exposure-time indicator (ETI) models combined with robust variance estimators (classic sandwich, Kauermann-Carroll, Mancl-DeRouen, Morel-Bokossa-Neerchal) that adjust standard errors for small numbers of clusters and possible random-effects misspecification.
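
To make the difference between three of these corrections concrete, here is a minimal sketch for the simplest possible case: a linear model fitted by ordinary least squares (working independence), with residuals grouped by cluster. This is a deliberate simplification and not the paper's mixed-model setup. The KC-type and MD-type versions rescale each cluster's residuals by $(I - H_{gg})^{-1/2}$ and $(I - H_{gg})^{-1}$ respectively, where $H_{gg}$ is the cluster's leverage block. The MBN correction is omitted, and the function name and toy data are hypothetical.

```python
import numpy as np

def cluster_robust_variances(X, y, cluster):
    """Sandwich, KC-type and MD-type covariance of OLS coefficients."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    p = X.shape[1]
    meat = {name: np.zeros((p, p)) for name in ("sandwich", "KC", "MD")}
    for g in np.unique(cluster):
        idx = cluster == g
        Xg, rg = X[idx], resid[idx]
        # I - H_gg, where H_gg is the leverage block of cluster g.
        M = np.eye(Xg.shape[0]) - Xg @ bread @ Xg.T
        # Inverse and inverse square root via the symmetric eigendecomposition.
        vals, vecs = np.linalg.eigh(M)
        inv = vecs @ np.diag(1.0 / vals) @ vecs.T
        inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
        adjustments = (("sandwich", np.eye(Xg.shape[0])), ("KC", inv_sqrt), ("MD", inv))
        for name, A in adjustments:
            score = Xg.T @ A @ rg  # adjusted cluster-level score contribution
            meat[name] += np.outer(score, score)
    return beta, {name: bread @ m @ bread for name, m in meat.items()}

# Toy stepped layout: 8 clusters, 6 periods, one observation per cluster-period.
rng = np.random.default_rng(0)
n_clusters, n_periods = 8, 6
cluster = np.repeat(np.arange(n_clusters), n_periods)
period = np.tile(np.arange(n_periods), n_clusters)
crossover = 1 + cluster % 5
treat = (period >= crossover).astype(float)
X = np.column_stack([np.ones(cluster.size), treat])
y = 0.5 * treat + rng.normal(size=n_clusters)[cluster] + rng.normal(size=cluster.size)

beta, covs = cluster_robust_variances(X, y, cluster)
for name, V in covs.items():
    print(name, np.sqrt(np.diag(V)))
```

The eigenvalues of $I - H_{gg}$ shrink toward zero as a cluster's leverage grows, so the inverse used by the MD-type version can become very large or ill-conditioned when a single cluster carries most of the information about a coefficient.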

If this is right

Model-based standard errors produce undercoverage for both the time-averaged and long-term effects when random effects are misspecified.
For continuous outcomes the Mancl-DeRouen estimator with t-distribution and degrees of freedom equal to clusters minus two gives consistent coverage across scenarios.
For binary outcomes the Morel-Bokossa-Neerchal estimator is the only small-sample correction that remains reliable.
The Mancl-DeRouen estimator can become unstable in one-cluster-per-sequence designs because of data sparsity.
Inference on the long-term effect stays unstable whether model-based or robust standard errors are used.
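
The continuous-outcome recommendation pairs a robust standard error with a t reference distribution on clusters-minus-two degrees of freedom. A minimal sketch of that interval follows. The t quantile is computed with the standard library only (Simpson's rule plus bisection) so the example has no dependencies; in practice `scipy.stats.t.ppf` would be the usual choice. The function names and input numbers are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, n=2000):
    """CDF for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / n
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3

def t_quantile(prob, df):
    """Quantile for prob > 0.5, found by bisection on [0, 200]."""
    lo, hi = 0.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def wald_t_interval(estimate, robust_se, n_clusters, level=0.95):
    """Wald interval using a t reference with n_clusters - 2 degrees of freedom."""
    df = n_clusters - 2
    crit = t_quantile(1 - (1 - level) / 2, df)
    return estimate - crit * robust_se, estimate + crit * robust_se, crit

# Hypothetical estimate and robust standard error from a 12-cluster trial.
lower, upper, crit = wald_t_interval(0.25, 0.10, n_clusters=12)
print(f"critical value {crit:.3f}, interval ({lower:.3f}, {upper:.3f})")
```

With 12 clusters the critical value is about 2.23 rather than the normal 1.96, so the interval is roughly 14 percent wider than a z-based one, and the gap grows quickly as the number of clusters falls.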

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Trialists who expect effects to change with exposure duration should fit exposure-time indicator models with the recommended robust corrections instead of immediate-treatment models.
Designs with few clusters or one cluster per sequence may still require additional safeguards beyond these estimators when targeting long-term effects.
Similar robust-variance recommendations could be tested in other longitudinal cluster designs such as crossover trials that also feature time-dependent exposures.

Load-bearing premise

The specific simulation scenarios, including the chosen forms of random-effects misspecification and the data-generating processes for time-varying effects, adequately represent conditions in real stepped-wedge cluster randomized trials.

What would settle it

Empirical coverage of nominal 95 percent intervals computed from a real stepped-wedge trial dataset whose true time-varying effects are known independently, checked separately for the Mancl-DeRouen and Morel-Bokossa-Neerchal estimators against the model-based standard errors.

Original abstract

Stepped-wedge cluster randomized trials (SW-CRTs) evaluate interventions rolled out across clusters over time. Standard analyses typically use immediate-treatment (IT) models, which assume effects begin at crossover and remain constant thereafter. When effects vary with exposure duration, IT models may misrepresent target effects. Exposure-time indicator (ETI) models address this by allowing treatment effects to differ by time since exposure and by targeting the time-averaged treatment effect (TATE) and long-term effect (LTE). Like IT models, ETI models require specification of a random-effects structure, which is often misspecified, and the performance of robust variance estimators (RVEs) in this setting is not well understood. We review RVEs for ETI models and evaluate them in simulation studies with continuous and binary outcomes under correctly specified (binary only) and misspecified random-effects structures. We compare the classic sandwich, Kauermann-Carroll (KC), Mancl-DeRouen (MD), and Morel-Bokossa-Neerchal (MBN) estimators for inference on the TATE and LTE. Our simulations show that under misspecified random-effects structures, model-based standard errors (SE) produced undercoverage, whereas RVEs improved performance. For continuous outcomes, MD with a t-distribution and degrees of freedom equal to the number of clusters minus two gave the most consistent coverage probabilities. For binary outcomes, MBN was the only consistently reliable option. MD, however, could be unstable in one-cluster-per-sequence designs because of data sparsity. Across scenarios, both model-based SE and RVE for LTE were unstable, indicating that greater caution is needed when targeting LTE under ETI models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper evaluates robust variance estimators (classic sandwich, KC, MD, MBN) for exposure-time indicator (ETI) models in stepped-wedge cluster randomized trials (SW-CRTs) that allow time-varying treatment effects. It targets the time-averaged treatment effect (TATE) and long-term effect (LTE) under misspecified random-effects structures via Monte Carlo simulations for continuous and binary outcomes. Key results indicate model-based SEs undercover while RVEs improve coverage; MD with t-distribution (df = clusters-2) performs best for continuous outcomes, MBN is most reliable for binary, but LTE estimates are unstable across scenarios and MD can be unstable in sparse one-cluster-per-sequence designs.

Significance. If the simulation findings hold, the work offers timely practical guidance for choosing small-sample corrections in SW-CRT analyses when random-effects assumptions are violated and time-varying effects are present. The Monte Carlo design directly compares finite-sample performance of multiple RVEs on both TATE and LTE, addressing a gap where theoretical results are limited; this empirical evidence can help analysts avoid undercoverage in real trials.

major comments (2)

[Methods / Simulation design] Simulation design (Methods section): the data-generating processes specify particular random-effects misspecifications and exposure-time effect patterns, but no sensitivity analyses are reported for stronger cluster-size heterogeneity, higher ICC variability, or non-monotonic time-varying effects outside the simulated envelope. Because the headline recommendations (MD+t best for continuous; MBN only reliable for binary) rest entirely on these grids, limited coverage of realistic SW-CRT conditions weakens generalizability of the performance claims.
[Results] Results on LTE (Results section): both model-based SE and all RVEs for the long-term effect are reported as unstable across scenarios, yet the paper provides no quantitative thresholds (e.g., minimum number of clusters or periods) or alternative estimators that would make LTE inference reliable. This instability directly affects the central claim that ETI models can target LTE, so explicit guidance or caveats are needed to avoid over-interpretation.

minor comments (2)

[Introduction] The abstract and introduction use TATE and LTE without an early formal definition or equation; adding a brief display equation in the Introduction would improve readability for readers unfamiliar with ETI models.
[Results] Table captions (Results) should explicitly state the number of Monte Carlo replications and the exact coverage target (e.g., 95%) so that reported probabilities can be interpreted without returning to the Methods text.
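
The request to report the number of replications matters because an empirical coverage probability is itself an estimate with Monte Carlo error. The sketch below shows the standard binomial calculation; the function names, the toy intervals, and the replication counts are hypothetical and are not the paper's settings.

```python
import math

def coverage_summary(intervals, truth):
    """Empirical coverage of a list of (lower, upper) intervals and its Monte Carlo SE."""
    hits = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    n_rep = len(intervals)
    coverage = hits / n_rep
    mc_se = math.sqrt(coverage * (1 - coverage) / n_rep)
    return coverage, mc_se

def mc_se_at_nominal(n_rep, nominal=0.95):
    """Monte Carlo SE of a coverage estimate when true coverage equals the nominal level."""
    return math.sqrt(nominal * (1 - nominal) / n_rep)

# Four toy intervals around a true effect of 0: two cover it, two do not.
toy = [(-1.0, 1.0), (0.5, 2.0), (-2.0, -0.1), (-0.2, 0.3)]
coverage, mc_se = coverage_summary(toy, truth=0.0)
print(coverage, mc_se)

# How much coverage can wobble by chance for a few replication counts.
for n_rep in (500, 1000, 2000):
    print(n_rep, round(mc_se_at_nominal(n_rep), 4))
```

With 1000 replications the Monte Carlo standard error at 95 percent nominal coverage is about 0.7 percentage points, so reported coverages within roughly 1.4 points of 95 percent are hard to distinguish from nominal.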

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript evaluating robust variance estimators for exposure-time indicator models in stepped-wedge cluster randomized trials. The feedback on simulation scope and long-term effect stability is helpful, and we have revised the paper accordingly. We address each major comment point by point below.

Referee: Simulation design (Methods section): the data-generating processes specify particular random-effects misspecifications and exposure-time effect patterns, but no sensitivity analyses are reported for stronger cluster-size heterogeneity, higher ICC variability, or non-monotonic time-varying effects outside the simulated envelope. Because the headline recommendations (MD+t best for continuous; MBN only reliable for binary) rest entirely on these grids, limited coverage of realistic SW-CRT conditions weakens generalizability of the performance claims.

Authors: We agree that the simulation grid does not cover every conceivable SW-CRT variation, including more extreme cluster-size heterogeneity, wider ICC ranges, or non-monotonic exposure-time patterns. Our design prioritized representative misspecifications and monotonic patterns to generate practical recommendations under conditions where random-effects assumptions are violated. To improve transparency, we will expand the Discussion section with an explicit description of the simulated envelope and clear statements limiting the applicability of the MD+t and MBN recommendations to settings similar to those examined.

revision: partial

Referee: Results on LTE (Results section): both model-based SE and all RVEs for the long-term effect are reported as unstable across scenarios, yet the paper provides no quantitative thresholds (e.g., minimum number of clusters or periods) or alternative estimators that would make LTE inference reliable. This instability directly affects the central claim that ETI models can target LTE, so explicit guidance or caveats are needed to avoid over-interpretation.

Authors: We acknowledge that the observed instability of LTE estimates across scenarios requires more explicit guidance to prevent over-interpretation. The manuscript already states that greater caution is warranted when targeting the LTE, but we will add quantitative summaries drawn from the existing simulation results (e.g., scenarios in which coverage for the LTE approached nominal levels only when the number of clusters exceeded a certain threshold) and will recommend considering immediate-treatment models when long-term effects are the primary target and time-varying patterns are plausible. These additions will be placed in the Results and Discussion sections.

revision: yes

Circularity Check

0 steps flagged

No circularity: claims rest on independent Monte Carlo simulations

The paper evaluates robust variance estimators for exposure-time indicator models in stepped-wedge trials exclusively through simulation studies that generate data under controlled random-effects misspecifications and compare coverage and stability of model-based SEs versus RVEs (classic sandwich, KC, MD, MBN). These simulations constitute external benchmarks independent of the target performance metrics; no equations reduce TATE or LTE inference to fitted parameters by construction, no self-definitional loops appear in the estimator definitions, and no load-bearing self-citations or imported uniqueness theorems are invoked to force the reported conclusions. The derivation chain is therefore self-contained as an empirical comparison rather than a tautological renaming or fit.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claims rest on simulation-based evaluation rather than analytical proofs. The data-generating processes assume an exposure-time indicator structure with added misspecification only in the random-effects component.
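
As a rough picture of the exposure-time structure such a data-generating process builds on, the sketch below lays out exposure times for a standard stepped-wedge design and counts how many cluster-periods are observed at each exposure time. The layout assumptions (one more period than sequences, all clusters starting in control, one sequence crossing over per period) and the function name are illustrative only; the paper's actual simulation settings are not given on this page.

```python
from collections import Counter

def exposure_times(n_sequences, clusters_per_sequence=1):
    """Exposure time of each cluster-period in a standard stepped-wedge layout.

    Assumes n_sequences + 1 periods, every cluster starting in control, and
    sequence q (1-based) crossing over at the start of period q + 1.
    """
    n_periods = n_sequences + 1
    layout = []
    for q in range(1, n_sequences + 1):
        row = [max(0, j - q) for j in range(1, n_periods + 1)]
        layout.extend(list(row) for _ in range(clusters_per_sequence))
    return layout

layout = exposure_times(n_sequences=4)
for row in layout:
    print(row)

# Number of cluster-periods observed at each positive exposure time.
counts = Counter(s for row in layout for s in row if s > 0)
print(dict(sorted(counts.items())))
```

In this four-sequence, one-cluster-per-sequence layout the counts are 4, 3, 2, and 1 for exposure times 1 through 4, so the longest exposure time is observed in a single cluster-period. That sparsity is one plausible contributor to unstable inference at the longest exposure time, though the page does not attribute the LTE instability to it directly.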

free parameters (1)

Simulation design parameters (number of clusters, periods, effect sizes, intraclass correlations)
Chosen by the authors to represent typical stepped-wedge trial settings and varied across scenarios to test estimator performance.

axioms (1)

domain assumption: The exposure-time indicator model correctly captures the time-varying treatment effects in the data-generating process
Invoked when generating simulated data to isolate the effect of random-effects misspecification on the estimators.
