
arxiv: 2501.18383 · v3 · submitted 2025-01-30 · 📊 stat.ME · stat.AP

A tutorial on conducting sample size and power calculations for detecting treatment effect heterogeneity in cluster randomized trials with linear mixed models

Pith reviewed 2026-05-23 04:38 UTC · model grok-4.3

keywords cluster randomized trials · treatment effect heterogeneity · sample size calculation · power analysis · linear mixed models · intracluster correlation · stepped wedge design · R Shiny calculator

The pith

This tutorial consolidates sample size and power formulas for testing treatment effect heterogeneity in cluster randomized trials via linear mixed models and supplies an R Shiny calculator.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper brings together recently derived formulas for power and sample size when the goal is to detect heterogeneity of treatment effects, rather than only the average treatment effect, in cluster randomized trials. These calculations apply to linear mixed effects models across single-period and multi-period parallel designs, crossover designs, and stepped-wedge designs, and cover both continuous and binary outcomes. Because the formulas require extra design parameters, especially intracluster correlation coefficients for both the outcome and the effect-modifying covariate, the tutorial also supplies an online calculator to reduce the barrier to use. A sympathetic reader would care because pre-specified heterogeneity analyses are increasingly common in community trials, yet without proper power planning those analyses risk being inconclusive.

Core claim

The authors consolidate separate power and sample size formulas for testing treatment-covariate interactions or differences in subpopulation-specific treatment effects in cluster randomized trials using linear mixed effects models, demonstrate their application through an R Shiny calculator, and highlight the sensitivity of results to accurate intracluster correlation estimates for both outcomes and covariates.

What carries the argument

The online R Shiny calculator that implements the design-specific sample size and power formulas for HTE testing in CRTs with LME models, taking as inputs the relevant ICCs, effect sizes, and cluster parameters.

If this is right

  • Trial designers can now calculate the number of clusters and cluster sizes needed to power pre-specified HTE analyses in the main CRT designs.
  • Power estimates depend strongly on the ICC values chosen for both the outcome and the covariate.
  • The same consolidated approach covers continuous and binary outcomes across parallel, crossover, and stepped-wedge structures.
  • The calculator lowers the practical barrier to performing these calculations before a trial begins.
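
To make the ICC dependence concrete, here is a minimal Python sketch for the simplest case: a single-period parallel design with a continuous outcome and one continuous effect modifier. The variance expression is the form usually attributed to the earlier single-period derivation that the tutorial consolidates; it is written from memory, not taken from this page, so it should be checked against the paper before use. The function names, default values, and the normal approximation (ignoring the far tail of the two-sided test) are assumptions of this sketch and are not the calculator's API.

```python
from math import ceil, sqrt
from statistics import NormalDist


def hte_variance(m, icc_y, icc_x, var_y=1.0, var_x=1.0, pi=0.5):
    """n * Var(interaction estimator) under the assumed formula.

    m      : cluster size (equal across clusters, an assumption)
    icc_y  : outcome ICC, conditional on the covariate
    icc_x  : covariate ICC
    var_y  : total outcome variance conditional on the covariate
    var_x  : covariate variance
    pi     : proportion of clusters randomized to treatment
    """
    num = var_y * (1 - icc_y) * (1 + (m - 1) * icc_y)
    den = m * var_x * pi * (1 - pi) * (
        1 + (m - 2) * icc_y - (m - 1) * icc_x * icc_y
    )
    return num / den


def hte_power(n, m, delta, icc_y, icc_x, alpha=0.05, **kw):
    """Approximate power of a two-sided Wald test for interaction size delta."""
    nd = NormalDist()
    se = sqrt(hte_variance(m, icc_y, icc_x, **kw) / n)
    return nd.cdf(abs(delta) / se - nd.inv_cdf(1 - alpha / 2))


def hte_clusters(m, delta, icc_y, icc_x, alpha=0.05, power=0.8, **kw):
    """Smallest number of clusters reaching the target power (same approximation)."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    return ceil(z ** 2 * hte_variance(m, icc_y, icc_x, **kw) / delta ** 2)


if __name__ == "__main__":
    # Illustrative inputs only: sweep the covariate ICC at a fixed outcome ICC.
    for icc_x in (0.0, 0.25, 0.5, 1.0):
        print(icc_x, round(hte_power(30, 20, 0.2, icc_y=0.05, icc_x=icc_x), 3))
```

Under this formula, a covariate ICC of 1 (a cluster-level modifier) recovers the familiar design effect 1 + (m − 1)ρ, while smaller covariate ICCs give higher power for the same number of clusters.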

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future methodological work could test whether the formulas remain accurate when the linear mixed model assumptions are mildly violated in real cluster data.
  • Routine collection of pilot estimates for covariate ICCs, in addition to outcome ICCs, would become a standard part of CRT planning.
  • The calculator framework could be extended to allow users to upload their own simulation-based power checks for non-standard designs.
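
As an illustration of what a pilot ICC estimate might involve, the sketch below applies the standard one-way ANOVA estimator to equally sized clusters. It is a generic textbook estimator, not a method taken from the paper; the function name and input layout (a list of per-cluster value lists) are assumptions. Passing covariate values gives a rough covariate ICC, and passing outcome values gives an unadjusted outcome ICC, which may differ from the covariate-adjusted ICC that a power formula expects.

```python
from statistics import mean


def anova_icc(clusters):
    """One-way ANOVA ICC estimate for k clusters of equal size m.

    clusters: list of lists, one inner list of measurements per cluster.
    Returns (MSB - MSW) / (MSB + (m - 1) * MSW); the estimate can be negative.
    """
    k = len(clusters)
    if k < 2:
        raise ValueError("need at least two clusters")
    m = len(clusters[0])
    if m < 2 or any(len(c) != m for c in clusters):
        raise ValueError("clusters must share one size of at least two")

    grand = mean([v for c in clusters for v in c])
    cluster_means = [mean(c) for c in clusters]

    # Between-cluster and within-cluster mean squares.
    msb = m * sum((cm - grand) ** 2 for cm in cluster_means) / (k - 1)
    msw = sum(
        (v - cm) ** 2 for c, cm in zip(clusters, cluster_means) for v in c
    ) / (k * (m - 1))

    denom = msb + (m - 1) * msw
    if denom == 0:
        raise ValueError("no variation in the data; ICC is undefined")
    return (msb - msw) / denom
```

Unequal cluster sizes and longitudinal designs need more careful estimators (for example, fitting a mixed model), so this is only a starting point for balanced pilot data.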

Load-bearing premise

Users will be able to supply accurate estimates of the intracluster correlation coefficients for both the outcome and the effect-modifying covariate.

What would settle it

Running the published formulas by hand for a stepped-wedge design with a continuous outcome and finding that the calculator's output power differs by more than sampling error from the hand calculation for the same inputs.

Original abstract

Cluster-randomized trials (CRTs) are a well-established class of designs for evaluating community-based interventions. An essential task in planning these trials is determining the number of clusters and cluster sizes needed to achieve sufficient statistical power for detecting a clinically relevant effect size. While methods for evaluating the average treatment effect (ATE) for the entire study population are well-established, sample size methods for testing heterogeneity of treatment effects (HTEs), i.e., treatment-covariate interaction or difference in subpopulation-specific treatment effects, in CRTs have only recently been developed. For pre-specified analyses of HTEs in CRTs, effect-modifying covariates should, ideally, be accompanied by sample size or power calculations to ensure the trial has adequate power for the planned analyses. Power analysis for testing HTEs is more complex than for ATEs due to the additional design parameters that must be specified. Power and sample size formulas for testing HTEs via linear mixed effects (LME) models have been separately derived for different cluster-randomized designs, including single and multi-period parallel designs, crossover designs, and stepped-wedge designs, and for continuous and binary outcomes. This tutorial provides a consolidated reference guide for these methods and enhances their accessibility through an online R Shiny calculator. We further discuss key considerations for conducting sample size and power calculations to test pre-specified HTE hypotheses in CRTs, highlighting the importance of specifying advanced estimates of intracluster correlation coefficients for both outcomes and covariates, and their implications for power. The sample size methodology and calculator functionality are demonstrated through a real CRT example.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. This tutorial consolidates power and sample size formulas for testing pre-specified treatment effect heterogeneity (HTE) via linear mixed models in cluster-randomized trials. It covers single- and multi-period parallel designs, crossover designs, and stepped-wedge designs, for both continuous and binary outcomes. The paper provides an R Shiny calculator to implement the methods, discusses practical issues including the need for accurate intracluster correlation coefficients (ICCs) for the outcome and the effect-modifying covariate, and demonstrates the approach with a real CRT example.

Significance. If the cited derivations are represented accurately and the calculator implements them correctly, the manuscript supplies a consolidated, accessible reference that fills a practical gap: while ATE power methods for CRTs are mature, HTE methods have appeared only recently and separately. The explicit foregrounding of ICC sensitivity for both outcome and covariate, together with the online tool, should improve the quality of sample-size planning for HTE analyses in future CRTs. The provision of reproducible code (Shiny app) is a clear strength.

minor comments (3)
  1. [Introduction / §2] The abstract and introduction state that formulas were 'separately derived' for different designs and outcomes; a short table or appendix listing the original references for each formula (with equation numbers) would help readers trace the derivations without searching the cited papers.
  2. [Shiny calculator section] In the description of the Shiny app inputs, the mapping between user-supplied ICC values and the variance components appearing in the power formulas is not shown explicitly; adding a small schematic or equation reference next to each input field would reduce the chance of mis-specification.
  3. [Example section] The real-CRT example reports power curves but does not tabulate the exact ICC values used for the outcome and covariate; including these numerical values (and the source of the estimates) would allow readers to reproduce the displayed results directly from the formulas.
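
For readers unfamiliar with the ICC-to-variance-component mapping raised in comment 2, the block below gives the textbook relationship for the simplest case, a single-period random-intercept model with treatment indicator W_i and covariate X_ij. The notation is an assumption chosen for illustration and may not match the paper's symbols or the calculator's input labels.

```latex
Y_{ij} = \beta_1 + \beta_2 W_i + \beta_3 X_{ij} + \beta_4 W_i X_{ij}
         + \gamma_i + \varepsilon_{ij},
\qquad \gamma_i \sim N(0, \sigma^2_\gamma),
\quad \varepsilon_{ij} \sim N(0, \sigma^2_\varepsilon)

\sigma^2_{y|x} = \sigma^2_\gamma + \sigma^2_\varepsilon,
\qquad
\rho_{y|x} = \frac{\sigma^2_\gamma}{\sigma^2_\gamma + \sigma^2_\varepsilon},
\qquad
\rho_x = \operatorname{Corr}(X_{ij}, X_{ik}), \quad j \neq k
```

Here β_4 is the interaction (HTE) parameter being tested, ρ_{y|x} is the outcome ICC conditional on the covariate, and ρ_x is the covariate ICC. Multi-period and stepped-wedge designs involve additional correlation parameters that this sketch does not cover.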

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive evaluation of the manuscript and for recommending acceptance. We are pleased that the consolidation of power and sample size methods for HTE testing in CRTs, along with the R Shiny calculator and emphasis on ICC sensitivity, is recognized as addressing a practical gap.

Circularity Check

0 steps flagged

No significant circularity; tutorial consolidates external derivations

full rationale

The paper is explicitly a tutorial that consolidates power and sample size formulas previously derived separately for HTE testing in CRTs across designs and outcome types. It makes no new first-principles derivations or predictions that reduce to its own fitted inputs or self-citations. The abstract states the formulas 'have been separately derived' and positions the contribution as a reference guide plus R Shiny calculator, with discussion of ICC sensitivity as a practical point. No load-bearing step equates outputs to inputs by construction, and the reader's assessment of score 0.0 aligns with the provided text.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The tutorial rests on standard linear mixed model assumptions for clustered data and on the requirement that users provide realistic intracluster correlation estimates; no new entities are introduced.

free parameters (1)
  • intracluster correlation coefficients for outcome and covariate
    These must be specified by the user and directly determine the required sample size in the consolidated formulas.
axioms (1)
  • domain assumption: Linear mixed effects models appropriately capture the clustering structure in CRT data for both outcomes and covariates.
    The tutorial focuses exclusively on LME-based power calculations.



