A web application for the design of multi-arm clinical trials

James MS Wason; Michael J Grayling

arxiv: 1906.09178 · v1 · pith:MZKB5DIFnew · submitted 2019-06-21 · 📊 stat.CO

Pith reviewed 2026-05-25 18:19 UTC · model grok-4.3

classification 📊 stat.CO

keywords multi-arm clinical trials · sample size calculation · multiple comparison corrections · power calculation · clinical trial design · web application · allocation ratios

The pith

A free web application performs sample size calculations for multi-arm clinical trials under multiple comparison corrections.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a web application developed to simplify the design of multi-arm clinical trials by handling sample size calculations. It incorporates a range of popular methods to adjust for multiple comparisons and supports calculations that control different varieties of power. The tool also determines optimized allocation ratios across treatment arms. This addresses the challenge of selecting among numerous possible multi-arm designs when suitable software has been limited. The application is intended to make these designs more practical for statisticians and clinicians by providing key operating characteristics without requiring specialized programming skills.

Core claim

The authors have developed a web application for sample size calculation when using a variety of popular multiple comparison corrections. The application supports sample size calculation to control several varieties of power, as well as the determination of optimised arm-wise allocation ratios. It is free to access on any device with an internet browser and requires no programming knowledge to use. The application provides the core information required by statisticians and clinicians to review the operating characteristics of a chosen multi-arm clinical trial design.
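As a rough illustration of the kind of calculation described, the sketch below computes a per-arm sample size for K experimental arms compared against a shared control, using a Bonferroni correction. It is not taken from the application: the function name `bonferroni_n_per_arm`, the one-sided z-test with known variance, equal allocation, and the choice to power each comparison marginally are all assumptions made for the example.

```python
from math import ceil
from statistics import NormalDist


def bonferroni_n_per_arm(k, delta, sigma, alpha=0.05, power=0.8):
    """Per-arm sample size for k one-sided z-tests against a shared control.

    Assumptions (illustrative, not the application's method): known common
    standard deviation `sigma`, equal allocation to every arm, and marginal
    power `power` for each comparison at treatment effect `delta`.
    """
    std_normal = NormalDist()
    # Bonferroni: each of the k comparisons is tested at level alpha / k.
    z_alpha = std_normal.inv_cdf(1 - alpha / k)
    z_beta = std_normal.inv_cdf(power)
    # Standard two-sample formula: n = 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


if __name__ == "__main__":
    for arms in (1, 2, 3, 4):
        n = bonferroni_n_per_arm(arms, delta=0.5, sigma=1.0)
        print(f"K = {arms}: n per arm = {n}")
```

With K = 1 the formula reduces to the usual two-arm calculation, which gives a simple sanity check; the required sample size then grows with K because each comparison is tested at a stricter level.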

What carries the argument

The web application that implements sample size calculations for multi-arm designs while applying multiple comparison corrections and optimizing arm allocations.

If this is right

Users can more readily evaluate operating characteristics such as power and error rates for chosen multi-arm designs.
Optimized arm-wise allocation ratios can be identified to meet trial objectives efficiently.
Sample sizes can be calculated to achieve a chosen type of power (for example, disjunctive power) while controlling the per-comparison or family-wise error rate, as needed.
The tool may facilitate greater use of multi-arm designs by reducing the software barrier to their planning.
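To make the idea of an optimised allocation ratio concrete, the sketch below checks one classical result: when K experimental arms share a control and the total sample size is fixed, the variance of each treatment-versus-control difference is minimised by allocating the control arm sqrt(K) times as many patients as each experimental arm. The optimality criteria used by the application may differ; the function names and the grid search here are assumptions made for illustration.

```python
from math import sqrt


def pairwise_variance(r, k, total_n, sigma=1.0):
    """Variance of one treatment-vs-control mean difference.

    `r` is the control : experimental allocation ratio, `k` the number of
    experimental arms, `total_n` the fixed total sample size.
    """
    n_experimental = total_n / (r + k)   # patients in each experimental arm
    n_control = r * n_experimental       # patients in the shared control arm
    return sigma ** 2 * (1 / n_control + 1 / n_experimental)


def best_ratio_on_grid(k, total_n, steps=5000):
    """Grid search over r in (0, 5] in increments of 0.001 for the minimising ratio."""
    grid = [i / 1000 for i in range(1, steps + 1)]
    return min(grid, key=lambda r: pairwise_variance(r, k, total_n))


if __name__ == "__main__":
    for arms in (2, 3, 4):
        found = best_ratio_on_grid(arms, total_n=500)
        print(f"K = {arms}: grid optimum r = {found:.3f}, sqrt(K) = {sqrt(arms):.3f}")
```

The grid search is deliberately naive; it is only there to confirm the closed-form sqrt(K) rule numerically.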

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Broader adoption could allow more simultaneous testing of treatments within single trials, potentially conserving patient resources across disease areas.
The interface design implies that non-statisticians might now participate more directly in reviewing and adjusting multi-arm trial parameters.
The focus on multiple power types suggests the application could support both confirmatory and exploratory goals within the same design framework.

Load-bearing premise

The statistical procedures for sample size calculation, multiple comparison corrections, and power control are correctly implemented and suitable for real clinical trial planning.

What would settle it

An independent manual calculation or simulation for a specific multi-arm design with a known multiple comparison method that produces sample sizes or allocation ratios differing from those output by the application.
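A check of this kind could be sketched as a small Monte Carlo simulation. The example below estimates the family-wise error rate and the disjunctive and conjunctive power of a Bonferroni-corrected design with a shared control, so that the estimates can be compared with whatever a design tool reports. It assumes normally distributed outcomes with known variance, one-sided z-tests, and equal allocation; the function name `simulate_rejections` and all default values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_rejections(effects, n, sigma=1.0, alpha=0.05, sims=20000, seed=1):
    """Estimate P(at least one rejection) and P(all rejected).

    `effects` holds the true treatment effects of each experimental arm
    relative to control; `n` is the per-arm sample size. Each comparison is
    a one-sided z-test at the Bonferroni level alpha / K.
    """
    rng = random.Random(seed)
    k = len(effects)
    critical = NormalDist().inv_cdf(1 - alpha / k)
    se_mean = sigma / sqrt(n)        # standard error of one arm mean
    se_diff = sigma * sqrt(2 / n)    # standard error of a difference in means
    any_rejected = 0
    all_rejected = 0
    for _ in range(sims):
        # The control mean is drawn once and shared by all comparisons,
        # which induces the positive correlation between test statistics.
        control_mean = rng.gauss(0.0, se_mean)
        rejections = [
            (rng.gauss(effect, se_mean) - control_mean) / se_diff > critical
            for effect in effects
        ]
        any_rejected += any(rejections)
        all_rejected += all(rejections)
    return any_rejected / sims, all_rejected / sims


if __name__ == "__main__":
    fwer, _ = simulate_rejections([0.0, 0.0, 0.0], n=71)
    disjunctive, conjunctive = simulate_rejections([0.5, 0.5, 0.5], n=71)
    print(f"Estimated FWER under the global null: {fwer:.3f}")
    print(f"Estimated disjunctive power: {disjunctive:.3f}")
    print(f"Estimated conjunctive power: {conjunctive:.3f}")
```

Under the global null, the first returned value estimates the family-wise error rate; under a non-null configuration, the two values estimate disjunctive and conjunctive power respectively. A material disagreement between such estimates and the application's output, beyond Monte Carlo error, would be the kind of evidence described above.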

Figures

Figures reproduced from arXiv: 1906.09178 by James MS Wason, Michael J Grayling.

**Figure 2.** Design summary box. The box in which a summary of the input parameters …

**Figure 3.** Operating characteristics summary. The boxes in which a summary of the …

**Figure 4.** Operating characteristics plots. The boxes in which plots of the identified …

Original abstract

Multi-arm designs provide an effective means of evaluating several treatments within the same clinical trial. Given the large number of treatments now available for testing in many disease areas, it has been argued that their utilisation should increase. However, for any given clinical trial there are numerous possible multi-arm designs that could be used, and choosing between them can be a difficult task. This task is complicated further by a lack of available easy-to-use software for designing multi-arm trials. To aid the wider implementation of multi-arm clinical trial designs, we have developed a web application for sample size calculation when using a variety of popular multiple comparison corrections. Furthermore, the application supports sample size calculation to control several varieties of power, as well as the determination of optimised arm-wise allocation ratios. It is built using the Shiny package in the R programming language, is free to access on any device with an internet browser, and requires no programming knowledge to use. The application provides the core information required by statisticians and clinicians to review the operating characteristics of a chosen multi-arm clinical trial design. We hope that it will assist with the future utilisation of such designs in practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper describes a new Shiny web app for multi-arm trial sample size calculations but provides no checks on whether the numbers are correct.

The paper describes a web application built with Shiny for calculating sample sizes in multi-arm clinical trials. It includes options for popular multiple comparison corrections, control of different power types, and optimized allocation ratios across arms. What stands out is the focus on accessibility: it's free, runs in any browser, and needs no programming. This directly tackles the problem mentioned in the abstract that easy-to-use software for these designs has been missing. The authors do a good job outlining the features and how the app can help statisticians and clinicians review operating characteristics.

The soft spot is the absence of any supporting evidence for correctness. There are no formulas presented, no code, and no comparisons or simulations to show that the calculations match known methods. The stress-test note is right on this point. Users would have to trust the implementation or verify it themselves.

This is for readers who design or review multi-arm trials and want a convenient tool for initial explorations. It could save time for those without strong coding skills.

I recommend sending it to peer review. The practical need is clear, and referees can assess whether the app delivers on its claims once the implementation details are examined.

Referee Report

2 major / 2 minor

Summary. The manuscript describes the development of a free Shiny web application in R for designing multi-arm clinical trials. The tool performs sample size calculations that incorporate a variety of popular multiple comparison corrections, supports control of several power definitions, and determines optimized arm-wise allocation ratios. It is presented as accessible via any web browser with no programming required, with the goal of supplying the core operating characteristics information needed by statisticians and clinicians to evaluate such designs.

Significance. If the underlying calculations are correctly implemented, the application could meaningfully lower the barrier to using multi-arm trial designs in practice by providing an accessible interface for complex sample-size and power calculations. The work is primarily a software contribution rather than a methodological advance, so its significance is tied directly to demonstrated reliability and usability rather than novel statistical results.

major comments (2)

[Abstract] Abstract: the central claim that the application 'provides the core information required by statisticians and clinicians' rests on an unverified implementation; the manuscript supplies neither explicit formulas for the supported multiple-comparison corrections and power types nor any numerical checks against published tables or analytic results.
[Implementation / functionality description] The description of the application's functionality (throughout the manuscript) contains no R code snippets, pseudocode, or verification examples that would allow independent confirmation that the sample-size calculations match standard methods for the listed corrections and power definitions.

minor comments (2)

A table or appendix listing the exact multiple-comparison procedures, power definitions, and allocation optimization methods supported by the app would improve clarity and allow readers to assess coverage without running the application.
Consider adding a short reproducibility statement indicating whether the source code for the Shiny app is publicly available (e.g., on GitHub) and whether any unit tests or validation scripts are included.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments and recommendation. We agree that verification is important for a software contribution and will revise the manuscript accordingly to include formulas, pseudocode, and numerical checks. We respond to each major comment below.

Referee: [Abstract] Abstract: the central claim that the application 'provides the core information required by statisticians and clinicians' rests on an unverified implementation; the manuscript supplies neither explicit formulas for the supported multiple-comparison corrections and power types nor any numerical checks against published tables or analytic results.

Authors: We accept the point that the manuscript did not provide explicit formulas or numerical verification to support the abstract claim. The calculations rely on standard methods (e.g., Dunnett, Bonferroni corrections; disjunctive and conjunctive power), but these were not detailed. In revision we will add a dedicated section with the formulas for all supported corrections and power types, plus numerical checks against published tables or analytic results to substantiate the claim.

revision: yes

Referee: [Implementation / functionality description] The description of the application's functionality (throughout the manuscript) contains no R code snippets, pseudocode, or verification examples that would allow independent confirmation that the sample-size calculations match standard methods for the listed corrections and power definitions.

Authors: We agree that the absence of pseudocode or verification examples limits independent confirmation. We will add pseudocode for the core sample-size algorithm (including allocation optimization) and specific verification examples demonstrating agreement with standard methods. We will also note the R functions or packages used for the computations.

revision: yes
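For orientation, the quantities named in the exchange above are commonly defined as follows. The notation is assumed here rather than taken from the manuscript: K experimental arms with treatment effects \tau_k relative to control, null hypotheses H_k : \tau_k \le 0, a target effect \delta > 0, and a nominal level \alpha. The manuscript may evaluate power under a different alternative configuration.

```latex
% Assumed notation; standard textbook definitions, not the manuscript's own.
\begin{align*}
\text{FWER} &= \Pr(\text{reject at least one true } H_k), \\
\text{Disjunctive power} &= \Pr(\text{reject at least one } H_k \mid \tau_1 = \dots = \tau_K = \delta), \\
\text{Conjunctive power} &= \Pr(\text{reject all of } H_1, \dots, H_K \mid \tau_1 = \dots = \tau_K = \delta), \\
\alpha_{\mathrm{Bonferroni}} &= \frac{\alpha}{K}, \qquad
\alpha_{\mathrm{Sidak}} = 1 - (1 - \alpha)^{1/K}.
\end{align*}
```

The last line gives the per-comparison significance levels under the Bonferroni and Šidák corrections; the Šidák level is slightly larger than the Bonferroni level for any K > 1.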

Circularity Check

0 steps flagged

No circularity: software description with no derivations or fitted predictions

The paper presents a web application (Shiny/R) for multi-arm trial sample-size calculations under various multiple-comparison corrections, power definitions, and allocation ratios. No equations, derivations, parameter fits, or 'predictions' are claimed; the work is purely descriptive of an implementation. No self-citation load-bearing steps, uniqueness theorems, or ansatzes appear. The central claim reduces only to the existence of the tool and its menu of standard methods, which is independent of any internal reduction to its own inputs. This matches the default non-circular case for implementation papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities because the paper describes software implementation rather than a mathematical or theoretical derivation.

