Multiple testing

Jesse Hemerik

arXiv: 2606.26781 · v1 · pith:5NPU2PFKnew · submitted 2026-06-25 · stat.ME · math.ST · stat.TH

Pith reviewed 2026-06-26 03:16 UTC · model grok-4.3

classification: stat.ME · math.ST · stat.TH

keywords: multiple hypothesis testing, error criteria, testing procedures, family-wise error rate, false discovery rate, R packages

The pith

This text introduces multiple hypothesis testing by covering error criteria and testing procedures with R package references.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper serves as lecture notes providing an introduction to multiple hypothesis testing. It explains various error criteria used when testing many hypotheses simultaneously. It also covers common testing procedures and points to relevant R packages for implementation. The material was developed for a PhD-level course.

Core claim

The text provides an introduction to multiple hypothesis testing. It covers various error criteria and testing procedures, and includes references to relevant R packages.

What carries the argument

Multiple testing procedures that control error rates such as family-wise error rate or false discovery rate when many hypotheses are tested at once.
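To make the distinction between the two error rates concrete, here is a minimal Python sketch of one procedure for each: Bonferroni for the family-wise error rate and the Benjamini-Hochberg step-up procedure for the false discovery rate. The function names and the p-values are hypothetical and chosen only for illustration; they are not taken from the paper, which points to R packages instead. Benjamini-Hochberg is known to control the false discovery rate under independence or positive dependence of the p-values, so that condition is assumed here.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the FWER at level alpha)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, alpha=0.05):
    """Step-up BH: find the largest k with p_(k) <= k * alpha / m,
    then reject the hypotheses with the k smallest p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values, for illustration only.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]

n_bonferroni = sum(bonferroni_reject(example_pvalues))
n_bh = sum(benjamini_hochberg_reject(example_pvalues))
print("Bonferroni rejections:", n_bonferroni)
print("Benjamini-Hochberg rejections:", n_bh)
```

On these illustrative inputs the false discovery rate procedure rejects at least as many hypotheses as Bonferroni, which reflects the usual trade-off: a weaker error criterion buys more rejections.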

If this is right

Users gain the ability to select appropriate error control when performing many simultaneous tests.
Practical implementation is supported by the referenced R packages.
The material supports teaching of multiple testing concepts at an advanced level.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The notes could serve as a foundation for researchers entering fields that require high-dimensional testing.
They highlight the need to match error criteria to the scientific goal of the analysis.
Similar lecture notes might be adapted for other statistical topics with software examples.

Load-bearing premise

The descriptions of error criteria and testing procedures accurately reflect established methods in the statistical literature.

What would settle it

A demonstration that one of the described procedures fails to control the stated error rate under the conditions given in the text.
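One way such a check could be set up is a small Monte Carlo simulation. The sketch below estimates the family-wise error rate of a fixed per-test threshold when all null hypotheses are true. It assumes independent Uniform(0, 1) p-values, and the function name, number of tests, and simulation size are hypothetical choices made for illustration rather than settings from the paper.

```python
import random


def estimate_fwer(threshold, m=20, n_sim=20000, seed=1):
    """Estimate P(at least one p-value <= threshold) under the global null.

    Assumption: the m p-values are independent and Uniform(0, 1).
    """
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_sim):
        if any(rng.random() <= threshold for _ in range(m)):
            errors += 1
    return errors / n_sim


alpha = 0.05
m = 20
fwer_uncorrected = estimate_fwer(alpha, m=m)      # every test at level alpha
fwer_bonferroni = estimate_fwer(alpha / m, m=m)   # Bonferroni threshold
print("uncorrected:", fwer_uncorrected)
print("Bonferroni: ", fwer_bonferroni)
```

Under these assumptions the exact values are 1 - (1 - 0.05)^20 ≈ 0.64 without correction and 1 - (1 - 0.0025)^20 ≈ 0.049 with Bonferroni, so the estimates should land near those numbers. An estimate well above the nominal level for a procedure that claims control would be the kind of evidence described above.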

Figures

Figures reproduced from arXiv: 2606.26781 by Jesse Hemerik.

**Figure 2.** Equivalence testing. “Equivalence” means that …

**Figure 3.** Histogram of 5! = 120 test statistics based on permuted versions of the …

**Figure 4.** Histogram of 2^6 = 64 test statistics based on sign-flipped versions of the dataset on maize plants.

**Figure 5.** Illustration of the global test using Simes’ inequality (…

**Figure 6.** The sorted values of the smallest 50 p-values among all 279 p-values for …

**Figure 7.** The sorted p-values corresponding to the numerical predictors of the …

**Figure 8.** A hypothesis is a set of distributions. This Venn diagram represents …

**Figure 9.** This figure shows all intersection hypotheses in case there are three …

**Figure 10.** Example data on car models (columns shown include cyl, disp, hp, drat, wt, qsec, vs, am, gear, and carb).

**Figure 11.** Simultaneously permuting the columns corresponding to all variables …

**Figure 12.** For each of the 120 permutations we computed the maximum of the …
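The captions for Figures 3 and 4 refer to test statistics recomputed over every permutation or sign flip of a small dataset. As a rough illustration of the sign-flipping case, the Python sketch below enumerates all 2^6 = 64 sign patterns for six paired differences and computes a one-sided p-value. The data are hypothetical and are not the maize dataset from the paper; the function name and the choice of the sum as test statistic are likewise assumptions made for this example.

```python
from itertools import product


def sign_flip_test(differences):
    """Exact one-sided sign-flipping test using the sum as test statistic.

    Returns the p-value and the list of statistics over all 2^n sign
    patterns. The identity pattern is included, so the observed statistic
    is always counted among the flipped ones.
    """
    observed = sum(differences)
    stats = [
        sum(s * d for s, d in zip(signs, differences))
        for signs in product((1, -1), repeat=len(differences))
    ]
    # Small tolerance guards against floating-point ties with the observed value.
    n_extreme = sum(1 for t in stats if t >= observed - 1e-12)
    return n_extreme / len(stats), stats


# Hypothetical paired differences, for illustration only.
hypothetical_differences = [1.2, 0.4, -0.3, 2.1, 0.9, 1.5]
p_value, flipped_stats = sign_flip_test(hypothetical_differences)
print(len(flipped_stats), "sign patterns, p-value =", p_value)
```

A histogram of `flipped_stats` would play the role of the histograms in the figures. Because the full set of sign flips forms a group and the identity is included, the smallest attainable p-value with six observations is 1/64.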

Original abstract

This text provides an introduction to multiple hypothesis testing. It covers various error criteria and testing procedures, and includes references to relevant R packages. An earlier version of this text served as the lecture notes for a PhD-level course on multiple testing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

These are lecture notes on multiple testing, with no new results or methods.

The punchline for this paper is that it contains no new research at all. It is a write-up of lecture notes for a PhD course on multiple hypothesis testing.

What the paper does well is provide a clear, organized overview of the main concepts. It discusses various error criteria, including the family-wise error rate and false discovery rate, describes several testing procedures, and gives references to R packages that can be used to apply them. This kind of summary can save time for students who are getting started with the topic and need the basics in one place.

The soft spots are limited because the work does not introduce any new methods or derivations that could be flawed. The central requirement is that the descriptions match the existing literature, which they appear to do, judging by the abstract and the nature of the text. There are no circular arguments or invented entities here. One possible minor issue is that lecture notes sometimes skip over nuances or recent papers, but for an introductory text that is not a serious problem.

This paper is for people who want an educational introduction rather than original findings. A reader interested in advancing multiple testing research will not get much from it. It shows clear thinking in how it structures the material, but it is not pushing any boundaries.

I do not think this deserves peer review. It is fine as lecture notes and can be shared that way without needing formal refereeing.

Referee Report

0 major / 2 minor

Summary. The manuscript is an expository introduction to multiple hypothesis testing. It covers error criteria (FWER, FDR and variants), standard procedures (Bonferroni, Holm, Benjamini-Hochberg and related step-up/step-down methods), and points readers to R packages for implementation. The text originated as PhD-level lecture notes.
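As a brief illustration of what a step-down method looks like in practice, the following Python sketch computes Holm-adjusted p-values and compares them with Bonferroni-adjusted ones. The function names and input values are hypothetical; the manuscript itself directs readers to R packages, so this is only a language-neutral sketch of the standard definitions.

```python
def bonferroni_adjusted(pvalues):
    """Bonferroni-adjusted p-values: min(1, m * p_i)."""
    m = len(pvalues)
    return [min(1.0, m * p) for p in pvalues]


def holm_adjusted(pvalues):
    """Holm step-down adjusted p-values.

    The i-th smallest p-value (i = 1, ..., m) is multiplied by (m - i + 1);
    a running maximum keeps the adjusted values monotone in the ordering.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):  # rank 0 is the smallest p-value
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values, for illustration only.
example_pvalues = [0.01, 0.04, 0.03, 0.005]
holm = holm_adjusted(example_pvalues)
bonf = bonferroni_adjusted(example_pvalues)
for p, h, b in zip(example_pvalues, holm, bonf):
    print(f"p = {p:.3f}  Holm = {h:.3f}  Bonferroni = {b:.3f}")
```

A hypothesis is rejected at level alpha when its adjusted p-value is at most alpha. Holm's adjusted values are never larger than Bonferroni's, which is why Holm rejects at least as much while still controlling the family-wise error rate.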

Significance. If the descriptions match the established literature, the manuscript could function as a compact teaching aid for graduate students. Because it advances no new methods, proofs, or empirical results, its contribution to the research literature in statistical methodology is minimal.

minor comments (2)

Add a table of contents or explicit section numbering to improve usability as standalone lecture notes.
Include version numbers or last-update dates for the cited R packages (e.g., multtest, qvalue) so readers can reproduce the examples.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for reviewing our manuscript. We agree that it is an expository introduction based on PhD lecture notes, covering established error criteria and procedures along with R package references, without introducing new methods or results.

Referee: If the descriptions match the established literature, the manuscript could function as a compact teaching aid for graduate students. Because it advances no new methods, proofs, or empirical results, its contribution to the research literature in statistical methodology is minimal.

Authors: We concur that the manuscript does not advance new methodology, proofs, or empirical findings, as its scope is limited to summarizing standard approaches and directing readers to implementations. This aligns with its origin as lecture notes intended for instructional use rather than original research. We maintain that such consolidated expository resources can still offer pedagogical value for students and practitioners seeking an accessible overview.

Revision: no

Circularity Check

0 steps flagged

No circularity: purely expository introduction with no derivations or predictions

The manuscript is an expository introduction to multiple hypothesis testing methods drawn from the established statistical literature. It covers error criteria, procedures, and R packages but contains no derivations, predictions, fitted parameters, or novel claims. The reader's weakest assumption (accurate reflection of standard methods) is external to the paper and does not create internal circularity. No load-bearing steps reduce to self-definition, self-citation chains, or fitted inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an expository text introducing standard concepts in multiple testing, with no new mathematical derivations, parameters, or entities.

pith-pipeline@v0.9.1-grok · 5537 in / 970 out tokens · 43788 ms · 2026-06-26T03:16:32.615347+00:00 · methodology

