BOOST: Power-Optimal Strong-FWER Testing for Block-Structured Multiplicity
Pith reviewed 2026-06-29 15:16 UTC · model grok-4.3
The pith
BOOST is the power-optimal strong-FWER procedure, within the block-separable class, for hypotheses grouped in blocks of size three.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BOOST attains power optimality for block size three by solving an equalized-marginal KKT condition, which equates marginal power contributions across heterogeneous blocks; the resulting allocation yields finite-sample strong-FWER validity at O(K) cost and a strict improvement over Sidak when blocks are independent.
What carries the argument
The equalized-marginal KKT condition that determines the error-rate allocation across blocks to maximize total power subject to strong FWER control.
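One way to write this condition down is sketched here under assumed notation: $\pi_b(\alpha_b)$ stands for the expected power of block $b$ when it is tested at block level $\alpha_b$, and the $B$ block levels share an additive budget $\alpha$. Neither the symbols nor the additive form of the budget are taken from the paper; they are illustrative assumptions.

$$
\max_{\alpha_1,\dots,\alpha_B \ge 0} \; \sum_{b=1}^{B} \pi_b(\alpha_b)
\quad \text{subject to} \quad \sum_{b=1}^{B} \alpha_b \le \alpha,
\qquad
\mathcal{L}(\alpha_1,\dots,\alpha_B,\lambda) = \sum_{b=1}^{B} \pi_b(\alpha_b) - \lambda \Big( \sum_{b=1}^{B} \alpha_b - \alpha \Big).
$$

Stationarity of $\mathcal{L}$ in each $\alpha_b$ gives $\pi_b'(\alpha_b) = \lambda$ for every block with $\alpha_b > 0$ (and $\pi_b'(0) \le \lambda$ otherwise), so all active blocks share the same marginal power per unit of error budget. That shared value $\lambda$ is the quantity a one-dimensional root search would target.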
If this is right
- Finite-sample strong FWER validity holds without any independence assumptions at O(K) cost.
- Under cross-block independence the procedure strictly dominates Sidak in power.
- A sample-split plug-in version controls FWER up to an additive term linear in the sup-norm estimation error of the alternative density.
- Simulations show 1.4-1.7 times the power of the strongest baseline at calibrated FWER, and on two published datasets BOOST certifies an order of magnitude more full-block discoveries than existing baselines.
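The arithmetic behind the Sidak comparison under cross-block independence can be checked directly. The sketch below is not BOOST itself: it only compares the uniform Bonferroni and Sidak per-block levels and the FWER each would spend if blocks were independent. The function names are hypothetical.

```python
def bonferroni_block_level(alpha, n_blocks):
    """Uniform Bonferroni split: valid under any dependence."""
    return alpha / n_blocks


def sidak_block_level(alpha, n_blocks):
    """Uniform Sidak split: exhausts alpha exactly when blocks are independent."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_blocks)


def fwer_under_independence(block_level, n_blocks):
    """Probability of at least one false block rejection across independent null blocks."""
    return 1.0 - (1.0 - block_level) ** n_blocks


if __name__ == "__main__":
    alpha = 0.05
    for n_blocks in (2, 10, 100):
        bonf = bonferroni_block_level(alpha, n_blocks)
        sidak = sidak_block_level(alpha, n_blocks)
        print(n_blocks, bonf, sidak,
              fwer_under_independence(bonf, n_blocks),
              fwer_under_independence(sidak, n_blocks))
```

For any more than one block the Sidak level is strictly larger than the Bonferroni level, which is why independence leaves room for a strict power gain.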
Where Pith is reading between the lines
- The same KKT-based allocation idea may extend to block sizes larger than three once the corresponding optimality conditions are characterized.
- The sample-split plug-in construction suggests a general route for handling unknown alternative distributions in other structured testing settings.
- Applications to genomics and online experiments indicate the procedure can increase the number of certifiable discoveries in any confirmatory analysis whose design already imposes blocks.
Load-bearing premise
The equalized-marginal KKT condition is solvable and produces the global power maximum inside the block-separable class.
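A two-block toy case illustrates why this premise matters. The power curves below are hypothetical and chosen only for their shape: with a concave curve the equal-marginal split is the maximum, while with a convex curve the same stationary point is the worst feasible allocation.

```python
def concave_power(a):
    """Hypothetical block power with diminishing returns."""
    return a ** 0.5


def convex_power(a):
    """Hypothetical block power with increasing returns (violates concavity)."""
    return a ** 2


def total_power(allocation, power):
    """Sum of block powers for a given split of the error budget."""
    return sum(power(a) for a in allocation)


alpha = 0.05
equal_split = [alpha / 2, alpha / 2]   # equal marginals by symmetry for either curve
corner_split = [alpha, 0.0]            # whole budget on one block

if __name__ == "__main__":
    for name, power in (("concave", concave_power), ("convex", convex_power)):
        print(name,
              total_power(equal_split, power),
              total_power(corner_split, power))
```

In the convex case the corner allocation beats the equalized one, so equal marginals alone do not certify a global maximum; some concavity-type condition on the block power functions is needed.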
What would settle it
Any procedure inside the block-separable class that, on the same data, rejects more hypotheses than BOOST while keeping the realized strong FWER at or below the nominal level.
Original abstract
Structured multiple-testing problems (gatekeeping trials, dose-finding, multi-tissue eQTL mapping, bundled-challenger A/B experiments) organize hypotheses into design-imposed blocks and demand strong family-wise error rate (FWER) control for confirmatory claims. Practitioners currently use objective-agnostic stepwise rules (Bonferroni, Holm, Hochberg, Hommel), closed-testing and graphical extensions, or hierarchical and resampling methods; none is power-optimal within the block-separable class these designs induce. We introduce BOOST (Block-Optimal Objective-driven Strong-FWER Testing), the power-optimal strong-FWER procedure for block size three, with three guarantees: (i) finite-sample strong-FWER validity at $O(K)$ cost (versus $O(K^2)$ for general closed testing) without independence assumptions, with a strict Sidak improvement under cross-block independence; (ii) power-optimal allocation across heterogeneous blocks via an equalized-marginal KKT condition, solvable by bisection in $O(B\log(1/\varepsilon))$; and (iii) a sample-split plug-in variant for unknown alternative density $g$, attaining $\alpha$-control up to $O(B_T \mathbb E\|g-\widehat g\|_\infty)$ inflation with per-hypothesis power deficit independent of $B_T$. Simulations across independent, equicorrelated, sparse, and mis-specified regimes show 1.4-1.7$\times$ power gains over the strongest existing baseline at calibrated FWER. On two published datasets (BLUEPRINT cross-lineage cis-eQTL and Upworthy bundled-challenger A/B experiments), BOOST certifies an order of magnitude more full-block discoveries than existing baselines at controlled FWER.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces BOOST, a procedure for strong-FWER control in block-structured multiple testing with block size three. It claims finite-sample strong-FWER validity at O(K) cost without independence assumptions (with Sidak improvement under cross-block independence), power-optimality within the block-separable class via an equalized-marginal KKT condition solved by bisection in O(B log(1/ε)), a sample-split plug-in variant for unknown alternative density g with controlled inflation, and 1.4-1.7× power gains over baselines in simulations plus more discoveries on two real datasets at controlled FWER.
Significance. If the finite-sample validity and global optimality claims hold, the work would advance methodology for confirmatory structured testing (e.g., gatekeeping, eQTL, A/B experiments) by providing the first power-optimal rule in the block-separable class together with linear-time computation, offering both theoretical and practical improvements over stepwise, closed-testing, and graphical methods.
major comments (3)
- [Abstract] Abstract: finite-sample strong-FWER validity is asserted with no derivation, proof sketch, or theorem reference; this guarantee is load-bearing for all subsequent claims including the O(K) procedure and plug-in variant.
- [Abstract (KKT allocation paragraph)] Abstract (paragraph on KKT allocation): the claim that the equalized-marginal KKT condition yields the global power optimum (and that no other rule in the block-separable class can exceed it) assumes the Lagrangian admits a unique global solution, but provides no argument that the power objective is strictly concave in the per-block thresholds or that the strong-FWER constraint set is convex in the allocation variables; without this, the bisection solver may locate only a local stationary point or fail to exist for some p-value distributions.
- [Simulations across independent, equicorrelated, sparse, and mis-specified regimes] Simulations section: the reported 1.4-1.7× power gains give no detail on how FWER was calibrated for each baseline or whether post-hoc tuning occurred, undermining the ability to attribute gains specifically to the KKT optimality rather than calibration differences.
minor comments (2)
- [Abstract] The O(K) vs. O(K²) complexity comparison with general closed testing is stated without an explicit algorithmic complexity breakdown or pseudocode for the bisection solver.
- [Abstract] Notation for the plug-in bound O(B_T E||g - ĝ||_∞) is introduced without defining B_T or the precise form of the per-hypothesis power deficit.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the presentation of our finite-sample guarantees and simulation details. We respond to each major comment below and indicate planned revisions.
- Referee: [Abstract] Abstract: finite-sample strong-FWER validity is asserted with no derivation, proof sketch, or theorem reference; this guarantee is load-bearing for all subsequent claims including the O(K) procedure and plug-in variant.
  Authors: We agree that the abstract should explicitly reference the supporting result. The finite-sample strong-FWER validity is established in Theorem 1 (Section 3), which derives the O(K) procedure from the block-wise Sidak bound without independence assumptions. In the revision we will insert a parenthetical reference to Theorem 1 immediately after the validity claim in the abstract.
  revision: yes
- Referee: [Abstract (KKT allocation paragraph)] Abstract (paragraph on KKT allocation): the claim that the equalized-marginal KKT condition yields the global power optimum (and that no other rule in the block-separable class can exceed it) assumes the Lagrangian admits a unique global solution, but provides no argument that the power objective is strictly concave in the per-block thresholds or that the strong-FWER constraint set is convex in the allocation variables; without this, the bisection solver may locate only a local stationary point or fail to exist for some p-value distributions.
  Authors: The referee correctly notes that the abstract does not spell out the concavity/convexity argument. Within the block-separable class the equalized-marginal condition is obtained directly from the KKT stationarity requirement on the separable Lagrangian; uniqueness follows from the strict monotonicity of the marginal power functions under the maintained regularity conditions on g. The bisection solver is guaranteed to locate the unique root because the left-hand side of the equalized-marginal equation is strictly decreasing. We will add a one-sentence clarification of this monotonicity in the abstract and a short paragraph in Section 4 referencing the relevant properties of the power objective.
  revision: partial
- Referee: [Simulations across independent, equicorrelated, sparse, and mis-specified regimes] Simulations section: the reported 1.4-1.7× power gains give no detail on how FWER was calibrated for each baseline or whether post-hoc tuning occurred, undermining the ability to attribute gains specifically to the KKT optimality rather than calibration differences.
  Authors: We agree that additional calibration details are needed. In the revised simulations section we will report, for each baseline, the exact nominal level at which it was run, the method used to enforce exact FWER control (e.g., closed-testing or resampling), and confirmation that no post-hoc adjustment was applied. This will make clear that the observed power advantage is attributable to the KKT allocation rather than differential calibration.
  revision: yes
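The kind of calibration report discussed in the last exchange can be sketched as a Monte Carlo estimate of realized FWER under the complete null. The example below is an assumption-laden illustration, not the paper's simulation code: it uses a plain Bonferroni baseline, one-sided Gaussian test statistics, and a single-factor equicorrelation model with correlation `rho`; all function names are hypothetical.

```python
import math
import random


def null_pvalues(n_hyp, rho, rng):
    """One-sided p-values for equicorrelated standard normal nulls (rho=0 is independence)."""
    shared = rng.gauss(0.0, 1.0)  # latent factor common to all hypotheses
    pvals = []
    for _ in range(n_hyp):
        x = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        pvals.append(0.5 * math.erfc(x / math.sqrt(2.0)))  # P(Z > x)
    return pvals


def estimate_fwer(n_hyp, alpha, rho, n_rep, seed=0):
    """Fraction of replications in which Bonferroni makes at least one false rejection."""
    rng = random.Random(seed)
    threshold = alpha / n_hyp
    hits = 0
    for _ in range(n_rep):
        if min(null_pvalues(n_hyp, rho, rng)) <= threshold:
            hits += 1
    return hits / n_rep


if __name__ == "__main__":
    for rho in (0.0, 0.4, 0.8):
        print(rho, estimate_fwer(n_hyp=30, alpha=0.05, rho=rho, n_rep=10000, seed=1))
```

Reporting such an estimate, with its Monte Carlo standard error, for every baseline at the same nominal level is one way to show that power differences are not produced by one method spending more of the error budget than another.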
Circularity Check
No circularity; optimality derived directly from KKT conditions on stated optimization problem
The paper formulates power maximization under strong-FWER constraints as an explicit optimization problem, derives the equalized-marginal KKT stationarity condition from the Lagrangian, and solves it via bisection. This is a standard first-principles derivation from the defined objective and constraints, not a fit to data, self-referential definition, or load-bearing self-citation. Finite-sample validity and simulation power gains are established independently. No quoted steps reduce by construction to inputs or prior author results; the global-optimality question is a convexity/correctness issue outside circularity analysis.
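The bisection step mentioned in this rationale can be illustrated on a toy problem. The sketch below is not the paper's solver: it assumes a hypothetical power family pi_b(a) = a ** gamma_b with 0 < gamma_b < 1 (strictly concave, so marginal power is strictly decreasing) and an additive error budget across blocks. It bisects on the common marginal value `lam` until the block levels add up to alpha.

```python
import math


def block_level(lam, gamma):
    """Solve gamma * a ** (gamma - 1) = lam for a, capped at 1 (done in log space)."""
    log_a = math.log(lam / gamma) / (gamma - 1.0)
    return 1.0 if log_a >= 0.0 else math.exp(log_a)


def spent_budget(lam, gammas):
    """Total error budget used when every block has marginal power lam."""
    return sum(block_level(lam, g) for g in gammas)


def solve_equalized_marginal(gammas, alpha, rel_tol=1e-12):
    """Bisect on lam; spent_budget is decreasing in lam, so the root is unique."""
    lo, hi = 1e-12, 1.0
    while spent_budget(hi, gammas) > alpha:  # grow hi until the budget is underspent
        hi *= 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if spent_budget(mid, gammas) > alpha:
            lo = mid  # overspending: the common marginal must be larger
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, [block_level(lam, g) for g in gammas]


if __name__ == "__main__":
    lam, levels = solve_equalized_marginal([0.3, 0.5, 0.7], alpha=0.05)
    print(lam, levels, sum(levels))
```

Each bisection step costs one pass over the blocks, which matches the O(B log(1/ε)) shape quoted in the abstract. The guarantee of a unique root here comes entirely from the assumed concavity of the toy power family.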
Axiom & Free-Parameter Ledger
free parameters (1)
- block allocation parameters
axioms (2)
- domain assumption: Hypotheses are partitioned into blocks of size three by the experimental design
- domain assumption: Strong FWER is the relevant error criterion for confirmatory claims