Anytime-valid simultaneous lower confidence bounds for the true discovery proportion
Pith reviewed 2026-05-19 13:52 UTC · model grok-4.3
The pith
Combining closed testing with safe anytime-valid inference yields lower bounds on the true discovery proportion that hold at every time point and for every subset of hypotheses.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By merging the closed testing framework with safe anytime-valid inference, the authors construct lower confidence bounds for the true discovery proportion. These bounds remain valid at every observation time point and are simultaneous across all subsets of hypotheses. The underlying hypotheses stay fixed, but the subsets of interest can be chosen or changed at any time. The construction permits sequential updating of the bounds and optional stopping without loss of validity.
What carries the argument
Integration of closed testing with safe anytime-valid inference to produce simultaneous, time-uniform lower bounds on the true discovery proportion.
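To make the mechanism concrete, the following is a minimal Python sketch of a closed-testing lower bound computed from e-values at a single look. It is not the authors' implementation: the function name tdp_lower_bound, the example e-values, and the rule of rejecting an intersection hypothesis when the arithmetic mean of its e-values reaches 1/alpha are all assumptions made for illustration. The enumeration is exhaustive, so it is only usable for a handful of hypotheses.

```python
from itertools import combinations


def tdp_lower_bound(e_values, subset, alpha=0.05):
    """Closed-testing lower bound on the true discovery proportion of `subset`.

    Assumption (not taken from the paper): the intersection hypothesis H_J is
    rejected locally when the arithmetic mean of its e-values is >= 1/alpha.
    """
    m = len(e_values)
    threshold = 1.0 / alpha
    subset = set(subset)
    max_true_nulls = 0  # largest |J & subset| over all index sets J that survive
    for size in range(1, m + 1):
        for J in combinations(range(m), size):
            mean_e = sum(e_values[j] for j in J) / size
            if mean_e < threshold:  # H_J is not rejected, so J may be all true nulls
                max_true_nulls = max(max_true_nulls, len(subset & set(J)))
    return (len(subset) - max_true_nulls) / len(subset)


# Hypothetical e-values for four hypotheses (indices 0..3) at some interim look.
e = [400.0, 30.0, 25.0, 0.5]
print(tdp_lower_bound(e, [0]))           # 1.0
print(tdp_lower_bound(e, [1, 2]))        # 0.0
print(tdp_lower_bound(e, [0, 1, 2, 3]))  # 0.25
```

In this made-up example hypotheses 1 and 2 each have an e-value above 1/alpha = 20, yet the bound for the subset {1, 2} is 0, because the intersection of hypotheses 1, 2 and 3 is not rejected (its mean e-value is 18.5). Evaluating the same function on e-processes at the current time gives a bound that can be recomputed at every look.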
If this is right
- The bounds can be recomputed after every new observation while retaining the nominal coverage guarantee.
- Data collection may stop at any time chosen by the analyst without invalidating the results.
- Any subset of hypotheses can be examined after seeing the data and still receive a valid lower bound.
- The computational shortcut keeps the method feasible when the number of hypotheses is large.
Where Pith is reading between the lines
- The approach could support adaptive experiments that decide whether to continue based on interim lower bounds.
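- Because the false discovery proportion of a subset is one minus its true discovery proportion, the lower bounds translate directly into simultaneous upper bounds on false discovery proportions; extensions to other sequential multiple-testing metrics may also be possible.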
- The same structure might apply to online decision problems where new tests arrive over time.
Load-bearing premise
The hypotheses under test are fixed in advance and do not change as new data arrive over time.
What would settle it
Simulate data from a known mixture of true and false null hypotheses, apply the procedure sequentially, stop at an arbitrary data-dependent time, and check whether the reported lower bound for any subset exceeds the actual true discovery proportion more often than the nominal error rate allows.
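A check of this kind could be sketched as follows. Everything specific here is an assumption chosen for illustration rather than the paper's simulation design: Gaussian data with unit variance, likelihood-ratio e-processes against a fixed alternative delta, intersection tests based on the mean e-value, the stopping rule, and the names tdp_lower_bound, run_once and failure_rate. The bound is computed by sorting, which gives the same answer as enumerating all intersections under the mean-e-value rule.

```python
import math
import random


def tdp_lower_bound(e_values, subset, alpha):
    """TDP lower bound for `subset`, assuming H_J is rejected when mean(e_J) >= 1/alpha."""
    c = 1.0 / alpha
    chosen = set(subset)
    inside = sorted(e_values[i] - c for i in chosen)
    # Indices outside the subset can only help a surviving J if they lower its mean.
    partial = sum(min(e - c, 0.0) for i, e in enumerate(e_values) if i not in chosen)
    max_true_nulls = 0
    for k, gap in enumerate(inside, start=1):
        partial += gap
        if partial < 0:  # some J with k members in the subset is not rejected
            max_true_nulls = k
    return (len(chosen) - max_true_nulls) / len(chosen)


def run_once(rng, alpha, n_max=100, delta=0.5):
    """Return True if any reported bound exceeds the true TDP at the stopping time."""
    means = [0.8, 0.8, 0.8, 0.0, 0.0, 0.0]  # hypothetical truth: 3 false, 3 true nulls
    m = len(means)
    log_e = [0.0] * m
    for _ in range(n_max):
        for i in range(m):
            x = rng.gauss(means[i], 1.0)
            log_e[i] += delta * x - 0.5 * delta ** 2  # log likelihood ratio, N(delta,1) vs N(0,1)
        e = [math.exp(v) for v in log_e]
        # Data-dependent stopping: quit once half of all hypotheses are claimed as discoveries.
        if tdp_lower_bound(e, range(m), alpha) >= 0.5:
            break
    # Subsets picked after looking at the data: the k largest e-values, for every k.
    order = sorted(range(m), key=lambda i: -e[i])
    for k in range(1, m + 1):
        subset = order[:k]
        true_tdp = sum(means[i] != 0.0 for i in subset) / k
        if tdp_lower_bound(e, subset, alpha) > true_tdp + 1e-12:
            return True
    return False


def failure_rate(reps=500, alpha=0.05, seed=1):
    rng = random.Random(seed)
    return sum(run_once(rng, alpha) for _ in range(reps)) / reps


print(failure_rate())
```

If the procedure is anytime-valid, the printed failure rate should not be materially above alpha, even though both the stopping time and the examined subsets depend on the data. A rate clearly above alpha would count as evidence against the claim under these simulated conditions.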
Original abstract
We propose a method that combines the closed testing framework with the concept of safe anytime-valid inference (SAVI) to compute lower confidence bounds for the true discovery proportion in a multiple testing setting. The proposed procedure provides confidence bounds that are valid at every observation time point and that are simultaneous for all possible subsets of hypotheses. While the hypotheses are assumed to be fixed over time, the subsets of interest may vary. Anytime-valid simultaneous confidence bounds allow us to sequentially update the bounds over time and allow for optional stopping. This is a desirable property in practical applications such as neuroscience, where data acquisition is costly and time-consuming. We also present a computational shortcut which makes the application of the proposed procedure feasible when the number of hypotheses under consideration is large. We illustrate the performance of the proposed method in a simulation study and give some practical guidelines on the implementation of the proposed procedure.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a procedure that combines the closed testing framework with safe anytime-valid inference (SAVI) to construct lower confidence bounds for the true discovery proportion (TDP). These bounds are valid at every observation time point and hold simultaneously for all possible subsets of hypotheses. Hypotheses remain fixed while subsets of interest may vary over time, enabling sequential updating and optional stopping. A computational shortcut is introduced to make the method feasible for large numbers of hypotheses, with performance illustrated through a simulation study and accompanied by practical implementation guidelines.
Significance. If the validity arguments hold, the work provides a useful advance for sequential multiple testing by delivering anytime-valid and simultaneous TDP bounds that support flexible, post-hoc subset analysis without pre-specifying stopping times. The synthesis of closed testing (for simultaneity) and SAVI (for time-uniform validity) directly addresses needs in costly data-acquisition settings such as neuroscience. The computational shortcut and simulation evidence further support practical deployment, strengthening the contribution to adaptive inference methodology.
major comments (1)
- [§4.2] §4.2 (computational shortcut): the claim that the shortcut preserves exact simultaneous validity under closed testing requires an explicit lemma or argument showing that the reduced enumeration does not introduce any gap in coverage for arbitrary subsets; without this, the feasibility claim for large p rests on an unverified preservation property.
minor comments (3)
- [§5] The simulation study would benefit from explicit reporting of coverage at multiple stopping times to illustrate the anytime-valid property beyond fixed-sample results.
- [Throughout] Notation for the TDP estimator and its bounds could be unified across sections to avoid minor inconsistencies in subscript usage between the main text and the appendix.
- [§1] A brief comparison table contrasting the proposed bounds with existing sequential TDP methods (e.g., those based on martingale or e-value approaches) would improve context in the introduction.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and the constructive major comment. We address it directly below.
- Referee: [§4.2] §4.2 (computational shortcut): the claim that the shortcut preserves exact simultaneous validity under closed testing requires an explicit lemma or argument showing that the reduced enumeration does not introduce any gap in coverage for arbitrary subsets; without this, the feasibility claim for large p rests on an unverified preservation property.
Authors: We agree that the presentation of the computational shortcut in §4.2 would benefit from an explicit argument establishing that exact simultaneous validity is preserved. In the revised manuscript we will add a lemma (Lemma 4.1) showing that the reduced enumeration of intersection hypotheses still guarantees the required coverage for every subset. The lemma verifies that, for any subset, the closed-testing p-value computed via the shortcut is never smaller than the p-value obtained from the full enumeration, so every rejection made through the shortcut is also made by the full procedure and the simultaneous validity of the underlying closed-testing and SAVI construction carries over. This addition removes the unverified step while leaving the reduction in computational complexity intact.
Revision: yes
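As an illustration of the kind of argument the referee asks for, here is a sketch for one particular local test. It assumes, purely for illustration, that each intersection hypothesis H_J is rejected when the arithmetic mean of its e-values reaches c = 1/alpha; this page does not say which local test or shortcut the paper actually uses, and the symbols e_j, c, t(S) and d(S) are notation introduced here. Write e^S_(1) <= e^S_(2) <= ... for the sorted e-values of the hypotheses in S, and take the maximum of an empty set to be 0.

```latex
% Assumption: H_J is rejected locally iff (1/|J|) \sum_{j \in J} e_j \ge c = 1/\alpha,
% i.e. H_J survives iff \sum_{j \in J} (e_j - c) < 0.
\begin{align*}
t(S) &= \max\Bigl\{\, |J \cap S| \;:\; \emptyset \neq J \subseteq \{1,\dots,m\},\
        \textstyle\sum_{j \in J} (e_j - c) < 0 \Bigr\},
        \qquad d(S) = |S| - t(S), \\[4pt]
\min_{J \,:\, |J \cap S| = k} \ \sum_{j \in J} (e_j - c)
     &= \sum_{i=1}^{k} \bigl(e^{S}_{(i)} - c\bigr)
        + \sum_{j \notin S} \min(e_j - c,\ 0), \\[4pt]
t(S) &= \max\Bigl\{\, k \in \{1,\dots,|S|\} \;:\;
        \sum_{i=1}^{k} \bigl(e^{S}_{(i)} - c\bigr)
        + \sum_{j \notin S} \min(e_j - c,\ 0) < 0 \Bigr\}.
\end{align*}
```

The first line is the closed-testing bound written as a search over all 2^m - 1 index sets: t(S) is the largest number of true nulls in S that the data cannot rule out, and d(S) is the resulting lower bound on the number of true discoveries. The second line holds because, for a fixed k, the sum is smallest when J takes the k smallest e-values inside S and every hypothesis outside S whose e-value is below c. The third line therefore equals the first rather than merely bounding it, so under this assumed local test the reduction to sorted partial sums leaves the coverage guarantee untouched while the cost drops to a sort plus a linear scan.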
Circularity Check
Derivation combines established external frameworks without circular reduction
The paper integrates the closed testing framework (for simultaneous bounds over all subsets) with safe anytime-valid inference (SAVI) to obtain lower confidence bounds on the true discovery proportion that remain valid at every time point and under optional stopping. The abstract and construction explicitly treat hypotheses as fixed while allowing subsets to vary, and the computational shortcut is presented as preserving exact validity without introducing new assumptions on dependence or uniformity. No quoted equations or steps reduce a claimed result to a fitted parameter renamed as a prediction, a self-definition, or a load-bearing self-citation chain; the central claims rest on independent prior literature rather than tautological re-expression of the paper's own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Hypotheses are fixed over time, while subsets of interest may vary.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: We propose a method that combines the closed testing framework with the concept of safe anytime-valid inference (SAVI) to compute lower confidence bounds for the true discovery proportion
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: An e-process corresponding to H_I is a nonnegative process (E^[n]_I) adapted to some filtration with E_P[E^[ν]_I] ≤ 1 for any F-stopping time ν
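The e-process definition quoted in the last passage can be illustrated with a small simulation. The sketch below is an assumption-laden example, not code from the paper: it uses a single null hypothesis of standard normal data, a likelihood-ratio process against a fixed alternative mean delta, and the hypothetical function name crossing_frequency. Under the null this process is a nonnegative martingale starting at 1, so its expectation at any stopping time is at most 1 and, by Ville's inequality, it reaches 1/alpha with probability at most alpha.

```python
import math
import random


def crossing_frequency(alpha=0.05, delta=0.5, n_max=200, reps=2000, seed=0):
    """Fraction of null sample paths whose e-process reaches 1/alpha within n_max steps."""
    rng = random.Random(seed)
    log_threshold = math.log(1.0 / alpha)
    crossings = 0
    for _ in range(reps):
        log_e = 0.0  # the e-process starts at exp(0) = 1
        for _ in range(n_max):
            x = rng.gauss(0.0, 1.0)                # data generated under the null
            log_e += delta * x - 0.5 * delta ** 2  # log likelihood ratio, N(delta,1) vs N(0,1)
            if log_e >= log_threshold:             # stop at the first crossing
                crossings += 1
                break
    return crossings / reps


print(crossing_frequency())
```

The printed frequency is an estimate of the probability of ever rejecting a true null when the analyst stops at the first crossing, which is the most aggressive stopping rule available. It should come out no larger than alpha up to simulation noise.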
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.