arxiv: 2604.11812 · v2 · submitted 2026-04-10 · 🧮 math.ST · stat.ME · stat.TH

Confidence envelopes for the false discoveries with heterogeneous data

Pith reviewed 2026-05-10 17:28 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH
keywords confidence envelopes · false discoveries · selective inference · heterogeneous data · discrete p-values · Bretagnolle inequality · Simes inequality · multiple testing

The pith

New confidence envelopes tighten bounds on false discoveries for heterogeneous discrete p-values

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to construct confidence envelopes for the number of false discoveries in any chosen subset of hypotheses, with guarantees that hold even when the data are heterogeneous. Prior approaches assumed uniform p-value distributions under the null, which can lead to overly large estimates of false discoveries and thus reduced power when p-values are actually discrete and vary across tests. By introducing tools such as the Bretagnolle inequality and a new variant of the Simes inequality, the authors bridge the homogeneous constructions to this more general setting and derive several new envelope procedures. These are shown through simulations to offer improvements over the homogeneous versions in terms of bound tightness.

Core claim

We bridge the previous constructions under the homogeneous case with new tools. We also apply these tools to propose several confidence envelopes based on tools tailored for heterogeneous data, like the Bretagnolle inequality, or a new variant of the Simes inequality. We compare these new envelopes to their homogeneous counterparts on simulated data.

What carries the argument

The key machinery consists of confidence envelope constructions adapted from local test families, paths, or interpolation, now using the Bretagnolle inequality and a new variant of the Simes inequality to accommodate heterogeneous discrete distributions of p-values under the null.

Load-bearing premise

The new inequalities tailored for heterogeneous data, including Bretagnolle and the modified Simes variant, yield valid bounds that are tighter than those from homogeneous assumptions when p-values have discrete heterogeneous distributions.

What would settle it

A controlled experiment generating p-values from known heterogeneous discrete null distributions, checking whether the actual false discovery counts exceed the envelope bounds more often than the nominal error rate allows.
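
A minimal sketch of such an experiment follows, under assumptions that are ours rather than the paper's: independent two-sided binomial tests of parameter 1/2 with 5, 15 or 30 trials, a hypothetical alternative parameter of 0.9, and the classical homogeneous Simes-based bound standing in for the envelope (the paper's heterogeneous envelopes are not reproduced here). All function names are illustrative.

```python
import random
from math import comb


def two_sided_pvalue_table(n):
    """p-value for each outcome k = 0..n of a Binomial(n, 1/2) null:
    total null probability of the outcomes that are no more likely than k."""
    pmf = [comb(n, k) / 2**n for k in range(n + 1)]
    return [min(1.0, sum(q for q in pmf if q <= pmf[k] * (1 + 1e-12)))
            for k in range(n + 1)]


def simes_bound(pvals, selected, alpha):
    """Homogeneous Simes-based upper bound on the false discoveries in `selected`:
    min over k of #{i in selected : p_i > alpha*k/m} + k - 1, capped at |selected|."""
    m = len(pvals)
    sel = [pvals[i] for i in selected]
    best = len(sel)
    for k in range(1, m + 1):
        threshold = alpha * k / m
        best = min(best, sum(p > threshold for p in sel) + k - 1)
    return best


def violation_rate(m=40, m0=30, alpha=0.1, reps=2000, seed=0):
    """Fraction of replications in which the true number of false discoveries
    in a data-driven selection exceeds the bound."""
    rng = random.Random(seed)
    trials = [(5, 15, 30)[i % 3] for i in range(m)]   # heterogeneous sample sizes
    tables = {n: two_sided_pvalue_table(n) for n in (5, 15, 30)}
    violations = 0
    for _ in range(reps):
        pvals = []
        for i, n in enumerate(trials):
            theta = 0.5 if i < m0 else 0.9            # first m0 hypotheses are true nulls
            successes = sum(rng.random() < theta for _ in range(n))
            pvals.append(tables[n][successes])
        selected = [i for i, p in enumerate(pvals) if p <= 0.05]
        false_discoveries = sum(1 for i in selected if i < m0)
        if false_discoveries > simes_bound(pvals, selected, alpha):
            violations += 1
    return violations / reps
```

Calling `violation_rate()` returns the empirical frequency with which the false discovery count in the selected set exceeds the bound; an envelope with valid coverage should keep this at or below `alpha`, up to Monte Carlo error. Checking a single data-driven selection is only a partial check, since the guarantee is simultaneous over all subsets.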

Figures

Figures reproduced from arXiv: 2604.11812 by Etienne Roquain, Gilles Blanchard, Guillermo Durand, Romain Périer, Sebastian Döhler (affiliations listed in the source: LPSM (UMR_8001), LMO, CELESTE, DATASHAPE).

Figure 1: Graphs of the cdf of U([0, 1]) (black) and the cdf of the p-value under the null testing if a binomial with 5, 15 or 30 trials is of parameter 1/2 or not (other colors).
Figure 2: Upper bounds ζ(·) provided by the DKW and Bretagnolle inequalities, with their respective adaptive bounds. We set π0 = 0.2 and q = 0.4. Lower is better.
Figure 3: Lower bounds on number of true positives in …
Figure 4: Upper bounds ζ(·) provided by Simes and heterogeneous Simes inequalities, with their respective adaptive bounds. We set π0 = 0.2, π′0 = 0.5 and q = 0.4. Lower is better.
Figure 5: Lower bounds on number of true positives in …
Figure 6: Upper bounds ζ(·) provided by adaptive bounds from Meah et al. (2024), Katsevich and Ramdas (2020) and our adaptive heterogeneous bounds. We set π0 = 0.2, π′0 = 0.5 and q = 0.4. Lower is better.
Figure 7: Lower bounds on number of true positives in …
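
The step-function cdfs described in the Figure 1 caption can be computed exactly. The sketch below assumes a particular two-sided binomial p-value (the total null probability of outcomes no more likely than the observed one); the paper may use a different two-sided convention, although under a null parameter of 1/2 the common conventions coincide by symmetry. Function names are illustrative.

```python
from math import comb


def null_pvalue_distribution(n):
    """Return (p-values, null probabilities) for outcomes k = 0..n of a
    two-sided test of H0: theta = 1/2 based on a Binomial(n, theta) count."""
    pmf = [comb(n, k) / 2**n for k in range(n + 1)]
    pvals = [min(1.0, sum(q for q in pmf if q <= pmf[k] * (1 + 1e-12)))
             for k in range(n + 1)]
    return pvals, pmf


def null_cdf(n, t):
    """F_n(t) = P(p-value <= t) under the null; a step function of t."""
    pvals, pmf = null_pvalue_distribution(n)
    return sum(q for p, q in zip(pvals, pmf) if p <= t + 1e-12)


if __name__ == "__main__":
    for n in (5, 15, 30):
        print(n, [round(null_cdf(n, t), 4) for t in (0.01, 0.05, 0.1, 0.5)])
```

With 5 trials the smallest attainable p-value is 2/32 = 0.0625, so `null_cdf(5, 0.05)` is 0 while the uniform cdf equals 0.05 at the same point. This gap below the diagonal is the conservativeness that heterogeneous methods try to recover, and it differs across the three sample sizes.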
Original abstract

In the context of selective inference, confidence envelopes for the false discoveries allow the user to select any subset of null hypotheses while having a statistical guarantee on the number of false discoveries in the selected set. Many constructions of such envelopes have been proposed recently, using local test families (Genovese and Wasserman, 2006; Goeman and Solari, 2011), paths (Katsevich and Ramdas, 2020) or interpolation (Blanchard et al., 2020a). All those methods have in common that they have been well-studied for the homogeneous case where all p-values under the null have a uniform distribution over [0, 1]. However, in many applications the data are heterogeneous and discrete, hence the p-values have heterogeneous, discrete distributions, and the previous constructions may incur a loss of power, in the sense that they over-estimate the number of false discoveries. In this paper, we bridge the previous constructions under the homogeneous case with new tools. We also apply these tools to propose several confidence envelopes based on tools tailored for heterogeneous data, like the Bretagnolle inequality, or a new variant of the Simes inequality. We compare these new envelopes to their homogeneous counterparts on simulated data.
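
For orientation, the guarantee described in the abstract can be written out as below. The notation ($m$ hypotheses, $\mathcal{H}_0$ the set of true nulls, $\overline{V}$ the envelope, $\alpha$ the error level) is ours and may differ from the paper's. The second display is the well-known Simes-based bound for the homogeneous case, valid when the Simes inequality holds for the null p-values (for instance under independence); it is given only as a reference point and is not one of the paper's new heterogeneous envelopes.

```latex
\[
  \mathbb{P}\Bigl( \forall S \subseteq \{1,\dots,m\} :\;
     |S \cap \mathcal{H}_0| \le \overline{V}(S) \Bigr) \;\ge\; 1 - \alpha ,
\]
\[
  \overline{V}_{\mathrm{Simes}}(S)
  \;=\; \min_{1 \le k \le m}
     \Bigl( \#\{\, i \in S : p_i > \alpha k / m \,\} + k - 1 \Bigr) \wedge |S| .
\]
```

Because the first statement holds simultaneously over all subsets $S$, the user may choose $S$ after looking at the data. With discrete null distributions $F_i$ satisfying $F_i(t) \le t$, the thresholds $\alpha k/m$ are typically not attained, which is the source of the over-estimation the abstract refers to.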

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper extends confidence envelopes for the number of false discoveries in selective inference to the setting of heterogeneous discrete p-values. It bridges existing constructions (local tests, paths, interpolation) developed for the homogeneous uniform case and introduces new envelopes based on the Bretagnolle inequality and a modified Simes inequality that are claimed to remain valid while being less conservative under heterogeneity. The new methods are compared to their homogeneous counterparts via simulation.

Significance. If the validity claims hold, the work would be useful for applications with discrete or heterogeneous data (e.g., genomics, clinical trials) by reducing over-estimation of false discoveries and thereby increasing power. The bridging of prior homogeneous methods with new tailored inequalities is a natural and potentially valuable contribution, provided the theoretical guarantees and simulation evidence are rigorous.

major comments (2)
  1. [§4] §4 (Bretagnolle-based envelope): the proof that the envelope remains valid under heterogeneous discrete null distributions relies on an application of Bretagnolle's inequality that is only sketched; the exact form of the bound and the handling of the discrete support points must be stated explicitly to confirm that the coverage guarantee is not lost.
  2. [§5] §5 (modified Simes inequality): the new variant is presented as controlling the false-discovery envelope, but the manuscript does not provide a self-contained proof or a clear statement of the additional assumptions (e.g., independence or positive dependence) required for the discrete heterogeneous case; without this, it is unclear whether the claimed improvement over the classical Simes envelope is guaranteed.
minor comments (2)
  1. [Simulation study] The simulation section would benefit from reporting the exact discrete distributions used for the null p-values and the number of Monte Carlo replications, to allow readers to assess variability of the reported power gains.
  2. [Introduction / §2] Notation for the heterogeneous p-value distributions (e.g., the family of CDFs F_i) should be introduced earlier and used consistently when stating the new inequalities.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive suggestions. The comments highlight areas where additional detail will improve the clarity and rigor of the theoretical arguments. We address each major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [§4] §4 (Bretagnolle-based envelope): the proof that the envelope remains valid under heterogeneous discrete null distributions relies on an application of Bretagnolle's inequality that is only sketched; the exact form of the bound and the handling of the discrete support points must be stated explicitly to confirm that the coverage guarantee is not lost.

    Authors: We agree that the sketch in §4 should be expanded for full transparency. In the revised version we will state the precise form of Bretagnolle's inequality employed, derive the envelope bound step by step, and explicitly describe how the finite discrete support of each heterogeneous null p-value distribution is incorporated into the argument. This will make the coverage guarantee under heterogeneity fully rigorous and verifiable. revision: yes

  2. Referee: [§5] §5 (modified Simes inequality): the new variant is presented as controlling the false-discovery envelope, but the manuscript does not provide a self-contained proof or a clear statement of the additional assumptions (e.g., independence or positive dependence) required for the discrete heterogeneous case; without this, it is unclear whether the claimed improvement over the classical Simes envelope is guaranteed.

    Authors: We acknowledge that the presentation of the modified Simes inequality in §5 lacks a self-contained proof and an explicit list of assumptions. In the revision we will supply a complete proof of the new variant, state the precise conditions (including any independence or positive dependence requirements adapted to the discrete heterogeneous setting), and show how these conditions guarantee envelope control while yielding the reported improvement over the classical homogeneous Simes construction. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper extends homogeneous-case confidence envelopes (citing Genovese-Wasserman, Goeman-Solari, Katsevich-Ramdas, and Blanchard et al. 2020a) to heterogeneous discrete p-values by introducing bridging tools and new constructions based on the Bretagnolle inequality and a modified Simes inequality. These new inequalities are external to the paper and not derived from its own fitted parameters or self-referential definitions. The self-citation to prior Blanchard work supports only the homogeneous baseline and does not carry the load of the heterogeneous extensions or validity claims. Simulations are presented as comparative evidence rather than the sole justification for the results. No step reduces by construction to its inputs, and the central claims remain independently verifiable via the cited external inequalities.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard multiple-testing assumptions about p-value distributions under the null, extended to heterogeneous discrete cases via known inequalities; no free parameters, invented entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)
  • domain assumption: P-values under the null hypothesis follow known but heterogeneous discrete distributions.
    This is the core setting stated in the abstract, for which prior methods lose power and new envelopes are proposed.

