
arxiv: 2605.22004 · v1 · pith:QTDTAEJ2new · submitted 2026-05-21 · 📊 stat.ME

Selecting Informative Conformal Prediction Sets with an Optimized FCR-Controlled Approach

Pith reviewed 2026-05-22 04:26 UTC · model grok-4.3

classification 📊 stat.ME
keywords conformal prediction · false coverage rate · selective inference · oracle policy · calibration procedure · prediction sets · power optimization · informative selection

The pith

A calibrated oracle-guided policy for informative conformal prediction sets achieves higher power while controlling the false coverage rate.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a method to select informative conformal prediction sets while controlling the false coverage rate on the selected cases. It derives the optimal decision rule in an oracle setting where membership probabilities are known and then calibrates it for use with estimated probabilities to ensure finite-sample validity. A sympathetic reader would care because selective inference on informative sets is common in practice, yet unadjusted methods can suffer from inflated error rates due to selection bias. The calibrated approach is shown to deliver substantially more power than existing alternatives in both simulations and real data for classification problems.

Core claim

In the oracle setting with known probabilities, an optimal policy selects informative prediction sets to maximize power subject to FCR control; a calibration step then adjusts this policy using estimated probabilities to preserve finite-sample FCR control, resulting in higher power than prior methods.

What carries the argument

The oracle-guided optimal decision policy with subsequent calibration to estimated probabilities for finite-sample FCR control.

If this is right

  • The approach maintains valid FCR control on the selected informative cases.
  • It attains substantially higher power than available alternatives.
  • It applies effectively to classification outcomes on real and simulated data.
  • The calibration ensures control even when only estimated probabilities are used.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • This framework could be extended to regression tasks or other outcome types beyond classification.
  • Integrating it with adaptive selection criteria might further enhance efficiency in selective inference.
  • Testing the method in high-stakes applications like medical diagnostics could reveal practical benefits.

Load-bearing premise

The calibration procedure adjusts the oracle policy so that finite-sample FCR control is maintained when only estimated probabilities are available.

What would settle it

A simulation in which the empirical false coverage rate on the selected cases exceeds the nominal level after applying the calibrated procedure would disprove the finite-sample FCR control.
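
The check described above can be sketched as follows. This is a minimal illustration, not the paper's simulation code: the helper names `false_coverage_proportion` and `empirical_fcr` are hypothetical, the toy data are made up, and the sketch assumes the usual convention that the false coverage proportion (FCP) is zero when nothing is selected.

```python
from typing import Optional, Sequence, Set


def false_coverage_proportion(
    pred_sets: Sequence[Optional[Set[int]]], labels: Sequence[int]
) -> float:
    """FCP for one replication.

    pred_sets[i] is the reported prediction set for test point i, or None
    when point i was not selected. Assumed convention: FCP = 0 if nothing
    is selected.
    """
    selected = [(s, y) for s, y in zip(pred_sets, labels) if s is not None]
    if not selected:
        return 0.0
    misses = sum(1 for s, y in selected if y not in s)
    return misses / len(selected)


def empirical_fcr(replications) -> float:
    """Average FCP over replications; each item is (pred_sets, labels)."""
    fcps = [false_coverage_proportion(s, y) for s, y in replications]
    return sum(fcps) / len(fcps)


# Toy data (made up for illustration): two replications, three test points each.
reps = [
    ([{0}, None, {1, 2}], [0, 1, 0]),  # 2 selected, 1 miss -> FCP = 0.5
    ([None, None, None], [2, 2, 2]),   # nothing selected   -> FCP = 0.0
]
print(empirical_fcr(reps))  # 0.25
```

Evidence against finite-sample control would be an average FCP that exceeds the nominal level α by more than Monte Carlo error over many replications.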

Figures

Figures reproduced from arXiv: 2605.22004 by Etienne Roquain, Israela Solomon, Ruth Heller, Saharon Rosset.

Figure 1: An example for K = 3, where I is all sets with 1 or 2 classes, w(C) = 1/|C|, and α = 0.1. In the left panel, the class probabilities are 0.5, 0.3, and 0.2. The upper envelope Ux(µ) consists of ℓx,{1}(µ) for µ ∈ [0, 1/3] and ℓx,{1,2}(µ) for µ ∈ (1/3, ∞), with a zero crossing at µ = 4. Therefore Cµ(x) is {1} for µ ∈ [0, 1/3] and {1, 2} for µ > 1/3, and Dµ(x) = I{µ ≤ 4}. In the right panel, the class…
Figure 2: A sample of 1,000 iid examples from a K = 4 bivariate normal mixture with mixing weights π = (0.25, 0.25, 0.25, 0.25) in the left panel and π = (0.1, 0.7, 0.1, 0.1) in the right panel. The n = 500 calibration observations are shown as triangles: blue with mean (0, 0), red with mean (2, 0), purple with mean (2, 2) and yellow with mean (0, 2). The m = 500 test observations are shown as green circles, for which …
Figure 3: For the K = 4 bivariate normal mixture data generation described in § 6.1, with π = (0.25, 0.25, 0.25, 0.25) in columns 1 and 2 and π = (0.1, 0.7, 0.1, 0.1) in columns 3 and 4, n = m = 500, we compare two selection goals: non-trivial prediction sets (row 1) and exclusion of one class (row 2). We report the FCR (columns 1 and 3) and resolution-adjusted power (columns 2 and 4) for OGInfoSP and OGInfoSP-calOnly, …
Figure 4: For the K = 4 bivariate normal mixture described in § 6.1: logistic regression was trained on 10,000 samples with class probabilities π = (0.25, 0.25, 0.25, 0.25); the calibration and test examples, n = m = 500 in total, were sampled from the Gaussian mixture with shifted class probabilities π = (0.1, 0.7, 0.1, 0.1). We report the FCR (row 1) and resolution-adjusted power (row 2) for OGInfoSP, OGInfoSP-cal…
Figure 5: For the CIFAR-10 data described in § 6.2, the FCP (column 1) and resolution-adjusted TCP (column 2) in the setting of non-trivial classification (row 1) and exclusion of the cat class (row 2). The sample sizes are m = n = 500, α = 0.05. Based on 1,000 iterations.
Figure 6: For the CIFAR-10 data described in § 6.2: the classifier was trained as detailed in § 6.2 using equal class probabilities, but the calibration and test samples had class probabilities (0.2, 0.6, 0.2). We report the FCP (column 1) and resolution-adjusted TCP (column 2) in the setting of non-trivial classification (row 1) and exclusion of the cat class (row 2) for OGInfoSP, OGInfoSP-calOnly, and their la…
Figure 7: Counterexample showing that Assumption E.1 is insufficient to guarantee the nestedness required by Assumption 3.1. Here, K = 3, I = {{1}, {2}, {3}, {2, 3}}, w(C) = 1/|C|, and α = 0.1. The estimated conditional class probabilities are: 0.4 for class 1, 0.35 for class 2, and 0.25 for class 3. The upper envelope Ux(µ) consists of ℓ̂X,{1}(µ) = 0.4 + µ(0.4 − 0.9) = 0.4 − 0.5µ for µ < 0.5, and ℓ̂X,{2,3}(µ) = (1…
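
The left-panel numbers in the Figure 1 caption can be reproduced with a few lines of exact arithmetic. The sketch below assumes each candidate set C contributes a line of the form w(C)·p(C) + µ·(p(C) − (1 − α)), where p(C) is the total class probability in C. That form is inferred from the expression quoted in the Figure 7 caption, not taken from the paper's definitions. The selection rule "select when the envelope is nonnegative" is likewise an assumption chosen to agree with Dµ(x) = I{µ ≤ 4}, and the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

ALPHA = F(1, 10)
probs = {1: F(1, 2), 2: F(3, 10), 3: F(1, 5)}  # class probabilities, left panel

# Informative family: all sets with 1 or 2 classes (K = 3).
family = [frozenset(c) for r in (1, 2) for c in combinations(probs, r)]


def line(C, mu):
    """Assumed line for set C: w(C) p(C) + mu (p(C) - (1 - alpha)), w(C) = 1/|C|."""
    p = sum(probs[k] for k in C)
    return F(1, len(C)) * p + mu * (p - (1 - ALPHA))


def envelope(mu):
    """Return (maximizing set, envelope value) over the family at this mu."""
    best = max(family, key=lambda C: line(C, mu))
    return best, line(best, mu)


for mu in (F(0), F(1, 4), F(1, 2), F(4), F(5)):
    C, u = envelope(mu)
    # Assumed selection rule: select when the envelope is nonnegative.
    print(f"mu={mu}: argmax={sorted(C)}, U={u}, selected={u >= 0}")
```

Under this assumed form, the maximizer switches from {1} to {1, 2} at µ = 1/3 and the envelope reaches zero at µ = 4, which agrees with the caption.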
Original abstract

Conformal methods provide prediction sets for outcomes with confidence guarantees. We study their use in a selective inference setting, where inference is performed only when the prediction set is informative. The analyst may consider as informative, for example, cases with prediction sets that are sufficiently small, exclude null values, or satisfy other appropriate monotone constraints. Because inference is typically restricted to informative cases in practical applications, accounting for the resulting selection bias is crucial to maintaining false coverage rate (FCR) control. A general framework for constructing such informative conformal prediction sets while controlling the FCR on the selected sample was suggested in Gazin et al. (2025). In this work we focus on oracle-guided procedures. We derive the optimal decision policy under a suitable power objective in the oracle setting where the probability of belonging to each prediction set can be computed. In practice, of course, only estimated probabilities are available. We therefore introduce a calibration procedure that adjusts the oracle policy to maintain finite sample FCR control. We show that this approach can achieve substantially higher power than available alternatives. We demonstrate the effectiveness of our new methods for classification outcomes on both real and simulated data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops an oracle-guided framework for selecting informative conformal prediction sets under a monotone constraint while controlling the false coverage rate (FCR) on the selected sample. It derives an optimal decision policy in the oracle setting with known probabilities, then introduces a calibration adjustment that maintains finite-sample FCR control when only estimated probabilities are available. The method is shown to deliver substantially higher power than existing alternatives on both simulated and real classification data.

Significance. If the central claims hold, the work strengthens selective-inference methodology for conformal prediction by supplying an explicitly optimized policy and a practical calibration step that preserves guarantees. The reported power gains over baselines, together with the focus on monotone selection rules, could make FCR-controlled informative sets more usable in applications where analysts wish to restrict inference to sufficiently small or decisive prediction sets.

major comments (2)
  1. [§3] Oracle policy: the optimization problem that yields the claimed optimal policy is not stated explicitly; without the precise objective function, the monotone constraint, and the proof that the derived threshold rule is optimal, it is difficult to verify that the subsequent calibration inherits the desired power properties.
  2. [Calibration section] The finite-sample FCR guarantee is asserted after replacing oracle probabilities with estimates, yet the argument does not quantify how estimation error propagates into the coverage statement, nor does it say whether the data used to fit the probabilities are also used for the final evaluation; an explicit error bound or a data-splitting argument is needed to close this gap.
minor comments (2)
  1. [Abstract / Introduction] The abstract cites Gazin et al. (2025) as the general framework; the introduction should clarify precisely which results are extended and which are taken as given.
  2. [Experiments] Figure captions and table legends should explicitly define the power metric and the FCR estimator used in the simulations so that readers can reproduce the reported gains.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and will make revisions to improve the clarity and rigor of the presentation.

Point-by-point responses
  1. Referee: [§3] Oracle policy: the optimization problem that yields the claimed optimal policy is not stated explicitly; without the precise objective function, the monotone constraint, and the proof that the derived threshold rule is optimal, it is difficult to verify that the subsequent calibration inherits the desired power properties.

    Authors: We agree that the optimization problem should be stated more explicitly. In the revised manuscript we will present the precise objective of maximizing expected power (defined as the expected fraction of selected instances that receive informative sets) subject to the FCR constraint and the monotone selection constraint. We will also include a short proof that the optimal policy under this formulation is a threshold rule on the oracle probabilities. These additions will make it straightforward to verify that the subsequent calibration step preserves the power properties of the oracle policy.
    revision: yes

  2. Referee: [Calibration section] The finite-sample FCR guarantee is asserted after replacing oracle probabilities with estimates, yet the argument does not quantify how estimation error propagates into the coverage statement, nor does it say whether the data used to fit the probabilities are also used for the final evaluation; an explicit error bound or a data-splitting argument is needed to close this gap.

    Authors: We acknowledge that the current presentation of the finite-sample guarantee could be strengthened with respect to estimation error. In the revision we will add an explicit data-splitting argument: probability estimates are obtained on a training fold, while the conformal calibration and final evaluation are performed on a held-out fold. This separation ensures that the FCR control holds conditionally on the estimates without requiring a quantitative bound on estimation error. We will also include a brief discussion of the procedure’s robustness when splitting is not feasible.
    revision: yes
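
The data-splitting argument in the second response can be pictured as a three-way split of the available indices. The sketch below is illustrative only: the helper name `three_way_split`, the use of a random permutation, and the fold sizes (which echo the sample sizes quoted in the figure captions) are assumptions, not details taken from the paper or the rebuttal.

```python
import numpy as np


def three_way_split(n_total: int, n_train: int, n_cal: int, seed: int = 0):
    """Return disjoint index arrays (train, calibration, test).

    Intended use: fit the probability model on `train` only, run conformal
    calibration on `cal`, and perform selection and FCR evaluation on `test`.
    """
    if n_train + n_cal >= n_total:
        raise ValueError("need at least one test point")
    perm = np.random.default_rng(seed).permutation(n_total)
    return perm[:n_train], perm[n_train:n_train + n_cal], perm[n_train + n_cal:]


# Illustrative sizes: 10,000 training points, n = 500 calibration, m = 500 test.
train_idx, cal_idx, test_idx = three_way_split(11_000, 10_000, 500)
print(len(train_idx), len(cal_idx), len(test_idx))
```

Because the folds are disjoint, the fitted probabilities can be treated as fixed functions when reasoning about the calibration and test points.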

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

Full rationale

The paper first cites Gazin et al. (2025) for the general FCR-controlling framework on informative conformal sets, then derives an optimal oracle policy under a monotone selection constraint and explicit power objective in the setting where true probabilities are known. A subsequent calibration step adjusts this policy for the practical case of estimated probabilities while preserving finite-sample FCR control. Neither step reduces to a self-definition, a fitted input relabeled as a prediction, or a load-bearing self-citation whose validity depends on the current work; the oracle optimization and calibration adjustment are presented as independent contributions whose correctness can be checked against external conformal and selective-inference benchmarks. Reported simulations and real-data experiments provide separate empirical support for the claimed power gains.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the existence of an oracle where set-membership probabilities are known exactly and on the validity of a post-hoc calibration that preserves FCR control under estimated probabilities.

axioms (1)
  • [standard math] Conformal prediction sets provide marginal coverage guarantees under exchangeability.
    Invoked as the foundation for FCR control in selective settings.
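
For readers who want the axiom in concrete form, here is a minimal sketch of split conformal classification, the standard construction behind the marginal guarantee. It is generic background, not the paper's procedure: the score 1 − p̂(y | x), the function names, and the synthetic data are illustrative choices.

```python
import math
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from n calibration points.

    Score = 1 - estimated probability of the true class. The threshold is the
    ceil((n + 1)(1 - alpha))-th smallest score, or +inf if that rank exceeds n.
    """
    n = len(cal_labels)
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    rank = math.ceil((n + 1) * (1 - alpha))
    return float("inf") if rank > n else float(scores[rank - 1])


def prediction_sets(test_probs, threshold):
    """Include every class whose score is at most the threshold."""
    return [set(np.flatnonzero(1.0 - row <= threshold).tolist()) for row in test_probs]


def draw(rng, size, n_classes):
    """Synthetic data: labels are drawn from the stated class probabilities."""
    p = rng.dirichlet(np.ones(n_classes), size=size)
    y = np.array([rng.choice(n_classes, p=row) for row in p])
    return p, y


# Made-up exchangeable calibration and test samples.
rng = np.random.default_rng(0)
K, n, m, alpha = 4, 500, 2000, 0.1
cal_p, cal_y = draw(rng, n, K)
test_p, test_y = draw(rng, m, K)

t = conformal_threshold(cal_p, cal_y, alpha)
sets = prediction_sets(test_p, t)
coverage = np.mean([int(y) in s for s, y in zip(sets, test_y)])
print(f"empirical coverage: {coverage:.3f} (nominal level {1 - alpha})")
```

The guarantee is marginal: it averages over all test points and says nothing about the subset an analyst later selects as informative, which is why selection calls for the separate FCR treatment discussed in this review.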

pith-pipeline@v0.9.0 · 5738 in / 1212 out tokens · 39490 ms · 2026-05-22T04:26:55.104347+00:00 · methodology


Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

  1. Online Control of the False Coverage Rate and False Sign Rate. Proceedings of the 37th International Conference on Machine Learning, 2020.
  2. Weinstein, A. and Yekutieli, D.
  3. Wang, Xiaoning and Huo, Yuyang and Peng, Liuhua and Zou, Changliang. Conformalized Multiple Testing after Data-dependent Selection. doi:10.52202/079017-1867
  4. Sharir, Micha and Agarwal, Pankaj K. 1995.
  5. Learning multiple layers of features from tiny images. 2009.
  6. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing.
  7. Zhu, Ciyou and Byrd, Richard H and Lu, Peihuang and Nocedal, Jorge. Algorithm 778:
  8. On calibration of modern neural networks. International Conference on Machine Learning, 2017.
  9. Karatzas, Ioannis and Shreve, Steven E. 1991.
  10. Corbetta, Daniela and Finos, Livio and Risso, Davide. Conformal Inference for Cell Type Prediction with Graph-Structured Constraints. Methodological and Applied Statistics and Demography II, 2025.
  11. An adaptive null proportion estimator for false discovery rate control. Biometrika, 2025.
  12. Confidence on the focal: conformal prediction with selection-conditional coverage. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2025.
  13. Testing for outliers with conformal p-values. Ann. Statist., 2023.
  14. David Mary and Etienne Roquain. Electronic Journal of Statistics, 2022.
  15. Adaptive novelty detection with false discovery rate guarantee. The Annals of Statistics, 2024.
  16. Romano, Yaniv and Sesia, Matteo and Candes, Emmanuel. Classification with Valid and Adaptive Coverage.
  17. Ying Jin and Emmanuel J. Candes. Journal of Machine Learning Research.
  18. Selective conformal inference with false coverage-statement rate control. Biometrika, 2024.
  19. Benjamini, Yoav and Hochberg, Yosef. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B.
  20. False discovery rate-adjusted multiple confidence intervals for selected parameters. Journal of the American Statistical Association, 2005.
  21. Selecting informative conformal prediction sets with false coverage rate control. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2025.
  22. Benjamini, Yoav and Krieger, Abba M. and Yekutieli, Daniel. Biometrika, 2006. doi:10.1093/biomet/93.3.491
  23. Controlling FSR in Selective Classification. 2023.