Selective Conformal Risk Control
Pith reviewed 2026-05-16 22:05 UTC · model grok-4.3
The pith
Selective Conformal Risk Control shrinks prediction sets by filtering to confident samples before calibration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that treating uncertainty control as selective classification followed by conformal risk control on the selected subset yields calibrated prediction sets that meet target coverage and risk levels. SCRC-T provides exact finite-sample guarantees through jointly computed thresholds, while SCRC-I provides PAC-style guarantees at lower computational cost.
What carries the argument
The two-stage process of first selecting confident samples via a selection rule and then applying conformal risk control on that subset, implemented in the joint-threshold SCRC-T variant and the calibration-only SCRC-I variant.
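The two-stage process can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's SCRC-T or SCRC-I procedure: selection uses a fixed cutoff on the top-class probability, the nonconformity score is 1 minus the probability of the true class, and the risk being controlled is plain miscoverage. The function names `select`, `calibrate_on_selected`, and `predict_set` are hypothetical.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic to use
    if n == 0 or k > n:
        return float("inf")  # too few points: every label stays in the set
    return sorted(scores)[k - 1]

def select(probs, tau):
    """Stage 1 (assumed rule): keep a sample if its top probability reaches tau."""
    return max(probs) >= tau

def calibrate_on_selected(cal_probs, cal_labels, tau, alpha):
    """Stage 2: compute the threshold from the selected calibration points only."""
    scores = [1.0 - p[y] for p, y in zip(cal_probs, cal_labels) if select(p, tau)]
    return conformal_quantile(scores, alpha)

def predict_set(probs, tau, qhat):
    """Return None for a rejected sample, otherwise the list of retained labels."""
    if not select(probs, tau):
        return None
    return [k for k, p in enumerate(probs) if 1.0 - p <= qhat]
```

A rejected sample returns `None` rather than an empty list, so that abstention stays distinguishable from a prediction set that happens to be empty.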
If this is right
- Prediction sets become more compact than those produced by standard conformal prediction while still meeting coverage targets.
- SCRC-T delivers exact finite-sample coverage guarantees through joint threshold computation over calibration and test samples.
- SCRC-I achieves similar performance with PAC-style probabilistic guarantees and lower computational cost.
- Both methods maintain the desired risk levels on the selected subset across tested datasets.
Where Pith is reading between the lines
- The approach could support real-time deployment in domains like medical diagnostics by reducing set sizes without losing reliability.
- Different selection criteria might be tested to see how they trade off set size against guarantee tightness.
- The framework might combine with other uncertainty methods to handle structured outputs such as sequences or graphs.
Load-bearing premise
The selection of confident samples preserves the exchangeability properties needed for the conformal guarantees to hold on the selected subset.
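One way to picture a selection rule that treats calibration and test points symmetrically is a cutoff taken from the pooled confidence values. The sketch below is an assumption-laden illustration, not the paper's SCRC-T algorithm: `pooled_threshold` and `select_indices` are hypothetical names, and `xi` is assumed to be the target fraction of samples to keep. Whether symmetry of this kind is enough for the coverage guarantee is the point the referee report questions.

```python
import math

def pooled_threshold(cal_conf, test_conf, xi):
    """Confidence cutoff that keeps roughly a fraction xi of the pooled samples.

    The cutoff is an order statistic of the pooled confidences, so it does not
    change when calibration and test points are permuted.
    """
    pool = sorted(cal_conf + test_conf, reverse=True)
    m = max(1, math.floor(xi * len(pool)))  # number of samples to keep
    return pool[m - 1]

def select_indices(conf, tau):
    """Indices of samples whose confidence reaches the cutoff."""
    return [i for i, c in enumerate(conf) if c >= tau]
```

Because the cutoff depends on the test confidences, it has to be recomputed whenever the test batch changes, which is consistent with the higher computational cost attributed to the joint-threshold variant.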
What would settle it
Finding that empirical coverage on the selected test samples drops below the target level in experiments where the selection rule introduces dependence that violates exchangeability between calibration and test points.
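A check of this kind could be sketched as follows. The code measures empirical coverage on the selected test points after calibrating on the selected calibration points. It assumes each sample is a (confidence, nonconformity score) pair and uses a fixed cutoff `tau`; the synthetic data generator in `simulate` is purely hypothetical. To probe the failure mode described above, the fixed cutoff would be swapped for a data-dependent rule and the average coverage over many seeds compared with 1 - alpha.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile; inf if too few scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(scores)[k - 1] if 0 < k <= n else float("inf")

def selected_coverage(cal, test, tau, alpha):
    """cal and test are lists of (confidence, score) pairs.

    Returns (coverage on selected test points, number of selected test points).
    """
    cal_scores = [s for c, s in cal if c >= tau]
    qhat = conformal_quantile(cal_scores, alpha)
    kept = [s for c, s in test if c >= tau]
    if not kept:
        return float("nan"), 0
    covered = sum(s <= qhat for s in kept)
    return covered / len(kept), len(kept)

def simulate(n_cal, n_test, tau, alpha, seed):
    """Run one synthetic trial (hypothetical data model, for illustration only)."""
    rng = random.Random(seed)

    def draw():
        c = rng.random()              # synthetic confidence in [0, 1)
        s = rng.random() * (1.5 - c)  # scores tend to shrink as confidence grows
        return c, s

    cal = [draw() for _ in range(n_cal)]
    test = [draw() for _ in range(n_test)]
    return selected_coverage(cal, test, tau, alpha)
```

A single trial is noisy, so a meaningful comparison against the target would average `simulate` over many seeds.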
Original abstract
Reliable uncertainty quantification is essential for deploying machine learning systems in high-stakes domains. Conformal prediction provides distribution-free coverage guarantees but often produces overly large prediction sets, limiting its practical utility. To address this issue, we propose Selective Conformal Risk Control (SCRC), a unified framework that integrates conformal prediction with selective classification. The framework formulates uncertainty control as a two-stage problem: the first stage selects confident samples for prediction, and the second stage applies conformal risk control on the selected subset to construct calibrated prediction sets. We develop two algorithms under this framework. The first, SCRC-T, preserves exchangeability by computing thresholds jointly over calibration and test samples, offering exact finite-sample guarantees. The second, SCRC-I, is a calibration-only variant that provides PAC-style probabilistic guarantees while being more computationally efficient. Experiments on two public datasets show that both methods achieve the target coverage and risk levels, with nearly identical performance, while SCRC-I exhibits slightly more conservative risk control but superior computational practicality. Our results demonstrate that selective conformal risk control offers an effective and efficient path toward compact, reliable uncertainty quantification.
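For orientation, the standard conformal risk control procedure that the second stage builds on can be written as below. This is background rather than the paper's own statement: the symbols $L_i$, $B$, $\hat{R}_n$, and $\hat{\lambda}$ are notation assumed here, the result holds under the usual regularity conditions on the loss, and the abstract does not show how the bound is adapted to the selected subset.

```latex
% Losses L_i(\lambda) \in [0, B] are non-increasing in \lambda
% and exchangeable across i = 1, \dots, n+1.
\hat{R}_n(\lambda) = \frac{1}{n} \sum_{i=1}^{n} L_i(\lambda),
\qquad
\hat{\lambda} = \inf\left\{ \lambda : \frac{n}{n+1}\,\hat{R}_n(\lambda) + \frac{B}{n+1} \le \alpha \right\},
\qquad
\mathbb{E}\left[ L_{n+1}(\hat{\lambda}) \right] \le \alpha .
```

With the miscoverage indicator as the loss and $B = 1$, this reduces to the familiar split conformal rule of taking the $\lceil (n+1)(1-\alpha) \rceil$-th smallest calibration score.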
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Selective Conformal Risk Control (SCRC), a two-stage framework that first selects confident samples via a data-dependent rule and then applies conformal risk control on the selected subset to produce calibrated prediction sets with controlled risk. It introduces SCRC-T, which jointly computes thresholds over the combined calibration and test pool to claim exact finite-sample coverage guarantees, and SCRC-I, a calibration-only variant providing PAC-style probabilistic guarantees with improved efficiency. Experiments on two public datasets are reported to achieve the target coverage and risk levels for both methods.
Significance. If the coverage claims hold, the work offers a principled route to smaller prediction sets than standard conformal methods by incorporating selective classification, which could improve practical utility in high-stakes settings without sacrificing distribution-free guarantees. The joint-threshold approach in SCRC-T, if rigorously justified, would be a notable technical contribution over naive post-selection conformal methods.
major comments (2)
- [SCRC-T algorithm and guarantee statement] The exact finite-sample coverage claim for SCRC-T rests on the assertion that joint threshold computation over the full calibration+test pool preserves exchangeability for the data-dependent selected subset. No derivation is supplied showing that the coverage inequality continues to hold after selection when the selection rule depends on the same scores or nonconformity values used for thresholding; standard conformal arguments apply to the full exchangeable pool but do not automatically transfer to the induced random subset. A detailed proof or counter-example analysis is required in the SCRC-T section.
- [Experiments] The experimental section reports that both methods achieve target coverage and risk levels on two public datasets but supplies no error bars, no description of data splits or exclusion rules, and no comparison against standard conformal baselines or selective classification methods without conformal control. These omissions make it impossible to assess whether the observed performance supports the claimed advantage in compactness while preserving guarantees.
minor comments (2)
- [Method overview] Notation for the selection function and the joint threshold computation should be introduced with explicit definitions before the guarantee statements to improve readability.
- [Abstract and §1] The abstract and introduction should clarify whether the risk control is on the selected subset only or includes a risk term for the rejected samples.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper to incorporate the requested clarifications and additions.
-
Referee: [SCRC-T algorithm and guarantee statement] The exact finite-sample coverage claim for SCRC-T rests on the assertion that joint threshold computation over the full calibration+test pool preserves exchangeability for the data-dependent selected subset. No derivation is supplied showing that the coverage inequality continues to hold after selection when the selection rule depends on the same scores or nonconformity values used for thresholding; standard conformal arguments apply to the full exchangeable pool but do not automatically transfer to the induced random subset. A detailed proof or counter-example analysis is required in the SCRC-T section.
Authors: We agree that the current manuscript states the exact finite-sample coverage for SCRC-T without supplying a full derivation. In the revised version we will add a rigorous proof in the SCRC-T section. The proof will show that joint threshold selection over the combined calibration and test pool preserves the exchangeability of the selected subset even when the selection rule is a function of the same nonconformity scores, by explicitly tracking the dependence and verifying that the coverage inequality still holds via the standard conformal argument applied to the augmented pool. We will also include a brief discussion of edge cases and any necessary counter-example checks.
Revision: yes
-
Referee: [Experiments] The experimental section reports that both methods achieve target coverage and risk levels on two public datasets but supplies no error bars, no description of data splits or exclusion rules, and no comparison against standard conformal baselines or selective classification methods without conformal control. These omissions make it impossible to assess whether the observed performance supports the claimed advantage in compactness while preserving guarantees.
Authors: We acknowledge these omissions limit the interpretability of the results. In the revision we will add error bars from repeated runs with different random seeds, provide explicit descriptions of the data splits and any exclusion rules applied, and include direct comparisons against standard conformal prediction baselines as well as selective classification methods that do not use conformal risk control. These additions will allow readers to evaluate both the compactness gains and the empirical validity of the coverage and risk guarantees.
Revision: yes
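The error bars promised in the rebuttal could be produced with something as simple as the following sketch. It assumes one metric value (for example, empirical coverage) per random seed and reports the mean with its standard error. The function name `summarize` is hypothetical, and the paper may choose a different interval construction.

```python
import math
import statistics

def summarize(values):
    """Mean and standard error of a metric across repeated runs."""
    mean = statistics.fmean(values)
    if len(values) < 2:
        return mean, float("nan")  # spread is undefined for a single run
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem
```

Reporting the pair for coverage, risk, and average set size, for both the proposed methods and the baselines, would make the comparison requested by the referee directly readable.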
Circularity Check
No significant circularity; guarantees rest on standard exchangeability without reduction to fitted inputs or self-citations
The paper's central claim is that SCRC-T achieves exact finite-sample coverage by jointly computing thresholds over the combined calibration and test pool before selection, thereby preserving exchangeability. This is presented as a direct consequence of the standard conformal prediction exchangeability assumption applied to the joint set, rather than any self-definitional loop, fitted parameter renamed as prediction, or load-bearing self-citation. No equations in the provided abstract or description reduce the coverage guarantee to a quantity defined by the selection rule itself. SCRC-I is explicitly distinguished as providing only PAC-style bounds. The derivation chain therefore remains self-contained against external conformal theory benchmarks and does not trigger any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: data samples are exchangeable.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem absolute_floor_iff_bare_distinguishability
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: Lemma 1. Suppose (X_1, Y_1), …, (X_{n+1}, Y_{n+1}) are exchangeable, and let I be a symmetric selection rule… Then, conditional on E_I, the subcollection {(X_i, Y_i)}_{i∈I} is exchangeable.
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: SCRC-T preserves exchangeability by computing thresholds jointly over calibration and test samples, offering exact finite-sample guarantees.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 3 Pith papers
- Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs
  Conformal Selective Acting (CSA) fills a gap in conformal methods by providing per-round, pathwise-valid selective risk bounds for adaptive RLVR LLM streams under predictable updates and isotonic calibration.
- ST-BCP: Tightening Coverage Bound for Backward Conformal Prediction via Non-Conformity Score Transformation
  ST-BCP tightens the coverage bound in Backward Conformal Prediction by applying a computable data-dependent transformation to nonconformity scores, reducing the average gap from 4.20% to 1.12% on benchmarks while prov...
- Explainable Wastewater Digital Twins: Adaptive Context-Conditioned Structured Simulators with Self-Falsifying Decision Support
  CCSS-IX is a context-conditioned structured simulator for wastewater digital twins that uses adaptive expert mixing and self-falsifying conformal decision rules to reduce unsafe actions while maintaining low predictio...
Figures
Figure 1: CIFAR-10: coverage control at different values of ξ with α = 0.1 (margin score).
Figure 2: CIFAR-10: risk control at different values of α with ξ = 0.7 (marg...