Probabilistic Object Detection with Conformal Prediction
Pith reviewed 2026-05-11 02:16 UTC · model grok-4.3
The pith
Scaled conformal prediction adapts bounding box intervals to uncertainty estimates, improving sharpness while preserving coverage in object detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The method applies conformal prediction coordinate-wise to the four corner coordinates of each bounding box with a Bonferroni correction, then scales the interval widths by per-prediction aleatoric uncertainty estimates from a loss-attenuated probabilistic object detector. The resulting prediction intervals are sharper than those of unscaled conformal prediction while retaining marginal coverage guarantees. This holds across three datasets and in a cross-domain setting.
What carries the argument
Scaled conformal prediction, in which each interval's width is scaled by an input-dependent aleatoric uncertainty estimate, applied coordinate-wise with a Bonferroni correction for box-level validity.
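A minimal sketch of how scaled, coordinate-wise split conformal prediction with a Bonferroni correction is commonly set up. It is not the authors' implementation. It assumes boxes are stored as (x1, y1, x2, y2), that the detector supplies a positive uncertainty estimate `sigma` per coordinate, and that the nonconformity score is the absolute residual divided by `sigma`. All function names are hypothetical.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of 1-D calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]

def calibrate_scaled_cp(pred, sigma, truth, alpha=0.1):
    """pred, sigma, truth: arrays of shape (n, 4) holding (x1, y1, x2, y2).

    Returns one quantile per coordinate, each computed at the
    Bonferroni-corrected level alpha / 4.
    """
    scores = np.abs(truth - pred) / sigma   # residuals normalized by uncertainty
    n_coords = pred.shape[1]
    alpha_coord = alpha / n_coords          # Bonferroni split across coordinates
    return np.array([conformal_quantile(scores[:, j], alpha_coord)
                     for j in range(n_coords)])

def predict_intervals(pred, sigma, q):
    """Half-width q * sigma grows with each prediction's own uncertainty."""
    return pred - q * sigma, pred + q * sigma
```

Setting `sigma` to a constant array recovers the unscaled variant, in which every prediction receives the same interval width per coordinate.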
If this is right
- Scaled CP achieves up to 19 percent higher intersection-over-union and 39 percent lower interval scores than unscaled CP at the same coverage level.
- Adding class-wise calibration improves coverage rates for both scaled and unscaled versions with little change in sharpness.
- The two-step pipeline first builds class prediction sets with RAPS and then conditions the conformalized boxes on those sets.
- The improvements persist under distribution shift in cross-domain evaluation on autonomous driving data.
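The two sharpness metrics named in the list above can be sketched as follows. This assumes the interval score is the standard one (interval width plus a penalty of 2/alpha times the distance by which the true value falls outside the interval) and that IoU is the usual box overlap ratio. How the paper applies IoU to interval-valued boxes is not stated here, so `box_iou` is only the generic building block.

```python
import numpy as np

def interval_score(lo, hi, y, alpha):
    """Interval score: lower is better. Rewards narrow intervals, penalizes misses."""
    width = hi - lo
    below = np.maximum(lo - y, 0.0)   # distance by which y undershoots the interval
    above = np.maximum(y - hi, 0.0)   # distance by which y overshoots the interval
    return width + (2.0 / alpha) * (below + above)

def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

For example, an interval [0, 10] at alpha = 0.1 scores 10 when the true value lies inside it and 50 when the true value is 12, because the miss of 2 is weighted by 2/alpha = 20.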
Where Pith is reading between the lines
- These sharper intervals could allow downstream planners to make tighter safety margins around detected objects.
- The approach may extend to other multi-output regression tasks where outputs have natural structure like keypoints or segmentation masks.
- Further gains might come from combining the aleatoric scaling with other forms of uncertainty calibration beyond the two variants tested.
Load-bearing premise
The aleatoric uncertainty estimates produced by the probabilistic detector are well enough calibrated that they can be used directly as scaling factors without breaking the coverage guarantee.
What would settle it
Observing that the empirical coverage of the scaled intervals falls below the target level on a new test set drawn from the same distribution as the calibration data, or that the sharpness gains disappear while coverage remains intact.
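The check described above could be run along these lines. This is a rough sketch with hypothetical function names. The tolerance is a simple binomial standard-error heuristic that ignores the randomness of the calibration set, so it is a screening rule rather than a formal test.

```python
import numpy as np

def box_coverage(lo, hi, truth):
    """Fraction of boxes whose four true coordinates all fall inside their intervals.

    lo, hi, truth: arrays of shape (n, 4).
    """
    inside = (truth >= lo) & (truth <= hi)
    return float(np.mean(np.all(inside, axis=1)))

def coverage_shortfall(coverage, n_test, alpha, z=2.0):
    """True if coverage is more than z binomial standard errors below 1 - alpha."""
    target = 1.0 - alpha
    se = np.sqrt(target * (1.0 - target) / n_test)
    return bool(coverage < target - z * se)
```

With 1000 test boxes and alpha = 0.1, the standard error is about 0.0095, so an observed coverage of 0.80 would be flagged while 0.895 would not.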
Original abstract
Conformal Prediction (CP) is a distribution-free method for constructing prediction sets with marginal finite-sample coverage guarantees, making it a suitable framework for reliable uncertainty quantification in safety-critical object detection. However, object detection introduces structured multi-output predictions, complicating the application of classical CP theory developed for single outputs. In addition, standard, unscaled CP produces fixed-width prediction intervals across inputs, leading to unnecessary width for low-uncertainty predictions. While scaled CP addresses this by adapting the interval width to an input-dependent uncertainty estimate, prior work has neither systematically compared unscaled and scaled CP for multi-class object detection, nor integrated CP with a complementary uncertainty quantification method in this setting. We fill this gap by: (i) applying CP coordinate-wise to bounding box corners with a Bonferroni correction for box-level guarantees; (ii) scaling the resulting intervals using per-prediction aleatoric uncertainty estimates derived from a probabilistic object detector trained with loss attenuation, evaluated in uncalibrated and two calibrated variants; (iii) extending to a two-step pipeline that constructs prediction sets for the class using RAPS and conditions the conformalized bounding boxes on the predicted class set. Across three autonomous driving datasets (KITTI, BDD, CODA), including a cross-domain setting under distribution shift, scaled CP consistently improves interval sharpness over unscaled CP, achieving up to 19% higher IoU and 39% lower interval scores, without sacrificing coverage. Class-wise calibration further improves coverage for both variants with a negligible effect on sharpness. Together, these improvements yield more actionable uncertainty estimates for real-time, real-world object detection.
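The class-wise calibration mentioned at the end of the abstract can be illustrated with a short sketch. It assumes that "class-wise" means computing a separate conformal quantile for each object class (often called class-conditional or Mondrian conformal prediction). The text here does not spell out the exact procedure, so this is an assumption, and the function name is hypothetical.

```python
import numpy as np

def classwise_quantiles(scores, labels, alpha):
    """One conformal quantile per class.

    scores: (n,) nonconformity scores from the calibration set.
    labels: (n,) integer class ids for the same calibration points.
    """
    quantiles = {}
    for c in np.unique(labels):
        s = np.sort(scores[labels == c])
        n = len(s)
        k = int(np.ceil((n + 1) * (1 - alpha)))  # finite-sample-corrected rank
        # Fall back to an infinite quantile when a class has too few samples.
        quantiles[int(c)] = float(s[k - 1]) if k <= n else float("inf")
    return quantiles
```

Rare classes have few calibration points, so their quantiles are noisier and may fall back to infinity at small alpha. That is the usual price of class-conditional calibration.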
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper applies conformal prediction (CP) to probabilistic object detection for bounding-box uncertainty quantification. It proposes coordinate-wise CP on box corners with Bonferroni correction for box-level coverage, scales the intervals using per-prediction aleatoric uncertainty from a loss-attenuated probabilistic detector (in uncalibrated and calibrated variants), and extends the pipeline with RAPS for class prediction sets. Experiments on KITTI, BDD, and CODA (including cross-domain shift) report that scaled CP yields up to 19% higher IoU and 39% lower interval scores than unscaled CP while maintaining coverage, with class-wise calibration further improving coverage at negligible sharpness cost.
Significance. If the empirical improvements hold under the stated conditions, the work provides a practical, adaptive method for tighter prediction intervals in multi-output detection tasks without apparent loss of coverage. The systematic comparison of scaled vs. unscaled CP, integration with loss-attenuated detectors, and inclusion of a cross-domain shift setting are strengths; the approach could inform real-time safety-critical applications if the coverage behavior is robustly characterized.
major comments (3)
- [Abstract, §4.3] Abstract and §4.3 (cross-domain experiments): The framing that CP supplies 'marginal finite-sample coverage guarantees' for safety-critical detection is undercut by the explicit inclusion of distribution-shift settings (e.g., KITTI-to-BDD or CODA). CP theory requires exchangeability between calibration and test points for the finite-sample guarantee; shift violates this, rendering reported coverage purely empirical. The claim of 'without sacrificing coverage' therefore needs qualification as an observed property rather than a guaranteed one.
- [§3.2] §3.2 (scaled CP construction): The scaling of conformal intervals by per-prediction aleatoric uncertainty estimates assumes these estimates are sufficiently calibrated to preserve validity after scaling. No explicit calibration diagnostics (e.g., reliability diagrams or ECE for the uncertainty outputs) are reported for the loss-attenuated detector variants; if the scaling factors are miscalibrated, the adaptive intervals may lose the marginal coverage property even under exchangeability.
- [§3.1] §3.1 (Bonferroni correction for box-level coverage): Coordinate-wise application with Bonferroni correction is used to obtain box-level guarantees. While conservative, the paper should quantify the resulting over-coverage or interval inflation relative to a joint conformal method; the reported sharpness gains (19% IoU, 39% interval score) may be partly offset by this conservatism, and an ablation comparing Bonferroni to other multiplicity corrections would strengthen the central comparison.
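For context on the conservatism raised in the last major comment, the box-level guarantee follows from a union bound. The sketch below assumes each of the four coordinate intervals C_j is calibrated at miscoverage level alpha/4. The symbols are introduced here for illustration.

```latex
\[
\mathbb{P}\Big(\exists\, j \in \{1,\dots,4\} : Y_j \notin C_j(X)\Big)
\;\le\; \sum_{j=1}^{4} \mathbb{P}\big(Y_j \notin C_j(X)\big)
\;\le\; 4 \cdot \frac{\alpha}{4} \;=\; \alpha .
\]
```

The bound is tight only when the coordinate-wise misses never overlap. If the four misses were independent, the box-level miscoverage at alpha = 0.1 would be 1 - (1 - 0.025)^4, about 0.0963, and positively correlated coordinate errors lower it further. That gap is the over-coverage the referee asks the authors to quantify.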
minor comments (3)
- [§3] Notation for the conformal quantile and scaling factor (e.g., Eq. (3) and (5)) is introduced without an explicit table of symbols; adding one would improve readability for readers unfamiliar with CP variants.
- [Figure 3] Figure 3 (interval score vs. IoU plots) lacks error bars or statistical significance markers for the reported improvements; including them would clarify whether the 19%/39% gains are consistent across seeds or splits.
- [§3.3] The two-step pipeline (RAPS class sets then conditional box CP) is described clearly, but the interaction between class-set size and box coverage is not tabulated; a small table showing coverage as a function of class-set cardinality would be helpful.
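The first step of the two-step pipeline, building class prediction sets with RAPS, can be sketched as below. This is the non-randomized variant of the RAPS score: cumulative softmax mass down to a class's rank, plus a penalty `lam` for every rank beyond `k_reg`. The default values of `lam` and `k_reg` are illustrative and not taken from the paper, and the conditioning of box intervals on the class set is not shown.

```python
import numpy as np

def raps_scores(probs, lam=0.01, k_reg=2):
    """Non-randomized RAPS score for every class. probs: (n, K) softmax outputs."""
    order = np.argsort(-probs, axis=1)                    # classes by descending prob
    sorted_p = np.take_along_axis(probs, order, axis=1)
    cum = np.cumsum(sorted_p, axis=1)                     # mass down to each rank
    ranks = np.arange(1, probs.shape[1] + 1)
    penalized = cum + lam * np.maximum(ranks - k_reg, 0)  # penalize deep ranks
    scores = np.empty_like(penalized)
    np.put_along_axis(scores, order, penalized, axis=1)   # back to class order
    return scores

def raps_calibrate(probs, labels, alpha, lam=0.01, k_reg=2):
    """Conformal quantile of the true-class scores on the calibration set."""
    s = raps_scores(probs, lam, k_reg)[np.arange(len(labels)), labels]
    n = len(s)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else float(np.sort(s)[k - 1])

def raps_predict(probs, q, lam=0.01, k_reg=2):
    """Boolean mask of shape (n, K): True where the class is in the prediction set."""
    return raps_scores(probs, lam, k_reg) <= q
```

The rank penalty discourages large sets by making low-ranked classes expensive to include, which is what distinguishes RAPS from a plain cumulative-probability score.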
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We respond to each major comment below, indicating where revisions will be made to clarify and strengthen the manuscript.
- Referee: [Abstract, §4.3] The framing that CP supplies 'marginal finite-sample coverage guarantees' for safety-critical detection is undercut by the explicit inclusion of distribution-shift settings (e.g., KITTI-to-BDD or CODA). CP theory requires exchangeability between calibration and test points for the finite-sample guarantee; shift violates this, rendering reported coverage purely empirical. The claim of 'without sacrificing coverage' therefore needs qualification as an observed property rather than a guaranteed one.
Authors: We agree that the finite-sample marginal coverage guarantees of conformal prediction require exchangeability between calibration and test points. This assumption does not hold in the cross-domain shift experiments (e.g., KITTI-to-BDD or CODA), so coverage there is necessarily empirical. We will revise the abstract and §4.3 to qualify the phrase 'without sacrificing coverage' as an empirical observation under the evaluated conditions, while explicitly noting that the theoretical guarantees apply only under exchangeability. The reported empirical results and comparisons remain unchanged.
Revision: yes
- Referee: [§3.2] The scaling of conformal intervals by per-prediction aleatoric uncertainty estimates assumes these estimates are sufficiently calibrated to preserve validity after scaling. No explicit calibration diagnostics (e.g., reliability diagrams or ECE for the uncertainty outputs) are reported for the loss-attenuated detector variants; if the scaling factors are miscalibrated, the adaptive intervals may lose the marginal coverage property even under exchangeability.
Authors: This point is valid. Although the manuscript evaluates uncalibrated and calibrated variants of the loss-attenuated detector, explicit calibration diagnostics for the aleatoric uncertainty estimates (such as reliability diagrams or ECE) were not included. In the revised manuscript we will add these diagnostics for all variants to substantiate that the scaling factors are sufficiently well-calibrated to preserve the marginal coverage property under exchangeability.
Revision: yes
- Referee: [§3.1] Coordinate-wise application with Bonferroni correction is used to obtain box-level guarantees. While conservative, the paper should quantify the resulting over-coverage or interval inflation relative to a joint conformal method; the reported sharpness gains (19% IoU, 39% interval score) may be partly offset by this conservatism, and an ablation comparing Bonferroni to other multiplicity corrections would strengthen the central comparison.
Authors: We acknowledge that the Bonferroni correction is conservative and can induce over-coverage relative to a joint conformal procedure. We will add a quantification of the observed over-coverage and interval inflation in our experiments. In addition, we will include an ablation comparing Bonferroni to alternative multiplicity corrections (e.g., Holm-Bonferroni) to evaluate the impact on sharpness and to better contextualize the reported gains of 19% IoU and 39% lower interval scores.
Revision: yes
Circularity Check
No circularity: empirical application of standard CP methods
The paper applies existing conformal prediction techniques (coordinate-wise with Bonferroni, RAPS for classes, scaling by aleatoric uncertainty from a separately trained probabilistic detector) to object detection and reports empirical results on held-out test sets across datasets. All claims of improved sharpness (e.g., higher IoU, lower interval scores) without loss of coverage are measured outcomes on external data, not quantities derived by construction from fitted parameters inside the same pipeline. No equations reduce to self-definition, no predictions are statistically forced renamings of inputs, and no load-bearing steps rely on self-citations or imported uniqueness theorems. The work is self-contained as a methodological extension plus experimental validation against standard CP theory.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Data points are exchangeable, so that conformal prediction yields marginal coverage guarantees.
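For reference, the guarantee this axiom supports can be stated as follows for split conformal prediction. The notation is introduced here and is not taken from the paper: s_1, ..., s_n are the calibration nonconformity scores, s_{n+1} is the score of the test point, and s_{(k)} denotes the k-th smallest calibration score.

```latex
\[
\hat{q} \;=\; s_{(k)}, \qquad k = \big\lceil (n+1)(1-\alpha) \big\rceil \le n,
\]
\[
(s_1, \dots, s_n, s_{n+1}) \ \text{exchangeable}
\;\;\Longrightarrow\;\;
\mathbb{P}\big(s_{n+1} \le \hat{q}\big) \;\ge\; 1 - \alpha .
\]
```

The probability is marginal: it averages over the draw of both the calibration set and the test point, and it says nothing about coverage for a particular input or class. When calibration and test data come from different distributions, as in the cross-domain setting, the premise fails and the inequality is no longer implied.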
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "scaled CP consistently improves interval sharpness over unscaled CP, achieving up to 19% higher IoU and 39% lower interval scores, without sacrificing coverage"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.