CONSIGN: Conformal Segmentation Informed by Spatial Groupings via Decomposition
Pith reviewed 2026-05-22 13:50 UTC · model grok-4.3
The pith
Decomposing image segmentation into spatial groupings lets conformal prediction respect pixel correlations and produce tighter uncertainty sets that retain valid coverage guarantees.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CONSIGN decomposes a segmentation map into spatial groupings before applying conformal prediction to each group, thereby incorporating pixel correlations without sacrificing the finite-sample validity of the resulting prediction sets.
What carries the argument
The spatial-grouping decomposition that partitions the image into regions before conformal calibration, allowing the method to capture local dependence while preserving exchangeability within groups.
If this is right
- Prediction sets become smaller and more localized, improving interpretability for clinicians reviewing medical segmentations.
- The same coverage guarantee holds for any segmentation backbone that supplies multiple stochastic outputs.
- Performance gains appear across both medical and natural-image domains when spatial structure is explicitly modeled.
- Uncertainty maps become less conservative, reducing the number of ambiguous pixels flagged for human review.
Where Pith is reading between the lines
- The same grouping idea could be applied to other spatially structured tasks such as depth estimation or semantic instance segmentation.
- Choosing groupings adaptively from the image content itself might further tighten sets without extra model training.
- The approach suggests a general template for bringing conformal methods to any prediction problem whose outputs exhibit known correlation structure.
Load-bearing premise
The chosen spatial decomposition must preserve the statistical validity of conformal prediction without introducing bias from the grouping process itself.
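For reference, the textbook split-conformal statement behind this premise can be written as below. The symbols $s$ (a score function fixed before calibration), $n$ (calibration set size) and $\hat{q}$ are notation assumed here for illustration; they are not taken from the paper.

```latex
% Assumed notation: s is a fixed score function, n the calibration set size,
% and \lceil (n+1)(1-\alpha) \rceil \le n.
S_i = s(X_i, Y_i), \quad i = 1, \dots, n, \qquad
\hat{q} = \text{the } \lceil (n+1)(1-\alpha) \rceil\text{-th smallest of } S_1, \dots, S_n,
\\
C(X) = \{\, y : s(X, y) \le \hat{q} \,\}
\;\Longrightarrow\;
\mathbb{P}\big[\, Y_{\text{test}} \in C(X_{\text{test}}) \,\big] \ge 1 - \alpha .
```

The implication relies on the calibration pairs and the test pair being exchangeable and on $s$ being chosen without looking at the calibration set. A grouping step that is tuned on the calibration set would need a separate argument.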
What would settle it
Run CONSIGN on a synthetic dataset in which all spatial correlations have been removed by shuffling pixels within each image; if the method still produces meaningfully smaller sets than pixel-wise conformal prediction while maintaining coverage, the benefit cannot be attributed to spatial modeling.
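A minimal sketch of the shuffling control, assuming NumPy arrays for images and label masks. The helper name `shuffle_pixels` is hypothetical, and whether the shuffle is applied to model inputs or to model outputs is left open here, as it is in the text above.

```python
import numpy as np

def shuffle_pixels(image, mask, rng):
    """Apply one random pixel permutation to an image and its label mask.

    image: (H, W) or (H, W, C) array; mask: (H, W) array.
    The same permutation is used for both, so each pixel keeps its label
    while all spatial neighbourhood structure is destroyed.
    """
    h, w = mask.shape
    perm = rng.permutation(h * w)
    flat_img = image.reshape(h * w, -1)[perm]
    flat_mask = mask.reshape(h * w)[perm]
    return flat_img.reshape(image.shape), flat_mask.reshape(h, w)

# Toy example: a 3x4 "image" whose label is 1 wherever the intensity exceeds 5.
rng = np.random.default_rng(0)
image = np.arange(12, dtype=float).reshape(3, 4)
mask = (image > 5).astype(int)
shuf_img, shuf_mask = shuffle_pixels(image, mask, rng)
```

Drawing a fresh permutation per image, as done here, removes spatial correlation within every image while leaving the per-pixel intensity and label distributions unchanged.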
Original abstract
Most machine learning-based image segmentation models produce pixel-wise confidence scores that represent the model's predicted probability for each class label at every pixel. While this information can be particularly valuable in high-stakes domains such as medical imaging, these scores are heuristic in nature and do not constitute rigorous quantitative uncertainty estimates. Conformal prediction (CP) provides a principled framework for transforming heuristic confidence scores into statistically valid uncertainty estimates. However, applying CP directly to image segmentation ignores the spatial correlations between pixels, a fundamental characteristic of image data. This can result in overly conservative and less interpretable uncertainty estimates. To address this, we propose CONSIGN (Conformal Segmentation Informed by Spatial Groupings via Decomposition), a CP-based method that incorporates spatial correlations to improve uncertainty quantification in image segmentation. Our method generates meaningful prediction sets that come with user-specified, high-probability error guarantees. It is compatible with any pre-trained segmentation model capable of generating multiple sample outputs. We evaluate CONSIGN against two CP baselines across three medical imaging datasets and two COCO dataset subsets, using three different pre-trained segmentation models. Results demonstrate that accounting for spatial structure significantly improves performance across multiple metrics and enhances the quality of uncertainty estimates.
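To make the "CP applied directly to segmentation" setting concrete, here is a rough sketch of a pooled per-pixel split-conformal baseline. It is not one of the paper's baselines; the function names, array shapes and the score (one minus the softmax probability of the true class) are assumptions chosen for illustration. Pooling pixels treats them as if they were exchangeable, which is precisely the simplification the abstract criticizes.

```python
import numpy as np

def calibrate_threshold(probs, labels, alpha):
    """Pooled per-pixel split-conformal threshold.

    probs:  (N, H, W, K) softmax outputs on calibration images.
    labels: (N, H, W) integer ground-truth classes.
    """
    # Score of each pixel: 1 - probability assigned to its true class.
    true_p = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    scores = np.sort(1.0 - true_p.ravel())
    n = scores.size
    # Conformal rank ceil((n + 1)(1 - alpha)), clipped to n for simplicity.
    k = min(int(np.ceil((n + 1) * (1 - alpha))), n)
    return scores[k - 1]

def pixel_sets(probs, q_hat):
    """Boolean (..., K) array: class k is in a pixel's set if 1 - p_k <= q_hat."""
    return (1.0 - probs) <= q_hat

# Toy data standing in for a real model and dataset.
rng = np.random.default_rng(0)
logits = rng.normal(size=(20, 8, 8, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
labels = rng.integers(0, 3, size=(20, 8, 8))
q_hat = calibrate_threshold(probs, labels, alpha=0.1)
sets = pixel_sets(probs, q_hat)
```

Each pixel gets its own label set from a single global threshold, so nothing in this baseline ties neighbouring pixels together.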
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CONSIGN, a conformal prediction method for image segmentation that incorporates spatial structure by decomposing a model's sampled segmentation outputs into spatial groupings (the method description points to an SVD of a matrix of sampled outputs). It claims to generate prediction sets with user-specified marginal coverage guarantees while improving efficiency and interpretability over standard pixel-wise CP, and reports positive results on three medical imaging datasets and two COCO subsets using three pre-trained segmentation models.
Significance. If the spatial decomposition preserves exchangeability and coverage guarantees, the approach would advance uncertainty quantification for structured outputs in computer vision, particularly in medical imaging. The multi-model, multi-dataset evaluation and compatibility with any multi-output segmentation model are strengths; reproducible code or explicit coverage verification would further enhance impact.
major comments (2)
- §3 (Method): The spatial grouping via decomposition is presented as preserving the validity of conformal prediction, but no formal argument or proof is given that the resulting conformity scores remain exchangeable between calibration and test points when groupings depend on the input data or model outputs. This directly affects the central claim of user-specified coverage guarantees.
- §4 (Experiments): While improvements across metrics are reported relative to two CP baselines, the tables do not include explicit empirical coverage rates (e.g., fraction of pixels or regions covered at the nominal 1-α level) on held-out test data, making it impossible to verify that the guarantees hold after decomposition.
minor comments (2)
- Abstract: 'high-probability error guarantees' should be replaced with the standard CP terminology of 'marginal coverage at level 1-α' for precision.
- §3.2 (Notation): The definition of the grouped conformity score should explicitly state whether the grouping function is fixed or learned from calibration data.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. We have carefully considered each point and provide detailed responses below, along with indications of planned revisions to address the concerns raised.
- Referee: §3 (Method): The spatial grouping via decomposition is presented as preserving the validity of conformal prediction, but no formal argument or proof is given that the resulting conformity scores remain exchangeable between calibration and test points when groupings depend on the input data or model outputs. This directly affects the central claim of user-specified coverage guarantees.
  Authors: We appreciate this observation, as the validity of the coverage guarantees is central to our contribution. The method defines conformity scores at the level of spatial groups obtained via decomposition, which are computed independently for each image. Since the calibration and test images are drawn i.i.d. from the same distribution, and the grouping function is a fixed deterministic mapping applied to each image separately, the resulting group-level conformity scores are exchangeable. We will add a dedicated subsection in §3 providing a formal argument for this exchangeability, including a sketch of the proof that the joint distribution of the scores is invariant to permutations of the calibration and test points.
  Revision: partial
- Referee: §4 (Experiments): While improvements across metrics are reported relative to two CP baselines, the tables do not include explicit empirical coverage rates (e.g., fraction of pixels or regions covered at the nominal 1-α level) on held-out test data, making it impossible to verify that the guarantees hold after decomposition.
  Authors: We agree that reporting empirical coverage is essential for verifying the practical validity of the guarantees. The current tables focus on efficiency and other performance metrics, but we will revise the experimental section to include explicit empirical coverage rates for all methods, datasets, and models at the target coverage levels (e.g., 0.9). These will be added to the main tables or as a supplementary table to allow direct comparison with the nominal 1-α.
  Revision: yes
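The coverage check requested by the referee could be computed along the following lines. This is a sketch under assumed conventions: prediction sets are stored as a boolean array with one entry per pixel and class, and `empirical_coverage` is a hypothetical helper, not code from the paper. It reports both a per-pixel rate and a stricter per-image rate, since the exchange above leaves open which one the tables will use.

```python
import numpy as np

def empirical_coverage(sets, labels):
    """Empirical coverage of per-pixel label sets on held-out data.

    sets:   (N, H, W, K) boolean array, True where class k is in the set.
    labels: (N, H, W) integer ground-truth classes.
    Returns (pixel_cov, image_cov):
      pixel_cov = fraction of pixels whose true class is in the set,
      image_cov = fraction of images in which every pixel is covered.
    """
    hit = np.take_along_axis(sets, labels[..., None], axis=-1)[..., 0]
    pixel_cov = float(hit.mean())
    image_cov = float(hit.all(axis=(1, 2)).mean())
    return pixel_cov, image_cov

# Two tiny 1x2 "images" with two classes each.
sets = np.array([[[[True, False], [True, True]]],
                 [[[False, True], [True, False]]]])
labels = np.array([[[0, 1]],
                   [[0, 0]]])
pixel_cov, image_cov = empirical_coverage(sets, labels)
```

In the toy data the first image is fully covered and the second misses one pixel, so the two rates differ. Comparing either rate to the nominal 1-α only makes sense if it matches the event the method's guarantee is stated for.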
Circularity Check
No significant circularity detected in derivation chain
The paper extends standard conformal prediction to image segmentation by introducing spatial groupings via decomposition. The abstract and description present this as a methodological addition that preserves compatibility with any pre-trained model generating multiple outputs, with performance claims backed by empirical evaluation on medical and COCO datasets against baselines. No equations, definitions, or claims in the provided text reduce by construction to fitted parameters, self-referential definitions, or load-bearing self-citations. The derivation remains self-contained as an independent proposal rather than a tautology or renaming of inputs.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We construct a sample matrix $\hat{S}(X) = [\hat{s}_1, \dots, \hat{s}_{N_s}]$, compute its mean and extract the uncertain regions through an SVD … Each column $u_k$ … is a basis vector … $a_k = Q_{\alpha/2}(\{\langle u_k, \hat{s}_n - \mu(X) \rangle\})$, … $A_k = \dots - \lambda \Sigma_{k,k}$ … $C^*_\lambda(X) = \{Y : \exists c \in \times_{k=1}^{K} [A_k(X), B_k(X)] : Y = P(\mu(X) + \sum_k c_k u_k(X))\}$
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Lemma 1. If Algorithm 1 terminates with $\hat{\lambda} < \infty$ … then $\mathbb{P}[Y_{\text{test}} \in C^*_{\hat{\lambda}}(X_{\text{test}})] \ge 1 - \alpha$
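The explicit parts of the first quoted passage (sample matrix, mean, SVD, projections onto the basis vectors, lower quantile) can be sketched as follows. Everything the passage elides is left out: the code does not attempt to build $A_k$ or $B_k$. The function names, the flattened `(N_s, D)` layout and the thresholding used for the projection $P$ are assumptions for illustration only.

```python
import numpy as np

def decompose_samples(samples, alpha, n_components):
    """Mean, leading SVD directions and lower projection quantiles.

    samples: (N_s, D) array, one flattened stochastic model output per row.
    Returns mu (D,), U (D, K), singular values (K,), a (K,).
    """
    mu = samples.mean(axis=0)
    centered = samples - mu                        # (N_s, D)
    # Rows of vt are orthonormal directions in pixel space.
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    U = vt[:n_components].T                        # columns play the role of u_k
    coeffs = centered @ U                          # (N_s, K): <u_k, s_n - mu>
    a = np.quantile(coeffs, alpha / 2, axis=0)     # a_k = Q_{alpha/2} of projections
    return mu, U, sing[:n_components], a

def reconstruct(mu, U, c, project=lambda v: v >= 0.5):
    """Candidate output P(mu + sum_k c_k u_k); `project` is an assumed stand-in for P."""
    return project(mu + U @ c)

# Toy stand-in for 16 sampled soft masks of a 5x5 image.
rng = np.random.default_rng(0)
samples = rng.random((16, 25))
mu, U, sing, a = decompose_samples(samples, alpha=0.1, n_components=3)
y0 = reconstruct(mu, U, np.zeros(3))               # zero coefficients give P(mu)
```

Each coefficient vector `c` inside a box of per-direction intervals maps to one candidate segmentation, which is how a low-dimensional set of coefficients can describe a spatially coherent prediction set.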
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Moloud Abdar, Farhad Pourpanah, Sadiq Hussain, Dana Rezazadegan, Li Liu, Mohammad Ghavamzadeh, Paul Fieguth, Xiaochun Cao, Abbas Khosravi, U Rajendra Acharya, et al. A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges. Information Fusion, 76:243–297, 2021.
- [2] Anastasios Angelopoulos, Stephen Bates, Jitendra Malik, and Michael I Jordan. Uncertainty sets for image classifiers using conformal prediction. arXiv preprint arXiv:2009.14193, 2020.
- [3] Anastasios N Angelopoulos, Stephen Bates, et al. Conformal Prediction: A Gentle Introduction. Foundations and Trends® in Machine Learning, 16(4):494–591, 2023.
- [4] Anastasios Nikolas Angelopoulos, Stephen Bates, Adam Fisch, Lihua Lei, and Tal Schuster. Conformal Risk Control. In The Twelfth International Conference on Learning Representations, 2024.
- [5] Samuel G. Armato III, Geoffrey McLennan, Luc Bidaut, Michael F. McNitt-Gray, Charles R. Meyer, Anthony P. Reeves, Binsheng Zhao, Denise R. Aberle, Claudia I. Henschke, Eric A. Hoffman, Ella A. Kazerooni, Heber MacMahon, Edwin J. R. Van Beek, David Yankelevitz, Anthony M. Biancardi, Patricia H. Bland, Mark S. Brown, Roger M. Engelmann, Gerald E. Laderach, ...
- [6] Christian F Baumgartner, Kerem C Tezcan, Krishna Chaitanya, Andreas M Hötker, Urs J Muehlematter, Khoschy Schawkat, Anton S Becker, Olivio Donati, and Ender Konukoglu. Phiseg: Capturing uncertainty in medical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 119–127. Springer, 2019.
- [7] Omer Belhasin, Yaniv Romano, Daniel Freedman, Ehud Rivlin, and Michael Elad. Principal Uncertainty Quantification with Spatial Correlation for Image Restoration Problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(5):3321–3333, 2023.
- [8] Jacqueline Isabel Bereska, Hamed Karimi, and Reza Samavi. SACP: Spatially-Aware Conformal Prediction in Uncertainty Quantification of Medical Image Segmentation. In Medical Imaging with Deep Learning, 2025.
- [9] Joren Brunekreef, Eric Marcus, Ray Sheombarsing, Jan-Jakob Sonke, and Jonas Teuwen. Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4135–4143, 2024.
- [10] Victor M Campello, Polyxeni Gkontra, Cristian Izquierdo, Carlos Martin-Isla, Alireza Sojoudi, Peter M Full, Klaus Maier-Hein, Yao Zhang, Zhiqiang He, Jun Ma, et al. Multi-centre, Multi-vendor and Multi-Disease Cardiac Segmentation: The M&Ms challenge. IEEE Transactions on Medical Imaging, 40:3543–3554, 2021.
- [11] Anne Chao. Nonparametric Estimation of the Number of Classes in a Population. Scandinavian Journal of Statistics, pages 265–270, 1984.
- [12] Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. Rethinking Atrous Convolution for Semantic Image Segmentation. CoRR, abs/1706.05587, 2017.
- [13] Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European conference on Computer Vision (ECCV), pages 801–818, 2018.
- [14] Samuel Davenport. Conformal confidence sets for biomedical image segmentation. arXiv preprint arXiv:2410.03406, 2024.
- [15] Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. In International Conference on Machine Learning, pages 1050–1059. PMLR, 2016.
- [16] Shangqi Gao, Hangqi Zhou, Yibo Gao, and Xiahai Zhuang. BayeSeg: Bayesian Modeling for Medical Image Segmentation with Interpretable Generalizability. Medical Image Analysis, 89:102889, 2023.
- [17] Isaac Gibbs and Emmanuel Candes. Adaptive Conformal Inference under Distribution Shift. Advances in Neural Information Processing Systems, 34:1660–1672, 2021.
- [18] Gene H Golub and Christian Reinsch. Singular value decomposition and least squares solutions. In Handbook for automatic computation: volume II: linear algebra, pages 134–151. Springer, 1971.
- [19] Ling Huang, Su Ruan, Yucheng Xing, and Mengling Feng. A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods. Medical Image Analysis, page 103223, 2024.
- [20] Alex Kendall and Yarin Gal. What uncertainties do we need in bayesian deep learning for computer vision? Advances in Neural Information Processing Systems, 30, 2017.
- [21] Diederik P Kingma, Max Welling, et al. An Introduction to Variational Autoencoders. Foundations and Trends® in Machine Learning, 12(4):307–392, 2019.
- [22] Simon Kohl, Bernardino Romera-Paredes, Clemens Meyer, Jeffrey De Fauw, Joseph R Ledsam, Klaus Maier-Hein, SM Eslami, Danilo Jimenez Rezende, and Olaf Ronneberger. A Probabilistic U-Net for Segmentation of Ambiguous Images. Advances in Neural Information Processing Systems, 31, 2018.
- [23] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in Neural Information Processing Systems, 30, 2017.
- [24] Benjamin Lambert, Florence Forbes, Senan Doyle, Harmonie Dehaene, and Michel Dojat. Trustworthy clinical AI solutions: A unified review of uncertainty quantification in Deep Learning models for medical image analysis. Artificial Intelligence in Medicine, 150:102830,
- [25] Jing Lei and Larry Wasserman. Distribution-free prediction bands for non-parametric regression. Journal of the Royal Statistical Society Series B: Statistical Methodology, 76(1):71–96, 2014.
- [26] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft COCO: Common Objects in Context. In Computer Vision–ECCV 2014: 13th European conference, Zurich, Switzerland, September 6-12, 2014, proceedings, part v 13, pages 740–755. Springer, 2014.
- [27] Kangdao Liu, Tianhao Sun, Hao Zeng, Yongshan Zhang, Chi-Man Pun, and Chi-Man Vong. Spatial-aware conformal prediction for trustworthy hyperspectral image classification. IEEE Transactions on Circuits and Systems for Video Technology, 2025.
- [28] Carlos Martín-Isla, Víctor M Campello, Cristian Izquierdo, Kaisar Kushibar, Carla Sendra-Balcells, Polyxeni Gkontra, Alireza Sojoudi, Mitchell J Fulton, Tewodros Weldebirhan Arega, Kumaradevan Punithakumar, et al. Deep Learning Segmentation of the Right Ventricle in Cardiac MRI: The M&Ms challenge. IEEE Journal of Biomedical and Health Informatics, 27: 3...
- [29] Alireza Mehrtash, William M Wells, Clare M Tempany, Purang Abolmaesumi, and Tina Kapur. Confidence Calibration and Predictive Uncertainty Estimation for Deep Medical Image Segmentation. IEEE Transactions on Medical Imaging, 39(12):3868–3878, 2020.
- [30] Miguel Monteiro, Loïc Le Folgoc, Daniel Coelho de Castro, Nick Pawlowski, Bernardo Marques, Konstantinos Kamnitsas, Mark Van der Wilk, and Ben Glocker. Stochastic segmentation networks: Modelling spatially correlated aleatoric uncertainty. Advances in neural information processing systems, 33:12756–12767, 2020.
- [31] Luca Mossina and Corentin Friedrich. Conformal Prediction for Image Segmentation Using Morphological Prediction Sets. arXiv preprint arXiv:2503.05618, 2025.
- [32] Luca Mossina, Joseba Dalmau, and Léo Andéol. Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive Uncertainty. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 3574–3584, June 2024.
- [33] Elias Nehme, Omer Yair, and Tomer Michaeli. Uncertainty Quantification via Neural Posterior Principal Components. Advances in Neural Information Processing Systems, 36:37128–37141, 2023.
- [34] Harris Papadopoulos, Kostas Proedrou, Volodya Vovk, and Alex Gammerman. Inductive confidence machines for regression. In Machine learning: ECML 2002: 13th European conference on machine learning Helsinki, Finland, August 19–23, 2002 proceedings 13, pages 345–356. Springer, 2002.
- [35] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pages 234–241. Springer, 2015.
- [36] Jiaye Teng, Chuan Wen, Dinghuai Zhang, Yoshua Bengio, Yang Gao, and Yang Yuan. Predictive Inference with Feature Conformal Prediction. In The Eleventh International Conference on Learning Representations, 2023.
- [37] Vladimir Vovk, Alexander Gammerman, and Glenn Shafer. Algorithmic Learning in a Random World, volume 29. Springer, 2005.
- [38] Håkan Wieslander, Philip J Harrison, Gabriel Skogberg, Sonya Jackson, Markus Fridén, Johan Karlsson, Ola Spjuth, and Carolina Wählby. Deep Learning With Conformal Prediction for Hierarchical Analysis of Large-Scale Whole-Slide Tissue Images. IEEE Journal of Biomedical and Health Informatics, 25(2):371–380, 2020.
- [39] Fuping Wu and Xiahai Zhuang. Minimizing Estimated Risks on Unlabeled Data: A New Formulation for Semi-Supervised Medical Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(5):6021–6036, 2022.
- [40] Anna M Wundram, Paul Fischer, Michael Mühlebach, Lisa M Koch, and Christian F Baumgartner. Conformal Performance Range Prediction for Segmentation Output Quality Control. In International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, pages 81–91. Springer, 2024.
- [41] Xiahai Zhuang. Multivariate Mixture Model for Myocardial Segmentation Combining Multi-Source Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(12):2933–2946, 2018.
A Proof of Lemma 1
Proof. This is a direct consequence of [4, Theorem 1]: It is clear that the loss $\lambda \mapsto L_{X,Y}(\lambda) := 1 - I(Y \in C_\lambda(X))$ is non-increasing for all $(X, Y)$. ...
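The proof fragment above appeals to conformal risk control [4] with a miscoverage loss that is non-increasing in λ. A grid-search version of that calibration rule might look like the sketch below. It is not the paper's Algorithm 1, which is not reproduced on this page; the function name, the λ grid and the toy thresholds are assumptions. With a loss bounded by 1, the rule picks the smallest λ whose inflated empirical risk, (n/(n+1)) times the empirical risk plus 1/(n+1), is at most α.

```python
import numpy as np

def calibrate_lambda(miss, lambdas, alpha):
    """Smallest grid value of lambda meeting the conformal risk control bound.

    miss:    (n, J) array, miss[i, j] = 1 if Y_i is NOT in C_{lambdas[j]}(X_i).
    lambdas: (J,) increasing grid; each row of miss must be non-increasing.
    Returns float('inf') if no grid value satisfies the bound.
    """
    n = miss.shape[0]
    risk = miss.mean(axis=0)                        # empirical miscoverage per lambda
    ok = (n / (n + 1)) * risk + 1.0 / (n + 1) <= alpha
    if not ok.any():
        return float("inf")
    return float(lambdas[np.argmax(ok)])            # first grid value where ok is True

# Toy calibration set: image i becomes covered once lambda reaches thresholds[i].
thresholds = np.array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85])
lambdas = np.arange(11) / 10
miss = (lambdas[None, :] < thresholds[:, None]).astype(float)
lam_hat = calibrate_lambda(miss, lambdas, alpha=0.25)
lam_none = calibrate_lambda(miss, lambdas, alpha=0.05)
```

The second call shows the case Lemma 1 excludes: with only nine calibration images the 1/(n+1) term alone exceeds α = 0.05, so no finite λ qualifies and the helper returns infinity.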