Redefining Instance Matching: A Unified Framework for Part-Aware Matching in Panoptic Segmentation Evaluation
Pith reviewed 2026-06-28 23:05 UTC · model grok-4.3
The pith
Panoptic Quality stays well-defined for three of the four segment matching strategies, but not for Many-to-Many, once the IoU threshold drops below 0.5
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By recasting segment matching as a constrained bipartite assignment problem and bounding the degrees on the prediction and ground-truth sides independently, four matching strategies arise. The first three are well-defined within the PQ framework while Many-to-Many falls outside it. Central to the framework is a vertex-based accounting of TP, FN, and FP anchored to ground truth and predicted segments rather than to matching edges. The framework extends naturally to part-aware panoptic segmentation.
What carries the argument
Constrained bipartite assignment with independent degree bounds on each side of the graph, which classifies matching strategies and supports vertex-based TP/FN/FP accounting
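To make the degree-bound idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm or the Panoptica API: the function names, the greedy selection (a stand-in for an optimal assignment solver), and the example IoU values are all assumptions made for illustration. Which of the two asymmetric cases the paper calls Many-to-One versus One-to-Many is not stated in this summary, so the sketch does not guess.

```python
from typing import List, Tuple

Edge = Tuple[str, str, float]  # (ground-truth id, prediction id, IoU)


def classify(max_gt_degree: int, max_pred_degree: int) -> str:
    """Name the strategy family implied by the two degree bounds (hypothetical helper)."""
    if max_gt_degree == 1 and max_pred_degree == 1:
        return "One-to-One"
    if max_gt_degree > 1 and max_pred_degree > 1:
        return "Many-to-Many"
    # The direction of the naming convention is left open on purpose.
    return "asymmetric (Many-to-One or One-to-Many)"


def bounded_matching(candidates: List[Edge], threshold: float,
                     max_gt_degree: int, max_pred_degree: int) -> List[Edge]:
    """Greedily keep the highest-IoU edges that respect both degree bounds.

    Assumption: greedy selection by descending IoU; an exact solver may
    pick a different edge set.
    """
    gt_deg, pred_deg = {}, {}
    chosen = []
    for gt, pred, iou in sorted(candidates, key=lambda e: e[2], reverse=True):
        if iou <= threshold:
            break  # edges are sorted, so every remaining edge also fails
        if gt_deg.get(gt, 0) < max_gt_degree and pred_deg.get(pred, 0) < max_pred_degree:
            chosen.append((gt, pred, iou))
            gt_deg[gt] = gt_deg.get(gt, 0) + 1
            pred_deg[pred] = pred_deg.get(pred, 0) + 1
    return chosen


# Made-up overlaps: one ground-truth segment split across two predictions.
candidates = [("g1", "p1", 0.45), ("g1", "p2", 0.40), ("g2", "p2", 0.30)]
for bounds in [(1, 1), (2, 1), (2, 2)]:
    print(classify(*bounds), bounded_matching(candidates, 0.25, *bounds))
```

Raising a bound from 1 only ever admits more edges in this greedy version, which is why the three runs above return two, two, and three edges respectively, with different edge sets for the first two.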
If this is right
- Across configurable case studies, different combinations of thresholds and matching strategies can be compared in practice
- The framework extends naturally to part-aware panoptic segmentation evaluation
- A unified open-source package built on Panoptica implements the strategies and exposes Voronoi-based region-wise analysis and Area Under Threshold Curve computations as configurable options
- These strategies become relevant for fragmented instances, difficult delineations, or noisy annotations
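The Area Under Threshold Curve option mentioned above can be pictured as integrating a metric over a sweep of IoU thresholds. The sketch below is an assumption about how such a number might be computed, using the trapezoidal rule and normalising by the threshold span; the paper's exact definition and the package's function names are not given in this summary, and the scores are made up.

```python
from typing import Sequence


def area_under_threshold_curve(thresholds: Sequence[float],
                               scores: Sequence[float]) -> float:
    """Trapezoidal area under score(threshold), divided by the threshold span.

    Assumption: normalising by the span keeps the result on the same
    scale as the underlying metric.
    """
    if len(thresholds) != len(scores) or len(thresholds) < 2:
        raise ValueError("need at least two (threshold, score) pairs")
    pairs = sorted(zip(thresholds, scores))
    area = 0.0
    for (t0, s0), (t1, s1) in zip(pairs, pairs[1:]):
        area += (t1 - t0) * (s0 + s1) / 2.0
    span = pairs[-1][0] - pairs[0][0]
    if span == 0:
        raise ValueError("thresholds must not all be equal")
    return area / span


# Hypothetical PQ values measured at three IoU thresholds.
print(area_under_threshold_curve([0.1, 0.3, 0.5], [0.9, 0.7, 0.5]))
```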
Where Pith is reading between the lines
- The framework could be tested on standard panoptic benchmarks to quantify how far its scores deviate from the original PQ
- Extending the vertex accounting to other segmentation metrics might unify evaluation across tasks
- The part-aware version opens up evaluation of hierarchical structures in medical imaging datasets
Load-bearing premise
Independently bounding the degrees on the prediction and ground-truth sides of the bipartite graph produces evaluation strategies that remain meaningful and comparable to the original Panoptic Quality definition
What would settle it
Finding a Many-to-Many matching example where the computed Panoptic Quality score behaves identically to the standard One-to-One matching under the same IoU conditions would challenge the claim that it falls outside the framework
Original abstract
The Panoptic Quality (PQ) metric is the standard for jointly evaluating instance and semantic segmentation. However, its original definition relies on a One-to-One matching between predicted and ground truth segments, which is only straightforward when the IoU threshold exceeds 0.5. Below 0.5, multiple matching strategies emerge in a poorly explored problem space. We systematically elucidate this space by recasting segment matching as a constrained bipartite assignment problem. Independently bounding the prediction- and ground-truth-side degrees yields four matching strategies: One-to-One, Many-to-One, One-to-Many, and Many-to-Many. We show that the first three are well-defined within the PQ framework, while Many-to-Many falls outside it. These strategies become relevant when instances are fragmented, adjacent objects are difficult to delineate, or annotations are noisy. Central to our framework is a vertex-based accounting of TP, FN, and FP, anchored to ground truth and predicted segments rather than to matching edges. We further show that the framework extends naturally to part-aware panoptic segmentation, and we explore part-aware evaluation on biomedical data. Across configurable case studies we report how different combinations of thresholds and matching strategies behave in practice. We release a unified open-source package built on Panoptica. It exposes Voronoi-based region-wise analysis, part-aware evaluation, and Area Under Threshold Curve computations as configurable options.
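The abstract's remark that One-to-One matching is only straightforward above an IoU threshold of 0.5 rests on a short counting argument from the original panoptic segmentation paper. A sketch, assuming predicted segments are pairwise disjoint (as panoptic outputs are) and writing $p_1, p_2$ for two distinct predictions and $g$ for a ground-truth segment:

```latex
\begin{align*}
\mathrm{IoU}(p, g) = \frac{|p \cap g|}{|p \cup g|} > \tfrac{1}{2}
  &\;\Longrightarrow\; |p \cap g| > \tfrac{1}{2}\,|p \cup g| \;\ge\; \tfrac{1}{2}\,|g| \\
\text{If both } p_1 \text{ and } p_2 \text{ satisfied this:}\quad
  |p_1 \cap g| + |p_2 \cap g| &> |g| \\
\text{But } p_1 \cap p_2 = \emptyset \text{ gives:}\quad
  |p_1 \cap g| + |p_2 \cap g| &\le |g|
\end{align*}
```

The two inequalities contradict each other, so at most one prediction can exceed IoU 0.5 with a given ground-truth segment, and the symmetric argument bounds the ground-truth side. At thresholds of 0.5 or below this guarantee is lost, which is where the choice of matching strategy starts to matter.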
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper recasts panoptic segment matching as a constrained bipartite assignment problem whose four strategies (One-to-One, Many-to-One, One-to-Many, Many-to-Many) arise from independent degree bounds on the prediction and ground-truth partitions. It asserts that the first three remain well-defined inside the original PQ framework via a vertex-based (segment-anchored) counting of TP/FN/FP, while Many-to-Many does not; the framework is extended to part-aware panoptic segmentation and demonstrated on biomedical data with configurable thresholds and an open-source implementation on Panoptica.
Significance. A rigorously derived set of matching strategies that reduce exactly to classical PQ when both degree bounds equal 1, together with an invariance proof for SQ and RQ, would supply a principled tool for evaluating fragmented or noisy instances. The release of a configurable open-source package exposing Voronoi-based analysis, part-aware metrics, and Area Under Threshold Curve computations is a concrete strength that lowers the barrier to adoption.
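For reference, classical PQ factors into Segmentation Quality and Recognition Quality: SQ is the mean IoU over true positives, RQ is TP / (TP + FP/2 + FN/2), and PQ = SQ × RQ. The sketch below computes all three from a list of matched IoUs and the two error counts. The function name and the zero-denominator convention are assumptions for this example, not the package's API.

```python
from typing import Sequence, Tuple


def panoptic_quality(matched_ious: Sequence[float],
                     num_fp: int, num_fn: int) -> Tuple[float, float, float]:
    """Return (PQ, SQ, RQ) for one class.

    matched_ious holds one IoU per true positive. Returning zeros when
    there are no segments at all is a convention chosen for this sketch.
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / tp if tp else 0.0  # average IoU of matches
    rq = tp / denom                             # F1-style detection score
    return sq * rq, sq, rq


# Two matches with IoU 0.8 and 0.6, one false positive, one false negative.
pq, sq, rq = panoptic_quality([0.8, 0.6], num_fp=1, num_fn=1)
print(f"PQ={pq:.4f} SQ={sq:.4f} RQ={rq:.4f}")
```

With every matched IoU in [0, 1] and non-negative counts, both factors stay in [0, 1] here; whether that still holds once a degree bound exceeds 1 is exactly what the referee asks the authors to prove.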
major comments (1)
- [Abstract / central claim] The central claim that the first three strategies are 'well-defined within the PQ framework' is load-bearing yet unsupported by any explicit reduction: no derivation shows that the vertex-based TP/FN/FP rules recover the classical PQ formulas exactly when both degree bounds are set to 1, nor that SQ and RQ retain their original interpretation, monotonicity, and [0,1] range once a bound exceeds 1 (abstract, paragraph on recasting as constrained assignment).
Simulated Author's Rebuttal
We thank the referee for the constructive review. The major comment correctly identifies that the central claim requires an explicit supporting derivation, which we will add in revision.
- Referee: [Abstract / central claim] The central claim that the first three strategies are 'well-defined within the PQ framework' is load-bearing yet unsupported by any explicit reduction: no derivation shows that the vertex-based TP/FN/FP rules recover the classical PQ formulas exactly when both degree bounds are set to 1, nor that SQ and RQ retain their original interpretation, monotonicity, and [0,1] range once a bound exceeds 1 (abstract, paragraph on recasting as constrained assignment).
Authors: We agree that an explicit derivation is required. In the revised manuscript we will insert a new subsection (under Methods) containing: (i) a formal proof that the vertex-based TP/FN/FP accounting reduces exactly to the classical edge-based PQ formulas when both degree bounds equal 1; (ii) proofs that SQ and RQ preserve their original semantic interpretations, monotonicity with respect to matching quality, and the [0,1] range for the three bounded strategies; and (iii) a concise counter-example showing why Many-to-Many violates these properties. The abstract and introductory paragraph will be updated to reference this new subsection.
Revision: yes
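The reduction in item (i) can be sanity-checked numerically before it is proved. The sketch below compares a vertex-based count with the classical edge-based count. It assumes that vertex-based TP counts matched ground-truth segments, FN counts unmatched ground-truth segments, and FP counts unmatched predictions; the paper's precise rules are not reproduced in this summary, so treat these definitions and the function names as hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (ground-truth id, prediction id)


def vertex_counts(gt_ids: List[str], pred_ids: List[str],
                  edges: List[Pair]) -> Tuple[int, int, int]:
    """(TP, FN, FP) anchored to segments (assumed definitions, see lead-in)."""
    matched_gt = {g for g, _ in edges}
    matched_pred = {p for _, p in edges}
    tp = len(matched_gt)
    fn = len(set(gt_ids) - matched_gt)
    fp = len(set(pred_ids) - matched_pred)
    return tp, fn, fp


def edge_counts(gt_ids: List[str], pred_ids: List[str],
                edges: List[Pair]) -> Tuple[int, int, int]:
    """Classical counts: one TP per matched pair, the remainder are errors."""
    tp = len(edges)
    return tp, len(gt_ids) - tp, len(pred_ids) - tp


gts, preds = ["g1", "g2"], ["p1", "p2"]

# Both degree bounds equal 1: every edge touches one fresh vertex per side.
one_to_one = [("g1", "p1"), ("g2", "p2")]
print(vertex_counts(gts, preds, one_to_one), edge_counts(gts, preds, one_to_one))

# Two predictions attached to g1: the edge-based count hides the missed g2.
fragmented = [("g1", "p1"), ("g1", "p2")]
print(vertex_counts(gts, preds, fragmented), edge_counts(gts, preds, fragmented))
```

When every vertex has degree at most 1, the number of matched ground-truth segments, the number of matched predictions, and the number of edges coincide, so the two countings agree. With a bound above 1 they diverge, as the second case shows.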
Circularity Check
No significant circularity; framework extends standard bipartite matching without self-referential reduction.
full rationale
The paper recasts panoptic matching as a constrained bipartite assignment problem whose four strategies follow directly from independent degree bounds on the two partitions. The vertex-based TP/FN/FP accounting is introduced as a modeling choice anchored to segments, not derived from or fitted to the same quantities it evaluates. No equations reduce a reported metric to a fitted parameter or prior result by construction, and no load-bearing claim rests on a self-citation chain. The assertion that the first three strategies remain inside the PQ framework is presented as a modeling consequence rather than an unproven invariance that collapses to the input definitions. The derivation is therefore self-contained against external benchmarks (standard assignment problem and original PQ definition) and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
free parameters (2)
- IoU threshold
- Degree bounds on prediction and ground-truth sides
axioms (1)
- domain assumption: Bipartite assignment with degree constraints yields well-defined matching strategies that can be compared to the original PQ definition
invented entities (1)
- Vertex-based accounting of TP, FN, FP (no independent evidence)
Appendix A Voronoi Matching: Formal Construction
Let Ω denote the image domain and d(x, g) the (Euclidean) distance from voxel x ∈ Ω to the nearest voxel of ground truth segment g ∈ G. The Voronoi cell of g is
V(g) = {x ∈ Ω : d(x, g) ≤ d(x, g′) ∀ g′ ∈ G}, (14)
with ties broken arbitra...
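The Voronoi cell definition in Appendix A can be illustrated with a brute-force sketch on a tiny 2D grid. This is not the package's implementation: the function name, the dictionary-based input format, and the rule that ties go to the smallest label are assumptions chosen for this example, and a real implementation would use a distance transform instead of nested loops.

```python
import math
from typing import Dict, List, Set, Tuple

Voxel = Tuple[int, int]


def voronoi_cells(shape: Tuple[int, int],
                  segments: Dict[int, List[Voxel]]) -> Dict[int, Set[Voxel]]:
    """Assign every grid position to the ground-truth segment nearest to it.

    d(x, g) is the Euclidean distance from x to the closest voxel of g.
    Ties go to the smallest label (a choice made for this sketch).
    """
    rows, cols = shape
    cells: Dict[int, Set[Voxel]] = {label: set() for label in segments}
    for r in range(rows):
        for c in range(cols):
            best_label, best_dist = None, math.inf
            for label in sorted(segments):
                dist = min(math.hypot(r - vr, c - vc) for vr, vc in segments[label])
                if dist < best_dist:  # strict: earlier label wins a tie
                    best_label, best_dist = label, dist
            cells[best_label].add((r, c))
    return cells


# Two single-voxel ground-truth segments at the ends of a 1 x 5 strip.
cells = voronoi_cells((1, 5), {1: [(0, 0)], 2: [(0, 4)]})
print(sorted(cells[1]), sorted(cells[2]))
```

Once ties are resolved, the cells partition the image domain, so each predicted voxel falls into exactly one ground-truth region for region-wise analysis.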