Reverse Classification Accuracy: Predicting Segmentation Performance in the Absence of Ground Truth

arXiv: 1702.03407 · v1 · pith:5NQ6T4RD · submitted 2017-02-11 · cs.CV

Vanya V. Valindria, Ioannis Lavdas, Wenjia Bai, Konstantinos Kamnitsas, Eric O. Aboagye, Andrea G. Rockall, Daniel Rueckert, Ben Glocker

classification: cs.CV

keywords: segmentation, performance, accuracy, data, ground, reference, reverse, truth

Abstract

When integrating computational tools such as automatic segmentation into clinical practice, it is of utmost importance to be able to assess the level of accuracy on new data and, in particular, to detect when an automatic method fails. However, this is difficult to achieve due to the absence of ground truth. Segmentation accuracy on clinical data might differ from what is found through cross-validation because validation data is often used during incremental method development, which can lead to overfitting and unrealistic performance expectations. Before deployment, performance is quantified using different metrics, for which the predicted segmentation is compared to a reference segmentation, often obtained manually by an expert. But little is known about the real performance after deployment, when a reference is unavailable. In this paper, we introduce the concept of reverse classification accuracy (RCA) as a framework for predicting the performance of a segmentation method on new data. In RCA, we take the predicted segmentation from a new image to train a reverse classifier, which is evaluated on a set of reference images with available ground truth. The hypothesis is that if the predicted segmentation is of good quality, then the reverse classifier will perform well on at least some of the reference images. We validate our approach on multi-organ segmentation with different classifiers and segmentation methods. Our results indicate that it is indeed possible to predict the quality of individual segmentations in the absence of ground truth. Thus, RCA is ideal for integration into automatic processing pipelines in clinical routine and as part of large-scale image analysis studies.
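The RCA procedure described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not taken from the paper: the reverse classifier is a toy nearest-class-mean intensity model, the overlap metric is the Dice coefficient, the data are synthetic squares, and all function names (`train_reverse_classifier`, `rca_score`, `make_case`) are hypothetical. Taking the maximum score over the reference set is one possible reading of the hypothesis that a good segmentation lets the reverse classifier perform well on at least some reference images.

```python
import numpy as np


def dice(pred, truth):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total


def train_reverse_classifier(image, predicted_mask):
    """Toy stand-in classifier: mean intensity inside and outside the mask."""
    mask = predicted_mask.astype(bool)
    if mask.all() or not mask.any():
        raise ValueError("predicted mask must contain both classes")
    return float(image[mask].mean()), float(image[~mask].mean())


def apply_classifier(model, image):
    """Label a pixel as foreground if it is closer to the foreground mean."""
    fg_mean, bg_mean = model
    return np.abs(image - fg_mean) < np.abs(image - bg_mean)


def rca_score(new_image, predicted_mask, ref_images, ref_masks):
    """Train on (new image, predicted mask), evaluate on references with ground truth."""
    model = train_reverse_classifier(new_image, predicted_mask)
    scores = [dice(apply_classifier(model, img), gt)
              for img, gt in zip(ref_images, ref_masks)]
    # Assumption: the best reference score serves as the quality proxy.
    return max(scores), scores


def make_case(rng, top, left, size=10, shape=(32, 32)):
    """Synthetic image: bright square on a dark background plus mild noise."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + size, left:left + size] = True
    image = mask.astype(float) + rng.normal(0.0, 0.05, shape)
    return image, mask


rng = np.random.default_rng(0)
refs = [make_case(rng, t, l) for t, l in [(2, 2), (12, 18), (20, 5)]]
ref_images = [img for img, _ in refs]
ref_masks = [m for _, m in refs]

new_image, true_mask = make_case(rng, 10, 10)
bad_mask = np.zeros_like(true_mask)
bad_mask[0:8, 0:8] = True  # a failed segmentation that misses the object

good_score, _ = rca_score(new_image, true_mask, ref_images, ref_masks)
bad_score, _ = rca_score(new_image, bad_mask, ref_images, ref_masks)
print(f"good prediction: {good_score:.2f}, bad prediction: {bad_score:.2f}")
```

In this toy setting, a prediction that matches the object yields a reverse classifier that transfers well to the reference images, while a prediction that misses the object yields one that does not. No ground truth for the new image is used when computing either score.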

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

In search of truth: Evaluating concordance of AI-based anatomy segmentation models
eess.IV · 2025-12 · unverdicted · novelty 4.0

A harmonization framework enables comparison of six AI segmentation models on 31 structures in NLST CT scans, revealing strong agreement for lungs but invalid outputs for some vertebrae and ribs.