
Recoverable Identifier

arXiv:2605.03144 · detector doi_compliance · incontrovertible · 2026-05-19 15:39:17.314843+00:00

advisory doi_compliance recoverable_identifier

DOI in the printed bibliography is fragmented by whitespace or line breaks. A longer candidate (10.14428/esann/2025.ES2025-39.13) was visible in the surrounding text but could not be confirmed against doi.org as printed.
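As a rough illustration of how a fragmented DOI like this might be rejoined, the sketch below scans printed text for a DOI prefix and appends the following whitespace-separated fragments for as long as the candidate ends in a dangling separator. This is not the detector's actual method: the function name `reconstruct_doi`, the prefix pattern, and the "ends in `/`, `.` or `-`" rule are all assumptions made for this example.

```python
import re

# Assumption: a DOI starts with "10.", a 4-9 digit registrant code, then "/".
DOI_START = re.compile(r"10\.\d{4,9}/\S*")


def reconstruct_doi(text):
    """Return a candidate DOI rejoined across whitespace, or None if none is found.

    Hypothetical helper for illustration only.
    """
    match = DOI_START.search(text)
    if match is None:
        return None
    candidate = match.group(0)
    for token in text[match.end():].split():
        # Heuristic (assumed): a fragment ending in "/", "." or "-" was
        # probably cut mid-identifier, so the next token is appended to it.
        if candidate[-1] not in "/.-":
            break
        candidate += token
    return candidate


printed = "doi: https://doi.org/10.14428/esann/ 2025.ES2025-39. 13 A Technical appendices"
print(reconstruct_doi(printed))
```

The heuristic can over-join: a complete DOI followed by a sentence-ending period would absorb the next word. A rejoined string is therefore only a candidate until it resolves.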


Evidence text

Adrien Foucart, Arthur Elskens, and Christine Decaestecker. Ranking the scores of algorithms with confidence. InEuropean Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, pages 431–436, 2025. doi: https://doi.org/10.14428/esann/ 2025.ES2025-39. 13 A Technical appendices and supplementary material A.1 Literature review on evaluation pipeline limitations A variety of metrics have been used in the literature across different studies and challenges related to nuclear instance segmentation (and classification). Some metrics, such as the Dice score [15] or the Jaccard index [34], were originally designed for semantic segmentation [12]. Although these metrics are also commonly reported for nuclear instance segmentation, they are typically used as secondary metrics, as they do not adequately capture instance-level performance, particularly the ability of a model to separate overlapping nuclei [ 4]. In some studies and challenges, such as the PUMA challenge [27] or nuclear instance segmentation benchmarking for immunofluorescence images [35], although the task is still based on nuclear instance segmentation, detection-based scores such as instance-level F1-scores are also reported. Despite the variety of metrics employed, the panoptic quality (PQ) [14] and aggregated Jaccard index (AJI) [4] have emerged as the most widely adopted primary metrics for nuclear instance segmentation, as they jointly evaluate both the detection and segmentation quali

Evidence payload

{
  "printed_excerpt": "Adrien Foucart, Arthur Elskens, and Christine Decaestecker. Ranking the scores of algorithms with confidence. InEuropean Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, pages 431\u2013436, 2025. doi: htt",
  "reconstructed_doi": "10.14428/esann/2025.ES2025-39.13",
  "ref_index": 39,
  "resolved_title": null,
  "verdict_class": "incontrovertible"
}
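A consumer of this payload might read it roughly as follows. This is a minimal sketch, and two things in it are assumptions rather than documented behaviour: that a null `resolved_title` means the candidate was not confirmed, and that the resolver link is formed by prefixing `https://doi.org/`. The helper names `doi_url` and `is_confirmed` are hypothetical, and the payload is abbreviated to the fields used.

```python
import json

# Abbreviated copy of the payload above (printed_excerpt omitted for brevity).
payload = json.loads("""
{
  "reconstructed_doi": "10.14428/esann/2025.ES2025-39.13",
  "ref_index": 39,
  "resolved_title": null,
  "verdict_class": "incontrovertible"
}
""")


def doi_url(record):
    """Build a resolver link for the reconstructed DOI (assumed URL form)."""
    return "https://doi.org/" + record["reconstructed_doi"]


def is_confirmed(record):
    """Assumption: a non-null resolved_title means the DOI resolved."""
    return record["resolved_title"] is not None


print(doi_url(payload), is_confirmed(payload))
```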