arxiv: 1901.04056 · v2 · pith:ZNL4WNJ3new · submitted 2019-01-13 · 💻 cs.CV

The Liver Tumor Segmentation Benchmark (LiTS)

Patrick Bilic , Patrick Christ , Hongwei Bran Li , Eugene Vorontsov , Avi Ben-Cohen , Georgios Kaissis , Adi Szeskin , Colin Jacobs
and 101 more authors
keywords: liver, segmentation, tumor, miccai, best, achieved, algorithms, benchmark

In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset, created in collaboration with seven hospitals and research institutions, is diverse: it contains primary and secondary tumors of varied sizes and appearances, with various lesion-to-background contrast levels (hyper-/hypo-dense). Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and tested on 70 unseen test images acquired from different patients. We found that no single algorithm performed best for both liver and liver tumors across the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas for tumor segmentation the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed an additional analysis of liver tumor detection, which revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection methods achieved lesion-wise recalls of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks in http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via www.lits-challenge.com.
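
The abstract reports two kinds of metric: the Dice score for segmentation overlap and lesion-wise recall for detection. As a rough illustration of what these numbers measure, here is a minimal Python sketch. It is not the benchmark's evaluation code. The names `dice_score`, `lesion_recall`, and `min_overlap` are hypothetical, masks are represented as sets of voxel coordinates for simplicity, and the rule used to decide whether a lesion counts as detected (any overlap above a threshold) is an assumption; the paper's exact detection criterion may differ.

```python
from typing import Sequence, Set, Tuple

Voxel = Tuple[int, int, int]  # (z, y, x) index of a foreground voxel


def dice_score(pred: Set[Voxel], truth: Set[Voxel]) -> float:
    """Dice = 2|P & T| / (|P| + |T|) for two binary masks.

    Returning 1.0 when both masks are empty is a convention assumed here.
    """
    if not pred and not truth:
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def lesion_recall(
    pred: Set[Voxel],
    lesions: Sequence[Set[Voxel]],
    min_overlap: float = 0.0,
) -> float:
    """Fraction of reference lesions that the prediction detects.

    Assumed criterion: a lesion is detected when the fraction of its voxels
    covered by the prediction is strictly greater than min_overlap.
    """
    if not lesions:
        raise ValueError("at least one reference lesion is required")
    detected = sum(
        1
        for lesion in lesions
        if lesion and len(lesion & pred) / len(lesion) > min_overlap
    )
    return detected / len(lesions)


if __name__ == "__main__":
    # Toy case: two reference lesions, one of which is partly segmented.
    lesion_a = {(0, 0, 0), (0, 0, 1)}
    lesion_b = {(5, 5, 5)}
    prediction = {(0, 0, 1), (0, 0, 2)}

    print(dice_score(prediction, lesion_a | lesion_b))   # 2*1 / (2+3) = 0.4
    print(lesion_recall(prediction, [lesion_a, lesion_b]))  # 1 of 2 = 0.5
```

In the toy case the voxel-level Dice is 0.4 while the lesion-wise recall is 0.5, which shows that the two metrics can rank the same prediction differently. This is consistent with the abstract's observation that strong segmentation scores do not guarantee strong detection.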


Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DeepTumorVQA: A Hierarchical 3D CT Benchmark for Stage-Wise Evaluation of Medical VLMs and Tool-Augmented Agents

    cs.CV 2026-05 accept novelty 8.0

    DeepTumorVQA is a new stage-wise 3D CT VQA benchmark showing that quantitative measurement is the main failure point for current medical VLMs and that tool augmentation substantially improves later reasoning stages.

  2. Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks

    cs.CV 2025-09 unverdicted novelty 7.0

    Neural-MedBench reveals sharp performance drops in state-of-the-art VLMs on reasoning-intensive neurology tasks compared to conventional classification benchmarks, with reasoning failures dominating errors.

  3. BenchX: Benchmarking AI Models for Cancer Detection and Localization with Demographic and Protocol Biases

    cs.CV 2026-06 unverdicted novelty 6.0

    BenchX supplies an 85k-scan benchmark that exposes poor performance of 12 tumor-detection models on underrepresented demographic and protocol subgroups.

  4. RadThinking: A Dataset for Longitudinal Clinical Reasoning in Radiology

    cs.CV 2026-05 unverdicted novelty 6.0

    RadThinking releases a large longitudinal CT VQA dataset stratified into foundation perception questions, single-rule reasoning questions, and compositional multi-step chains grounded in clinical reporting standards f...

  5. A Deep Regression Model for Seed Identification in Prostate Brachytherapy

    eess.IV 2019-06 unverdicted novelty 5.0

    A 3D deep regression model detects 94.1% of 2286 seeds across 30 test patients and improves 16% over commercial software on clinical CT data.

  6. MAE-SAM2: Mask Autoencoder-Enhanced SAM2 for Clinical Retinal Vascular Leakage Segmentation

    q-bio.TO 2025-09 unverdicted novelty 4.0

    MAE-SAM2 integrates MAE self-supervised learning with SAM2 to achieve superior segmentation of retinal vascular leakage on fluorescein angiography images, with highest Dice/IoU scores and 5% improvement over original SAM2.