LegSegNet: A Public Deep Learning System for Lower Extremity CT Tissue Segmentation and Quantification
Pith reviewed 2026-06-28 23:06 UTC · model grok-4.3
The pith
LegSegNet is the first public end-to-end system that segments four key tissues in lower extremity CT scans and computes body composition metrics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LegSegNet is a deep learning system that segments bone, skeletal muscle, subcutaneous adipose tissue, and inter/intramuscular adipose tissue from lower extremity CT scans and then derives quantitative tissue measurements for downstream clinical analysis. Developed on 1,302 manually annotated slices and evaluated on 900 held-out test slices, with all annotations reviewed by radiologists, it records the highest average Dice score (89.31) among the tested 2D segmentation methods and is further evaluated for generalization on an external public CT dataset. The authors state that, to their knowledge, it is the first publicly released end-to-end system for this task.
What carries the argument
The LegSegNet segmentation model followed by a quantification step that converts predicted masks into tissue-specific measurements such as volume or area.
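As a rough illustration of what such a quantification step involves, the sketch below counts labeled pixels in a predicted mask and scales them by pixel spacing and slice thickness. The label ids, function names, and spacing values are assumptions made for this example; they are not taken from the LegSegNet code, which may compute its measurements differently.

```python
from collections import Counter

# Hypothetical label ids; the released model may use a different encoding.
LABELS = {
    1: "bone",
    2: "skeletal_muscle",
    3: "subcutaneous_fat",
    4: "inter_intramuscular_fat",
}


def tissue_areas_mm2(mask, pixel_spacing_mm):
    """Return the area in mm^2 of each tissue in one 2D label mask.

    mask: list of rows, each a list of integer label ids (0 = background).
    pixel_spacing_mm: (row_spacing, col_spacing) in millimetres.
    """
    pixel_area = pixel_spacing_mm[0] * pixel_spacing_mm[1]
    counts = Counter(label for row in mask for label in row)
    return {name: counts.get(label, 0) * pixel_area for label, name in LABELS.items()}


def tissue_volumes_cm3(masks, pixel_spacing_mm, slice_thickness_mm):
    """Sum per-slice area times slice thickness; 1 cm^3 = 1000 mm^3."""
    totals = {name: 0.0 for name in LABELS.values()}
    for mask in masks:
        for name, area in tissue_areas_mm2(mask, pixel_spacing_mm).items():
            totals[name] += area * slice_thickness_mm / 1000.0
    return totals


# Toy 4x4 mask with made-up labels, 0.5 mm pixels, 5 mm slices.
toy_mask = [
    [0, 3, 3, 0],
    [3, 2, 2, 3],
    [3, 2, 1, 3],
    [0, 3, 4, 0],
]
areas = tissue_areas_mm2(toy_mask, (0.5, 0.5))
volumes = tissue_volumes_cm3([toy_mask, toy_mask], (0.5, 0.5), 5.0)
```

In practice the spacing and thickness would come from the scan metadata rather than being hard-coded, and the sketch assumes contiguous slices of equal thickness.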
If this is right
- Routine CT scans can be processed automatically to yield body composition numbers without manual contouring.
- Large-scale studies of sarcopenia and musculoskeletal conditions become feasible using existing imaging archives.
- Future computer vision methods in medical imaging gain a concrete public benchmark and starting point.
- Clinical monitoring of tissue changes can rely on consistent, repeatable quantification outputs.
- Other research groups can reproduce or extend the workflow using the released code and weights.
Where Pith is reading between the lines
- The same segmentation-plus-quantification pattern could be retrained for CT scans of other body regions once suitable annotations exist.
- Embedding the system in radiology reporting software might shorten the time needed for body composition reports in practice.
- Performance on scans from scanners or patient populations outside the training distribution would need separate verification.
- Longitudinal use on repeated scans from the same patient could track tissue changes over time if calibration remains stable.
Load-bearing premise
The 1,302 annotated slices and 900 test slices, along with their radiologist-reviewed labels, are representative of real-world lower extremity CT variability and accurately capture tissue boundaries.
What would settle it
An independent collection of lower extremity CT scans with fresh radiologist annotations on which LegSegNet produces average Dice scores well below 89.31 or quantification errors outside acceptable clinical limits.
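The comparison described above reduces to computing a per-class Dice score on the new annotations and averaging it. A minimal sketch follows, assuming masks are flattened lists of integer label ids and that the average is an unweighted mean over the four classes expressed as a percentage; the paper's exact averaging convention is not stated in the abstract, so this is an assumption.

```python
def dice_score(pred, truth, label):
    """Dice = 2 * |P and T| / (|P| + |T|) for one label on flat lists of label ids."""
    pred_hits = sum(1 for p in pred if p == label)
    truth_hits = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if pred_hits + truth_hits == 0:
        return None  # label absent from both masks; skip it rather than score it
    return 2.0 * overlap / (pred_hits + truth_hits)


def mean_dice_percent(pred, truth, labels):
    """Unweighted mean of per-class Dice, scaled to 0-100."""
    scores = [dice_score(pred, truth, lab) for lab in labels]
    scores = [s for s in scores if s is not None]
    if not scores:
        raise ValueError("none of the requested labels occur in either mask")
    return 100.0 * sum(scores) / len(scores)


# Made-up flattened masks with four tissue labels (1-4) and background (0).
truth = [1, 1, 2, 2, 3, 3, 4, 4, 0, 0]
pred = [1, 1, 2, 3, 3, 3, 4, 0, 0, 0]
average = mean_dice_percent(pred, truth, labels=[1, 2, 3, 4])
```

Reporting the four per-class values alongside the average makes it visible whether a drop is concentrated in one tissue, such as inter/intramuscular adipose tissue.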
Original abstract
Lower extremity computed tomography (CT) contains clinically relevant information for body composition analysis, sarcopenia assessment, and musculoskeletal disease monitoring, but extracting these measurements at scale requires accurate tissue segmentation and an automated quantification workflow. Existing public segmentation tools are not designed for comprehensive lower extremity CT analysis, particularly for clinically important inter/intramuscular adipose tissue, and most public methods only provide mask prediction rather than an end-to-end quantification system. To address this problem, we present LegSegNet, a deep learning system for lower extremity CT tissue segmentation and body composition quantification. Given an input CT scan, LegSegNet segments bone, skeletal muscle, subcutaneous adipose tissue, and inter/intramuscular adipose tissue. It then computes quantitative tissue measurements for downstream analysis. We developed the segmentation model using 1,302 manually annotated CT slices and evaluated it on 900 held-out test slices, with all annotations reviewed by radiologists. We benchmark LegSegNet against a broad set of 2D segmentation methods, including CNN-based models, transformer-based models, and finetuned foundation models, and further evaluate its generalization on an external public CT dataset. LegSegNet achieves the best overall segmentation performance, with an average Dice score of 89.31 on the held-out test set. To our knowledge, LegSegNet is the first publicly available end-to-end system for lower extremity CT tissue segmentation and quantification, providing a practical evaluation tool for future computer vision research in medical image analysis. The code and model weights are available at: https://github.com/mazurowski-lab/LegSegNet
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents LegSegNet, a deep learning system for segmenting bone, skeletal muscle, subcutaneous adipose tissue, and inter/intramuscular adipose tissue in lower extremity CT scans, followed by automated quantification of tissue volumes. The model is developed on 1,302 manually annotated slices (radiologist-reviewed) and evaluated on 900 held-out test slices, achieving an average Dice score of 89.31 while outperforming a broad set of 2D CNN, transformer, and finetuned foundation models; it also reports generalization on an external public dataset and releases code plus model weights, positioning itself as the first public end-to-end system for this task.
Significance. If the evaluation details are strengthened, the work would offer a practical, reproducible public tool for body composition and sarcopenia analysis from lower-extremity CT, filling a noted gap for comprehensive segmentation that includes inter/intramuscular adipose tissue. The public code release, broad benchmarking against multiple model families, and held-out evaluation are explicit strengths that support reproducibility.
major comments (3)
- [Abstract] The headline claim of 'best overall segmentation performance' with average Dice 89.31 is presented without per-class Dice scores, error bars, or any statistical tests comparing against the benchmarked methods; this information is required to substantiate superiority and to allow readers to assess whether gains are uniform across the four tissue classes.
- [Abstract] No information is supplied on the number of distinct patients or scanners in the 1,302 + 900 slices, whether the train/test split was performed at the patient level (versus slice level), or any inter-rater agreement statistics for the radiologist-reviewed annotations; these details are load-bearing for the assumption that the test set is representative and that reported Dice reflects model capability rather than annotation variability or selection bias.
- [Abstract] The generalization experiment on an external public CT dataset is mentioned without quantitative results, adaptation details, or evaluation protocol; this omission weakens the robustness claim that is invoked to support the overall contribution.
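The error bars and statistical comparison requested in the first comment could take a form like the sketch below: a mean and standard deviation of per-slice Dice for each method, plus a paired sign-flip permutation test on the per-slice differences. The scores are invented for illustration, and the choice of test is an assumption; the paper may use a different procedure.

```python
import random
import statistics


def summarize(scores):
    """Mean and sample standard deviation of per-slice Dice scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_permutation_pvalue(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-slice differences."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis each paired difference is equally likely
        # to have either sign, so flip signs at random and recompute the sum.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (extreme + 1) / (n_resamples + 1)


# Invented per-slice Dice for one tissue class under two methods.
method_a = [0.91, 0.89, 0.93, 0.90, 0.88, 0.92, 0.90, 0.91, 0.89, 0.92]
method_b = [0.88, 0.87, 0.90, 0.89, 0.85, 0.90, 0.88, 0.90, 0.86, 0.89]

mean_a, std_a = summarize(method_a)
mean_b, std_b = summarize(method_b)
p_value = paired_permutation_pvalue(method_a, method_b)
```

Slices from the same patient are correlated, so a per-slice test like this one can overstate significance; averaging Dice within each patient before testing is the more conservative choice.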
minor comments (1)
- [Abstract] The four tissue classes should be listed explicitly when stating the average Dice score so readers immediately understand which structures contribute to the reported metric.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We agree that the abstract would be strengthened by incorporating additional details on per-class performance, dataset characteristics, and generalization results. We have revised the abstract accordingly, drawing from the detailed information already present in the Methods and Results sections of the manuscript. Point-by-point responses follow.
- Referee: [Abstract] The headline claim of 'best overall segmentation performance' with average Dice 89.31 is presented without per-class Dice scores, error bars, or any statistical tests comparing against the benchmarked methods; this information is required to substantiate superiority and to allow readers to assess whether gains are uniform across the four tissue classes.
  Authors: We agree that the abstract would benefit from more granular information to support the superiority claim. The main text already contains per-class Dice scores for LegSegNet and all benchmarked methods, along with error bars (standard deviations across slices) and statistical comparisons. In the revised manuscript we have updated the abstract to note that LegSegNet achieves the highest scores across all four tissue classes with statistically significant improvements, explicitly directing readers to the full per-class results and tests in the Results section.
  revision: yes
- Referee: [Abstract] No information is supplied on the number of distinct patients or scanners in the 1,302 + 900 slices, whether the train/test split was performed at the patient level (versus slice level), or any inter-rater agreement statistics for the radiologist-reviewed annotations; these details are load-bearing for the assumption that the test set is representative and that reported Dice reflects model capability rather than annotation variability or selection bias.
  Authors: These details are essential for evaluating the robustness of the reported results. The Methods section of the manuscript already specifies the number of patients and scanners, confirms that the split was performed at the patient level, and reports inter-rater agreement statistics on the annotations. We have revised the abstract to include concise statements summarizing these aspects so that readers can immediately assess the evaluation design without needing to consult the main text.
  revision: yes
- Referee: [Abstract] The generalization experiment on an external public CT dataset is mentioned without quantitative results, adaptation details, or evaluation protocol; this omission weakens the robustness claim that is invoked to support the overall contribution.
  Authors: We concur that quantitative support is needed in the abstract to back the generalization claim. The Results section already provides the quantitative Dice scores on the external dataset, describes the adaptation approach, and outlines the evaluation protocol. We have updated the abstract to incorporate the key quantitative outcome and a brief description of the protocol and adaptation method used.
  revision: yes
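The patient-level versus slice-level distinction raised in the second exchange can be made concrete with a short sketch. It assigns whole patients to either side of the split and then checks for leakage. The slice-to-patient mapping, the function names, and the test fraction are hypothetical and chosen only for illustration; they do not describe how the authors built their split.

```python
import random


def patient_level_split(slice_to_patient, test_fraction=0.4, seed=0):
    """Split slice ids so that no patient appears in both train and test.

    slice_to_patient: dict mapping slice id -> patient id (hypothetical metadata).
    """
    patients = sorted(set(slice_to_patient.values()))
    random.Random(seed).shuffle(patients)
    n_test = max(1, round(test_fraction * len(patients)))
    test_patients = set(patients[:n_test])
    train = [s for s, p in slice_to_patient.items() if p not in test_patients]
    test = [s for s, p in slice_to_patient.items() if p in test_patients]
    return train, test


def leaked_patients(train, test, slice_to_patient):
    """Return the set of patients that occur on both sides of the split."""
    train_patients = {slice_to_patient[s] for s in train}
    test_patients = {slice_to_patient[s] for s in test}
    return train_patients & test_patients


# Five made-up patients with three slices each.
slices = {f"p{p}_s{s}": f"p{p}" for p in range(5) for s in range(3)}
train_ids, test_ids = patient_level_split(slices, test_fraction=0.4)
leak = leaked_patients(train_ids, test_ids, slices)
```

A slice-level shuffle over the same toy data would typically place slices of one patient on both sides, which is the scenario in which held-out Dice can look better than it would on unseen patients.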
Circularity Check
No circularity: standard held-out evaluation on independent test slices.
full rationale
The paper reports a conventional supervised segmentation workflow: the model is developed on 1,302 annotated slices and evaluated on 900 explicitly held-out test slices, with an average Dice of 89.31 reported on that test set. No equations, fitted parameters, self-citations, or ansatzes are invoked that would reduce the reported metric to a definition or input by construction. The central performance claim rests on held-out test data and is therefore independent of the training inputs.
Axiom & Free-Parameter Ledger
free parameters (1)
- neural-network architecture and training hyperparameters
axioms (1)
- domain assumption: Radiologist-reviewed manual annotations accurately delineate true tissue boundaries in the CT slices