pith. machine review for the scientific record.

arxiv: 2605.05933 · v1 · submitted 2026-05-07 · 💻 cs.CV

Whole-body CT attenuation and volume charts from routine clinical scans via evidence-grounded LLM report filtering

Pith reviewed 2026-05-09 15:44 UTC · model grok-4.3

classification 💻 cs.CV
keywords whole-body CT · reference charts · organ volume · tissue attenuation · LLM filtering · radiology reports · healthy cohorts · clinical data mining

The pith

Large language models filter pathological findings from radiology reports to create large healthy CT reference cohorts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an ensemble of large language models that read radiology reports, flag potential abnormalities with verbatim evidence, and cross-check to create cohorts free of obvious pathology. From over 350,000 CT exams, this produces reference distributions for volumes and attenuation values in 106 anatomical structures, adjusted for age, sex, and scan parameters. A sympathetic reader would care because quantitative CT biomarkers need these normal ranges to interpret individual scans meaningfully. The resulting charts allow centile scoring in routine practice and support research on imaging phenotypes.

Core claim

By grounding LLM decisions in report text and resolving via cross-verification across five models, the method constructs pathology-reduced cohorts from routine clinical CT data. These cohorts then yield whole-body reference charts for 106 structures, showing how volumes and attenuations vary with age, sex, contrast use, and acquisition settings. Longitudinal follow-ups in the data reveal distinct structure-specific changes over time.

What carries the argument

Evidence-grounded cross-verified LLM ensemble for structure-level abnormality flagging and resolution from verbatim radiology report text.
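
As an illustration of the evidence-grounding idea, the sketch below keeps a Stage 1 proposal only when its quoted sentence occurs verbatim in the report, then separates unanimous flags from disputed ones. The names `grounded`, `filter_report`, and `verify` are hypothetical, and the resolution rule (unanimity accepts, everything else goes to a verifier) is an assumption, since the abstract does not state the actual rule. The `verify` callable stands in for whatever model-based adjudication the authors use.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# A Stage 1 proposal: (structure name, sentence quoted from the report).
Proposal = Tuple[str, str]


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so quotes match despite line wraps."""
    return " ".join(text.split()).lower()


def grounded(report: str, proposals: List[Proposal]) -> List[Proposal]:
    """Keep only proposals whose evidence occurs verbatim in the report."""
    haystack = normalize(report)
    return [(s, e) for s, e in proposals if e.strip() and normalize(e) in haystack]


def filter_report(
    report: str,
    stage1: List[List[Proposal]],
    verify: Callable[[str, str, List[str]], bool],
) -> Dict[str, bool]:
    """Return {structure: abnormal?} for every structure with grounded evidence.

    Assumed rule (not from the paper): a structure flagged by every model is
    accepted directly; any other flagged structure is passed to `verify`.
    """
    votes: Dict[str, int] = defaultdict(int)
    evidence: Dict[str, List[str]] = defaultdict(list)
    for proposals in stage1:
        seen = set()
        for structure, sentence in grounded(report, proposals):
            if structure not in seen:  # one vote per model per structure
                votes[structure] += 1
                seen.add(structure)
            evidence[structure].append(sentence)

    n_models = len(stage1)
    decisions: Dict[str, bool] = {}
    for structure, count in votes.items():
        if count == n_models:
            decisions[structure] = True
        else:
            decisions[structure] = verify(report, structure, evidence[structure])
    return decisions
```

Grounding by substring match means a model cannot flag a structure on the strength of a sentence the radiologist never wrote: a fabricated quote drops out before any vote is counted.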

Load-bearing premise

The LLM ensemble accurately identifies and excludes all pathological cases from the reports while preserving a representative sample of healthy individuals.

What would settle it

A manual review of a sample of the filtered reports showing high rates of missed pathology, or the resulting reference distributions deviating substantially from known values in small healthy volunteer studies.

Figures

Figures reproduced from arXiv: 2605.05933 by Bernhard Renger, Christian Wachinger, Christopher Späth, Jan Kirschke, Marcus Makowski.

Figure 1: Study overview. Routine clinical CT examinations are automatically segmented to derive structure-wise volumes and mean attenuation (HU) alongside acquisition covariates. Radiology reports are parsed with a cross-verified, evidence-grounded LLM ensemble: five models propose abnormal structures with verbatim supporting sentences (Stage 1), and disputed candidates are adjudicated by a verification step usi…
Figure 2: Summary of two-stage LLM-based report filtering and validation against…
Figure 3: (top) Reference body-volume trajectories estimated with GAMLSS, accounting…
Figure 4: (top) Reference attenuation trajectories estimated with GAMLSS and strat…
Figure 5: Report-based pathology filtering alters attenuation reference distributions and improves downstream centile discrimination. (A) For three representative structures (male, non-contrast), filtered reference curves differ from non-filtered curves, with the largest differences in the distribution tails (p5/p95), which determine extreme centile scores. (B) In CT-Rate, applying filtered versus non-filtered mod…
Figure 6: Individualized centile score analysis for (A) heart volume in cardiomegaly…
Figure 7: Longitudinal volume GAMLSS fits by organ. Curves show baseline organ…
Figure 8: Longitudinal attenuation GAMM fits for selected organs. Separate panels show…

Original abstract

Interpreting quantitative CT biomarkers, such as organ volume and tissue attenuation, requires large-scale healthy reference distributions. However, creating these is challenging because clinical datasets are often heavily enriched with pathology. Here, we develop an evidence-grounded, cross-verified large language model (LLM) ensemble to filter pathological findings from radiology reports, enabling the construction of pathology-reduced cohorts from over 350,000 CT examinations. Five LLMs, first, flag structure-level abnormality candidates grounded in verbatim report evidence and, second, resolve disagreements via cross-verification. Using distribution-aware generalized additive models for location, scale, and shape, we establish comprehensive whole-body reference charts for 106 anatomical structures (volumes and attenuation) across adulthood, accounting for age, sex, contrast enhancement, and acquisition parameters. Longitudinal analyses reveal structure- and contrast-dependent changes distinct from cross-sectional trends. These resources facilitate covariate-adjusted centile scoring from routine CT, supporting standardized quantitative phenotyping, multi-site imaging studies, and scalable opportunistic screening research.
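
To make covariate-adjusted centile scoring concrete, here is a minimal sketch that converts one measurement into a z-score and a centile under Cole's LMS (Box-Cox normal) parameterisation. This is an assumption for illustration: the abstract says only that GAMLSS is used and does not name the distribution family. The values of `L`, `M`, and `S` are taken as already predicted by a fitted model for the patient's age, sex, contrast phase, and acquisition settings. The formula needs positive measurements, so it suits volumes; attenuation in HU can be negative and would need a different family.

```python
import math


def lms_z_score(y: float, L: float, M: float, S: float) -> float:
    """Z-score of measurement y under the LMS (Box-Cox normal) model.

    L is the Box-Cox power, M the median, S the coefficient of variation,
    all evaluated at the patient's covariates (assumed given here).
    """
    if y <= 0 or M <= 0 or S <= 0:
        raise ValueError("y, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case L -> 0: log-normal model.
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


def centile(z: float) -> float:
    """Convert a z-score to a centile (0-100) via the standard normal CDF."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical liver volume of 1800 mL against made-up reference parameters.
    z = lms_z_score(1800.0, L=1.0, M=1500.0, S=0.2)
    print(f"z = {z:.2f}, centile = {centile(z):.1f}")
```

A measurement equal to the predicted median `M` always scores at the 50th centile, whatever the values of `L` and `S`.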

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper describes the use of an evidence-grounded ensemble of five LLMs to filter pathological findings from radiology reports of over 350,000 CT examinations. The resulting pathology-reduced cohorts are analyzed with generalized additive models for location, scale, and shape (GAMLSS) to produce whole-body reference charts for volume and attenuation of 106 anatomical structures across adulthood, incorporating covariates for age, sex, contrast enhancement, and acquisition parameters, along with longitudinal trend analyses.

Significance. If the LLM filtering step reliably produces representative healthy cohorts, the resulting large-scale, covariate-adjusted reference distributions would represent a substantial advance for quantitative CT imaging. Such charts could enable standardized centile scoring from routine scans, support multi-site studies, and facilitate opportunistic screening research without requiring dedicated healthy-subject acquisitions.

major comments (2)
  1. [Methods] Methods (LLM ensemble description): The manuscript reports no quantitative validation of the five-LLM filtering pipeline, such as precision/recall/F1 scores or inter-rater agreement against expert-annotated ground truth on a held-out test set of reports. Because the central claim—that the filtered data yields unbiased reference charts for healthy adults—rests entirely on the accuracy of this step, the absence of such metrics is load-bearing and must be addressed.
  2. [Results] Results (GAMLSS reference charts): The presented centile curves and longitudinal trends lack reported uncertainty estimates, confidence bands, or effective sample sizes per age/sex/contrast stratum. Without these, it is impossible to assess the reliability of the derived reference values, especially for structures with lower prevalence or in sparsely sampled age ranges.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'cross-verified' is used without specifying the exact disagreement-resolution rule (e.g., majority vote, weighted consensus), which should be stated explicitly even at the abstract level.
  2. [Figures] Figures: Legends should explicitly state the number of examinations contributing to each plotted distribution or curve to allow readers to gauge statistical power.
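
The validation requested in the first major comment can be sketched as follows: per-report labels from the pipeline are compared with radiologist labels, giving precision, recall, F1, and Cohen's kappa. The function names and the boolean encoding (`True` means "abnormal") are assumptions for illustration and are not taken from the paper.

```python
from typing import Dict, Sequence, Tuple


def confusion(pred: Sequence[bool], truth: Sequence[bool]) -> Dict[str, int]:
    """Count tp/fp/fn/tn with True meaning 'abnormal'."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, t in zip(pred, truth):
        # First letter: was the prediction correct; second: what was predicted.
        key = ("t" if p == t else "f") + ("p" if p else "n")
        counts[key] += 1
    return counts


def precision_recall_f1(c: Dict[str, int]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the 'abnormal' class."""
    tp, fp, fn = c["tp"], c["fp"], c["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cohen_kappa(c: Dict[str, int]) -> float:
    """Chance-corrected agreement between pipeline and annotator."""
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("no observations")
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

For cohort construction, recall on the abnormal class is the figure to watch: a missed finding leaves a pathological scan in the reference set, whereas a false alarm only costs sample size.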

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive review. The two major comments identify important gaps in the reporting of validation and uncertainty quantification. We address each point below and will revise the manuscript accordingly to strengthen the presentation of the LLM filtering pipeline and the reference charts.

Point-by-point responses
  1. Referee: [Methods] Methods (LLM ensemble description): The manuscript reports no quantitative validation of the five-LLM filtering pipeline, such as precision/recall/F1 scores or inter-rater agreement against expert-annotated ground truth on a held-out test set of reports. Because the central claim—that the filtered data yields unbiased reference charts for healthy adults—rests entirely on the accuracy of this step, the absence of such metrics is load-bearing and must be addressed.

    Authors: We agree that quantitative performance metrics are necessary to support the reliability of the pathology-reduced cohorts. The current manuscript describes the evidence-grounded candidate flagging and cross-verification procedure but does not include held-out test-set evaluation against expert annotations. In the revised version we will add a dedicated validation subsection that reports precision, recall, and F1 scores, together with inter-rater agreement statistics, obtained from a radiologist-annotated set of 500 reports. This addition will directly address the load-bearing nature of the filtering step. revision: yes

  2. Referee: [Results] Results (GAMLSS reference charts): The presented centile curves and longitudinal trends lack reported uncertainty estimates, confidence bands, or effective sample sizes per age/sex/contrast stratum. Without these, it is impossible to assess the reliability of the derived reference values, especially for structures with lower prevalence or in sparsely sampled age ranges.

    Authors: We concur that uncertainty quantification and stratum-specific sample sizes are required for proper interpretation of the reference charts. Although the GAMLSS framework supports estimation of these quantities, they were omitted from the submitted figures and tables. In the revision we will overlay 95% confidence bands on all centile curves, report effective sample sizes (and effective degrees of freedom) per age/sex/contrast bin in supplementary tables, and add a brief discussion of precision in sparsely populated strata. revision: yes
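
One way to picture the promised uncertainty reporting is a nonparametric bootstrap interval for an empirical centile within a single stratum, alongside raw per-stratum counts. This is a swapped-in simplification, not the GAMLSS-based bands the rebuttal describes, and the counts are plain tallies rather than effective sample sizes. All function names and the `(age, sex, contrast)` row layout are hypothetical.

```python
import random
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def quantile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation quantile of values for 0 <= q <= 1."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def bootstrap_ci(
    values: Sequence[float],
    q: float,
    n_boot: int = 1000,
    level: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap interval for the q-th quantile of one stratum."""
    rng = random.Random(seed)
    n = len(values)
    stats: List[float] = []
    for _ in range(n_boot):
        # Resample the stratum with replacement and recompute the quantile.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        stats.append(quantile(resample, q))
    alpha = (1.0 - level) / 2.0
    return quantile(stats, alpha), quantile(stats, 1.0 - alpha)


def stratum_counts(
    rows: Iterable[Tuple[int, str, bool]], bin_width: int = 10
) -> Counter:
    """Count exams per (age-bin start, sex, contrast) stratum."""
    return Counter(
        ((age // bin_width) * bin_width, sex, contrast)
        for age, sex, contrast in rows
    )
```

Intervals for tail centiles such as p5 and p95 widen quickly as a stratum shrinks, which is why the per-stratum counts belong next to the curves.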

Circularity Check

0 steps flagged

No circularity: external LLM filtering + GAMLSS modeling on resulting cohort

full rationale

The paper's chain proceeds from external LLMs (five models, verbatim grounding, cross-verification) applied to radiology reports to produce a filtered cohort, followed by independent GAMLSS fitting to derive age/sex/contrast-adjusted reference charts for 106 structures. No equations, parameters, or predictions reduce by construction to the inputs; the filtering step is presented as a methodological contribution using off-the-shelf models rather than a self-referential definition or fitted input renamed as prediction. No load-bearing self-citations, uniqueness theorems, or ansatzes imported from prior author work appear in the provided text. The derivation remains self-contained against external benchmarks and does not exhibit any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the effectiveness of LLM-based filtering and the assumption that filtered clinical data can serve as healthy reference distributions; no new physical entities are postulated.

free parameters (1)
  • GAM smoothing parameters
    Generalized additive models for location, scale, and shape require smoothing parameters that are typically estimated from data.
axioms (2)
  • domain assumption: The LLM ensemble can reliably detect pathological findings when grounded in verbatim report text and cross-verified
    This assumption is required for the filtering step to produce pathology-reduced cohorts.
  • domain assumption: The remaining cohort after filtering represents normal physiological distributions
    Required for the reference charts to be valid across adulthood.

pith-pipeline@v0.9.0 · 5487 in / 1386 out tokens · 52429 ms · 2026-05-09T15:44:21.891158+00:00 · methodology

