pith. sign in

arxiv: 2602.04983 · v2 · submitted 2026-02-04 · 📡 eess.IV · cs.AI· cs.LG

AI-Based Detection of Temporal Changes in MR-Linac Images Acquired During Routine Prostate Radiotherapy

Pith reviewed 2026-05-16 06:39 UTC · model grok-4.3

classification 📡 eess.IV cs.AIcs.LG
keywords MR-Linac imagingprostate radiotherapytemporal change detectiondeep learningpairwise comparisonsaliency mapsinter-fraction changes
0
0 comments X

The pith

Deep learning model detects subtle inter-fraction changes in routine MR-Linac prostate images by learning temporal order.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that a deep learning model can learn to detect subtle anatomical changes across treatment fractions by training it to order MR-Linac prostate images temporally. The approach uses pairwise comparisons, with the strongest results coming from first-to-last fraction pairs that reach 95 percent accuracy and 0.99 AUC while surpassing human radiologist performance. Saliency maps reveal that the prostate, bladder, and pubic symphysis drive most of the model's decisions. Accuracy improves with longer time gaps between fractions and drops when non-radiation-exposed images are included, indicating the model picks up both natural variation and radiation-related effects. These findings point to the possibility that standard MR-Linac scans already contain enough signal for automated tracking of daily changes during prostate radiotherapy.

Core claim

The authors establish that an AI model based on temporal ordering of pairwise MR-Linac images can reliably identify inter-fraction changes in prostate radiotherapy patients. Trained on data from 761 patients, the first-to-last model achieves an AUC of 0.99 and accuracy of 0.95, outperforming a radiologist. Regions such as the prostate, bladder, and pubic symphysis are highlighted by saliency analysis as key contributors. The model's performance scales with the number of fractions between images and weakens for pre-treatment time points, supporting the view that MR-Linac imaging captures detectable temporal information suitable for broader clinical use.

What carries the argument

deep learning model for temporal ordering via pairwise comparison of first-to-last fraction image pairs

Load-bearing premise

Superior performance on the temporal ordering task means the model is detecting clinically relevant anatomical changes caused by treatment rather than technical artifacts like scanner drift or positioning differences.

What would settle it

A test showing that the model performs no better than random on pairs of images taken at the same fraction or on simulation scans where no treatment has occurred would falsify the claim that it detects inter-fraction biological changes.

Figures

Figures reproduced from arXiv: 2602.04983 by Daniel Margolis, Emily S. Weg, Heejong Kim, Himanshu Nagar, Mert R. Sabuncu, Peilin Wang, Ryan Pennell, Seungbin Park, Timothy McClure.

Figure 1
Figure 1. Figure 1: Overview of the MR-Linac workflow and AI framework. 19 [PITH_FULL_IMAGE:figures/full_fig_p019_1.png] view at source ↗
Figure 1
Figure 1. Figure 1: (Continued) (A) Process of MR-Linac sessions across multiple fractions. [PITH_FULL_IMAGE:figures/full_fig_p020_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Pairwise performance (A) Model logits of the All-pairs model for different image pairs. Model logits reflect the model’s confidence in its predictions, which can be interpreted as the magnitude of changes detected between paired images. Pairs of numbers on the x-axis indicate the fractions at which the paired images were acquired. Cases were grouped by the first fraction and ordered and color-coded by the … view at source ↗
Figure 3
Figure 3. Figure 3: Saliency map on an atlas. The atlas was built from F1 of all patients in the test data. The visualized heatmap is the mean of saliency maps for all patients in the test data, obtained from All-pairs model inference on F1-FL pairs and transformed into the atlas space. Column titles indicate the slice orientations (axial, sagittal) and the primary regions of interest highlighted by the saliency maps, althoug… view at source ↗
read the original abstract

Purpose: To investigate whether an AI-based method can detect subtle inter-fraction changes in MR-Linac images acquired during radiotherapy and explore the broader potential of MRLinac imaging. Methods: This retrospective study included longitudinal 0.35T MR-Linac images from 761 patients. To identify temporal changes, we employed a deep learning model using temporal ordering via pairwise comparison, previously shown effective for longitudinal imaging studies. The model was trained using first-to-last fraction pairs (F1-FL) and all pairs (All-pairs). Performance was assessed using quantitative metrics (accuracy and AUC) and compared against a radiologist's performance. Qualitative evaluation was performed using saliency maps, which identify anatomical regions associated with temporal imaging changes. Results: The F1-FL model demonstrated high performance (AUC=0.99, accuracy=0.95) and outperformed the radiologist in temporal ordering task. The All-pairs model also showed high performance (AUC=0.97, accuracy=0.91). Regions contributing to predictions included the prostate, bladder, and pubic symphysis. The performance was correlated to fractional intervals and was reduced for non-radiation-exposed timepoints (Sim and F1), suggesting that observed changes may reflect both temporal variation and radiation exposure. Conclusion: MR-Linac imaging appears capable of capturing subtle changes during prostate radiotherapy that can be detected by AI models, even over approximately two-day intervals. The model's high performance, together with quantitative and qualitative analyses, supports a potential role for MR-Linac in clinical applications beyond image guidance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that a deep learning model trained on pairwise temporal ordering of MR-Linac images from 761 prostate radiotherapy patients can detect subtle inter-fraction anatomical changes. The F1-FL model achieves AUC 0.99 and accuracy 0.95, outperforming a radiologist; the All-pairs model reaches AUC 0.97. Saliency maps highlight prostate, bladder, and pubic symphysis; performance correlates with fractional interval and drops for pre-radiation (Sim/F1) pairs, supporting the conclusion that MR-Linac imaging captures radiotherapy-related changes detectable by AI.

Significance. If the mapping from ordering performance to clinically relevant biological change holds, the work would support expanded use of MR-Linac for longitudinal monitoring and adaptive planning. The large cohort size and quantitative outperformance of a human reader are positive features; however, the central interpretation remains provisional without stronger controls for acquisition confounders.

major comments (3)
  1. [Results] Results section: The central claim that AUC=0.99 on the F1-FL temporal-ordering task demonstrates detection of radiotherapy-induced anatomical evolution is not yet load-bearing. Fraction index is confounded by scanner drift, fixed positioning workflows, and non-radiation time-varying factors; the reported performance drop on Sim/F1 pairs does not isolate radiation exposure from systematic acquisition differences.
  2. [Methods] Methods: No information is given on cross-validation strategy, patient-level data splits, or explicit handling of image-quality variation across fractions. These omissions are critical for assessing whether the high AUC reflects generalizable detection of change rather than dataset-specific correlations.
  3. [Results] Results/Discussion: Saliency maps localize to prostate and bladder, yet this does not establish that the learned features correspond to measurable clinical quantities (volume change, deformation) rather than global intensity shifts or residual alignment statistics.
minor comments (2)
  1. Abstract: Include brief description of model architecture, loss function, and training hyperparameters to improve reproducibility.
  2. [Results] Results: Provide quantitative details on the radiologist comparison (task instructions, number of readers, inter-reader agreement) to allow direct assessment of the reported outperformance.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed review. We have addressed each major comment point by point below, revising the manuscript where needed to strengthen the interpretation and methodological transparency while remaining faithful to the study design and results.

read point-by-point responses
  1. Referee: [Results] Results section: The central claim that AUC=0.99 on the F1-FL temporal-ordering task demonstrates detection of radiotherapy-induced anatomical evolution is not yet load-bearing. Fraction index is confounded by scanner drift, fixed positioning workflows, and non-radiation time-varying factors; the reported performance drop on Sim/F1 pairs does not isolate radiation exposure from systematic acquisition differences.

    Authors: We appreciate the referee's caution regarding potential confounders. The manuscript already reports both the performance drop on Sim/F1 pairs and the correlation between model performance and fractional interval, which together indicate sensitivity to changes accumulating over the course of radiotherapy rather than purely time-based or acquisition-based effects. We have revised the Results and Discussion sections to adopt more measured language, stating that the high AUC reflects detection of temporal anatomical changes with supporting evidence for a radiotherapy-related component, while explicitly acknowledging scanner drift, positioning workflows, and other non-radiation factors as possible contributors. This revision avoids overclaiming isolation of radiation effects. revision: partial

  2. Referee: [Methods] Methods: No information is given on cross-validation strategy, patient-level data splits, or explicit handling of image-quality variation across fractions. These omissions are critical for assessing whether the high AUC reflects generalizable detection of change rather than dataset-specific correlations.

    Authors: We thank the referee for identifying these important omissions. The revised Methods section now specifies that a 5-fold cross-validation was performed with strict patient-level partitioning to prevent any leakage of images from the same patient across folds. Image-quality variation was mitigated through per-fraction intensity normalization to a common reference and exclusion of fractions failing a minimum signal-to-noise threshold. These additions demonstrate that the reported performance is based on generalizable, patient-independent evaluation. revision: yes

  3. Referee: [Results] Results/Discussion: Saliency maps localize to prostate and bladder, yet this does not establish that the learned features correspond to measurable clinical quantities (volume change, deformation) rather than global intensity shifts or residual alignment statistics.

    Authors: We agree that saliency maps provide localization evidence but do not constitute quantitative proof that the model has learned specific clinical metrics such as organ volume change or deformation. The revised manuscript clarifies that the saliency analysis is intended as qualitative support showing the model's attention to anatomically plausible regions (prostate, bladder, pubic symphysis) rather than uniform intensity or alignment artifacts. We have added an explicit limitation statement and a suggestion for future work correlating model activations with deformation vector fields and volume measurements derived from the same images. revision: partial

Circularity Check

0 steps flagged

No significant circularity; temporal-ordering performance derived from held-out chronological labels without reduction to fitted inputs

full rationale

The paper trains a pairwise temporal-ordering model on known fraction indices (F1-FL and All-pairs) and reports AUC/accuracy on held-out pairs. This is standard supervised evaluation; the performance metric is not algebraically equivalent to any fitted parameter by construction. The cited prior method for temporal ordering is external to the present derivation and does not carry the central claim. No self-definitional equations, ansatz smuggling, or renaming of known results appear in the reported pipeline. The skeptic concern about confounders (scanner drift, positioning) is a validity issue, not a circularity issue.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that temporal ordering accuracy measures genuine anatomical change rather than imaging artifacts; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Pairwise temporal ordering performance on MR-Linac images reflects real inter-fraction anatomical variation rather than scanner or positioning artifacts
    Invoked when interpreting high AUC as evidence that MR-Linac captures subtle changes during radiotherapy.

pith-pipeline@v0.9.0 · 5618 in / 1143 out tokens · 21996 ms · 2026-05-16T06:39:26.077493+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  1. [1]

    EAU-ESTRO-SIOG Guidelines on Prostate Cancer

    Mottet N, Bellmunt J, Bolla M, Briers E, Cumberbatch MG, De Santis M, et al. EAU-ESTRO-SIOG Guidelines on Prostate Cancer. Part 1: Screening, Diagnosis, and Local Treatment with Curative Intent. European Urology. 2017 Apr;71(4):618-29. Available from:https://linkinghub.elsevier. com/retrieve/pii/S0302283816304705

  2. [2]

    Dosimet- ric effects of adaptive prostate cancer radiotherapy in an MR-linac workflow

    Mannerberg A, Persson E, Jonsson J, Gustafsson CJ, Gunnlaugsson A, Olsson LE, et al. Dosimet- ric effects of adaptive prostate cancer radiotherapy in an MR-linac workflow. Radiation Oncology. 2020 Dec;15(1):168. Available from:https://ro-journal.biomedcentral.com/articles/10.1186/ s13014-020-01604-5

  3. [3]

    Towards Accurate and Precise Image-Guided Radio- therapy: Clinical Applications of the MR-Linac

    Randall JW, Rammohan N, Das IJ, Yadav P. Towards Accurate and Precise Image-Guided Radio- therapy: Clinical Applications of the MR-Linac. Journal of Clinical Medicine. 2022 Jul;11(14):4044. Available from:https://www.mdpi.com/2077-0383/11/14/4044

  4. [4]

    MRI-LINAC: A transformative technology in radiation oncology

    Ng J, Gregucci F, Pennell RT, Nagar H, Golden EB, Knisely JPS, et al. MRI-LINAC: A transformative technology in radiation oncology. Frontiers in Oncology. 2023 Jan;13:1117874. Available from:https: //www.frontiersin.org/articles/10.3389/fonc.2023.1117874/full

  5. [5]

    MRI-Guided Radiotherapy for Prostate Cancer: a New Paradigm

    Murgi´ c J. MRI-Guided Radiotherapy for Prostate Cancer: a New Paradigm. Acta Clinica Croatica

  6. [6]

    Available from:https://hrcak.srce.hr/clanak/414413

  7. [7]

    Case Report: MR-Guided Adaptive Ra- diotherapy, Some Room to Maneuver

    Li W, Winter J, Padayachee J, Dang J, Kong V, Chung P. Case Report: MR-Guided Adaptive Ra- diotherapy, Some Room to Maneuver. Frontiers in Oncology. 2022 Apr;12:877452. Available from: https://www.frontiersin.org/articles/10.3389/fonc.2022.877452/full. 12

  8. [8]

    Adaptive Radiotherapy for Anatomical Changes

    Sonke JJ, Aznar M, Rasch C. Adaptive Radiotherapy for Anatomical Changes. Seminars in Radia- tion Oncology. 2019 Jul;29(3):245-57. Available from:https://linkinghub.elsevier.com/retrieve/ pii/S1053429619300165

  9. [9]

    Adaptive Radiotherapy: Next-Generation Radiotherapy

    Dona Lemus OM, Cao M, Cai B, Cummings M, Zheng D. Adaptive Radiotherapy: Next-Generation Radiotherapy. Cancers. 2024 Mar;16(6):1206. Available from:https://www.mdpi.com/2072-6694/ 16/6/1206

  10. [10]

    Prostate Volume Changes during Extreme and Moderately Hypofractionated Magnetic Resonance Image-guided Radio- therapy

    Alexander SE, McNair HA, Oelfke U, Huddart R, Murray J, Pathmanathan A, et al. Prostate Volume Changes during Extreme and Moderately Hypofractionated Magnetic Resonance Image-guided Radio- therapy. Clinical Oncology. 2022 Sep;34(9):e383-91. Available from:https://linkinghub.elsevier. com/retrieve/pii/S0936655522001777

  11. [11]

    Algohary A, Alhusseini M, Breto AL, Kwon D, Xu IR, Gaston SM, et al. Longitudinal Changes and Predictive Value of Multiparametric MRI Features for Prostate Cancer Patients Treated with MRI- Guided Lattice Extreme Ablative Dose (LEAD) Boost Radiotherapy. Cancers. 2022 Sep;14(18):4475. Available from:https://www.mdpi.com/2072-6694/14/18/4475

  12. [12]

    Longitudinal analysis of T2 relaxation time variations following radiotherapy for prostate cancer

    Hanzlikova P, Vilimek D, Vilimkova Kahankova R, Ladrova M, Skopelidou V, Ruzickova Z, et al. Longitudinal analysis of T2 relaxation time variations following radiotherapy for prostate cancer. He- liyon. 2024 Jan;10(2):e24557. Available from:https://linkinghub.elsevier.com/retrieve/pii/ S2405844024005887

  13. [13]

    Longitudinal quantitative MRI in prostate cancer after radiation therapy with and without androgen deprivation therapy

    Wang YF, Tadimalla S, Thiruthaneeswaran N, Holloway L, Turner S, Hayden A, et al. Longitudinal quantitative MRI in prostate cancer after radiation therapy with and without androgen deprivation therapy. Magnetic Resonance Imaging. 2025 Oct;122:110431

  14. [14]

    Almansour H, Schick F, Nachbar M, Afat S, Fritz V, Thorwarth D, et al. Longitudinal monitoring of Apparent Diffusion Coefficient (ADC) in patients with prostate cancer undergoing MR-guided ra- diotherapy on an MR-Linac at 1.5 T: a prospective feasibility study. Radiology and Oncology. 2023 Jun;57(2):184-90. Available from:https://www.sciendo.com/article/1...

  15. [15]

    Repeatability and re- producibility of prostate apparent diffusion coefficient values on a 1.5 T magnetic resonance lin- ear accelerator

    Fernando N, Tadic T, Li W, Patel T, Padayachee J, Santiago AT, et al. Repeatability and re- producibility of prostate apparent diffusion coefficient values on a 1.5 T magnetic resonance lin- ear accelerator. Physics and Imaging in Radiation Oncology. 2024 Apr;30:100570. Available from: https://linkinghub.elsevier.com/retrieve/pii/S240563162400040X

  16. [16]

    Learning-based inference of longitudinal image changes: Applications in embryo de- 13 velopment, wound healing, and aging brain

    Kim H, Karaman BK, Zhao Q, Wang AQ, Sabuncu MR, for the Alzheimer’s Disease Neuroimag- ing Initiative. Learning-based inference of longitudinal image changes: Applications in embryo de- 13 velopment, wound healing, and aging brain. Proceedings of the National Academy of Sciences. 2025;122(8):e2411492122. Available from:https://pnas.org/doi/10.1073/pnas.2411492122

  17. [17]

    Signature verification using a “Siamese” time delay neural network

    Bromley J, Guyon I, LeCun Y, S¨ ackinger E, Shah R. Signature verification using a “Siamese” time delay neural network. In: Advances in Neural Information Processing Systems. vol. 6; 1994. p. 737-44

  18. [18]

    Learning a Similarity Metric Discriminatively, with Application to Face Verification

    Chopra S, Hadsell R, LeCun Y. Learning a Similarity Metric Discriminatively, with Application to Face Verification. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05). vol. 1. IEEE; 2005. p. 539-46. Available from:http://ieeexplore.ieee.org/document/ 1467314/

  19. [19]

    Learning to Compare Longitudinal Images

    Kim H, Sabuncu MR. Learning to Compare Longitudinal Images. arXiv preprint arXiv:230402531. 2023

  20. [20]

    Deep Residual Learning for Image Recognition

    He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016. p. 770-8

  21. [21]

    ISBN 9781605585161

    Bengio Y, Louradour J, Collobert R, Weston J. Curriculum learning. In: Proceedings of the 26th Annual International Conference on Machine Learning. Montreal Quebec Canada: ACM; 2009. p. 41-8. Available from:https://dl.acm.org/doi/10.1145/1553374.1553380

  22. [22]

    Grad-CAM: Vi- sual Explanations From Deep Networks via Gradient-Based Localization

    Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Vi- sual Explanations From Deep Networks via Gradient-Based Localization. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV); 2017. p. 618-26. Available from:https://openaccess.thecvf.com/content_iccv_2017/html/Selvaraju_Grad-CAM_Visual_ Explanations_ICCV...

  23. [23]

    Clinical application of MR-Linac in tumor radiotherapy: a systematic review

    Liu X, Li Z, Yin Y. Clinical application of MR-Linac in tumor radiotherapy: a systematic review. Radi- ation Oncology. 2023;18(1):52. Available from:https://ro-journal.biomedcentral.com/articles/ 10.1186/s13014-023-02221-8

  24. [24]

    Urinary toxicity in patients treated with radi- cal EBRT for prostate cancer: Analysis of predictive factors in an historical series

    Pisani C, Galla A, Loi G, Beld` ı D, Krengli M. Urinary toxicity in patients treated with radi- cal EBRT for prostate cancer: Analysis of predictive factors in an historical series. Bulletin du Cancer. 2022;109(7):826-33. Available from:https://linkinghub.elsevier.com/retrieve/pii/ S0007455122001503

  25. [25]

    Willigenburg T, Van Der Velden JM, Zachiu C, Teunissen FR, Lagendijk JJW, Raaymakers BW, et al. Accumulated bladder wall dose is correlated with patient-reported acute urinary toxicity in prostate cancer patients treated with stereotactic, daily adaptive MR-guided radiotherapy. Radiotherapy 14 and Oncology. 2022;171:182-8. Available from:https://linkinghu...

  26. [26]

    Magnetic resonance imaging fea- tures of pubic symphysis urinary fistula with pubic bone osteomyelitis in the treated prostate cancer patient

    Sexton SJ, Lavien G, Said N, Eward W, Peterson AC, Gupta RT. Magnetic resonance imaging fea- tures of pubic symphysis urinary fistula with pubic bone osteomyelitis in the treated prostate cancer patient. Abdominal Radiology. 2019;44(4):1453-60. Available from:http://link.springer.com/10. 1007/s00261-018-1827-2

  27. [27]

    nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation

    Isensee F, Jaeger PF, Kohl SA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods. 2021;18(2):203-11

  28. [28]

    Adam: A Method for Stochastic Optimization

    Kingma DP, Ba J. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:14126980. 2014

  29. [29]

    The ANTsX ecosystem for quantitative biological and medical imaging

    Tustison NJ, Cook PA, Holbrook AJ, Johnson HJ, Muschelli J, Devenyi GA, et al. The ANTsX ecosystem for quantitative biological and medical imaging. Scientific Reports. 2021 Apr;11(1):9068. Available from:https://doi.org/10.1038/s41598-021-87564-6. 15 Characteristic Training Data Validation Data Test Data No. of patients 457 152 152 Age (y) 73 (10) 73 (11)...