pith. machine review for the scientific record.

arxiv: 2605.06160 · v1 · submitted 2026-05-07 · 💻 cs.CV

Recognition: unknown

Beyond Forgetting in Continual Medical Image Segmentation: A Comprehensive Benchmark Study

Pith reviewed 2026-05-08 13:43 UTC · model grok-4.3

classification 💻 cs.CV
keywords: continual learning, medical image segmentation, benchmark, catastrophic forgetting, plasticity, domain shift, incremental learning, replay-based methods

The pith

Benchmark shows replay-based methods best balance stability and plasticity in continual medical image segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that continual learning for medical image segmentation must address more than forgetting to work in real clinics where domains, classes, and organs change over time. It introduces three scenarios that mirror clinical shifts and a set of metrics covering performance, forgetting, plasticity, forward generalization, and efficiency. Experiments with representative methods indicate that replay approaches come closest to balancing retention of old knowledge with acquisition of new capabilities. A reader would care because medical segmentation models deployed in hospitals need to adapt without full retraining or loss of prior accuracy on previous patients and tasks.

Core claim

The authors claim that developing a model satisfying all requirements simultaneously remains challenging. Their studies suggest replay-based methods achieve the best overall balance between stability and plasticity, parameter-isolation methods reduce forgetting effectively but increase model size, and forward generalizability is a significantly understudied aspect of the field.

What carries the argument

The evaluation framework carries the argument. It combines three clinically motivated scenarios (Domain-CL for cross-center domain shift, Class-CL for incremental anatomical structures, and Organ-CL for cross-organ segmentation) with metrics for general performance, forgetting, plasticity, forward generalizability, parameter efficiency, and replay burden.
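
To make the metric families above concrete, the sketch below derives four of them from a score matrix. It assumes a square matrix `R` in which `R[i][j]` is the mean Dice on task `j` after training through task `i`. The function name `cl_metrics` and the formulas are illustrative assumptions that follow definitions common in the continual learning literature; the paper's own definitions (for example of the A-Dice and BWTR quantities named in its figure captions) are not reproduced on this page and may differ.

```python
from statistics import mean


def cl_metrics(R):
    """Summarize a T x T score matrix, where R[i][j] is the Dice on
    task j measured after training on tasks 0..i (assumed layout)."""
    T = len(R)
    final = R[T - 1]  # scores after the last task has been learned

    # General performance: average final score over all tasks.
    avg_final = mean(final)

    # Backward transfer: final score minus the score right after each
    # earlier task was learned. Negative values indicate forgetting.
    bwt = mean(final[j] - R[j][j] for j in range(T - 1)) if T > 1 else 0.0

    # Plasticity proxy: score on each task immediately after learning it.
    plasticity = mean(R[j][j] for j in range(T))

    # Forward generalization proxy: score on task j just before it is
    # trained on, i.e. after finishing task j - 1.
    forward = mean(R[j - 1][j] for j in range(1, T)) if T > 1 else 0.0

    return {
        "avg_final": avg_final,
        "bwt": bwt,
        "plasticity": plasticity,
        "forward": forward,
    }


# Made-up placeholder scores for three tasks, not results from the paper.
toy_R = [
    [0.90, 0.40, 0.30],
    [0.80, 0.85, 0.50],
    [0.70, 0.80, 0.88],
]
toy_summary = cl_metrics(toy_R)
```

Keeping all four numbers side by side is the point of the design: a method can score well on `avg_final` while hiding a strongly negative `bwt` or a low `forward` value.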

Load-bearing premise

The selected representative continual learning methods adequately represent their categories and the metrics fully capture essential properties for clinical deployment.

What would settle it

A method that simultaneously delivers high general performance, low forgetting, high plasticity, strong forward generalizability, high parameter efficiency, and low replay burden across all three scenarios would show that satisfying every requirement at once is not as difficult as concluded.
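
The falsification test described above amounts to asking whether one method dominates all others on every metric at once. The sketch below shows one hedged way to check that with a Pareto-dominance comparison. The function names, the metric keys, and the numbers are hypothetical placeholders chosen for illustration; none of them come from the paper.

```python
def dominates(a, b, higher_is_better):
    """Return True if scores `a` are no worse than `b` on every metric
    and strictly better on at least one."""
    strictly_better = False
    for metric, higher in higher_is_better.items():
        # Flip the sign of lower-is-better metrics so larger means better.
        x = a[metric] if higher else -a[metric]
        y = b[metric] if higher else -b[metric]
        if x < y:
            return False
        if x > y:
            strictly_better = True
    return strictly_better


def undominated(methods, higher_is_better):
    """Names of methods that no other method dominates (the Pareto front)."""
    return [
        name
        for name, scores in methods.items()
        if not any(
            dominates(other, scores, higher_is_better)
            for other_name, other in methods.items()
            if other_name != name
        )
    ]


# Hypothetical placeholder scores, not results from the paper.
directions = {"dice": True, "forgetting": False, "rel_params": False}
toy_methods = {
    "method_a": {"dice": 0.80, "forgetting": 0.10, "rel_params": 1.0},
    "method_b": {"dice": 0.75, "forgetting": 0.02, "rel_params": 1.5},
    "method_c": {"dice": 0.70, "forgetting": 0.12, "rel_params": 1.6},
}
front = undominated(toy_methods, directions)
```

A front containing exactly one method in every scenario would be the kind of evidence that contradicts the paper's conclusion. A front with several methods, as in this toy data, reflects a trade-off.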

Figures

Figures reproduced from arXiv: 2605.06160 by Bomin Wang, Hangqi Zhou, Xiahai Zhuang, Yibo Gao.

Figure 1: Overview of the proposed benchmark for continual medical image segmentation.
Figure 2: Task-order robustness performance of CL methods. Subfigures show (a) A-Dice and (b) BWTR results with
Figure 3: Additional analyses of replay-based methods and foundation-model behavior. (a) Effect of buffer size on A-Dice
Figure 4: CL and its connections with related machine learning paradigms.
Original abstract

Continual learning (CL) is essential for deploying medical image segmentation models in clinical environments where imaging domains, anatomical targets, and diagnostic tasks evolve over time. However, continual segmentation still faces three main challenges. First, the scenarios for this task remain insufficiently standardized for real-world clinical settings. Second, existing research has been primarily focused on mitigating forgetting, overlooking the other essential properties such as plasticity. Third, a benchmark work with comprehensive evaluation on existing methods is stll desirable. To address these gaps, we present such benchmark study of continual medical image segmentation. We first define three clinically motivated scenarios, namely Domain-CL, Class-CL, and Organ-CL, to respectively capture the cross-center domain shift, the incremental anatomical structure segmentation, and the cross-organ segmentation. We then introduce an evaluation framework that measures not only general performance and forgetting, but also plasticity, forward generalizability, parameter efficiency, and replay burden. The results, from extensive experiments with representative CL methods, showed that it was still challenging to develop a model that could satisfy all the requirements simultaneously. Nevertheless, these studies also suggested that the replay-based methods achieve the best overall balance between stability and plasticity, the parameter-isolation methods should be effective at reducing forgetting, though at the cost of increased model size, and the forward generalizability remain a significantly understudied aspect of this research field. Finally, we discuss related learning paradigms and outline future directions for continual medical image segmentation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript presents a benchmark study on continual learning (CL) for medical image segmentation. It defines three clinically motivated scenarios (Domain-CL, Class-CL, Organ-CL) to address standardization gaps. An evaluation framework is introduced that assesses not only forgetting but also plasticity, forward generalizability, parameter efficiency, and replay burden. Extensive experiments with representative CL methods lead to conclusions that balancing all requirements is challenging, replay-based methods offer the best stability-plasticity trade-off, parameter-isolation methods reduce forgetting at the expense of model size, and forward generalizability is understudied.

Significance. If the experimental results and category-level conclusions hold, this work provides a valuable comprehensive benchmark that shifts focus beyond forgetting to other clinically relevant properties in continual medical image segmentation. It highlights the difficulty in satisfying all desiderata simultaneously and identifies promising directions, such as the strengths of replay-based approaches. The introduction of new metrics and scenarios is a strength, offering a more holistic view than prior work focused narrowly on catastrophic forgetting.

major comments (2)
  1. [§4 (Experiments)] The claim that replay-based methods achieve the best overall balance relies on the selected representative methods for each category. The paper should provide more details on the specific algorithms chosen (e.g., which replay variants and how many), their hyperparameter tuning process, and why they are representative, as narrow selection could bias the category-level rankings and the 'still challenging' assessment.
  2. [§3 (Evaluation Framework)] The newly proposed metrics (plasticity, forward generalizability, replay burden) are central to the conclusions, but their definitions and validation against clinical deployment needs are not sufficiently justified. For example, it is unclear how these proxies correlate with real-world clinical requirements, which could affect the interpretation that forward generalizability remains understudied.
minor comments (3)
  1. [Abstract] Typo: 'stll' should be 'still'.
  2. [Abstract] The phrasing 'the parameter-isolation methods should be effective' is tentative; consider making it more definitive or explaining the basis if results support it.
  3. [Throughout] Ensure all method names and acronyms are defined at first use for clarity.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive overall assessment of our benchmark study. We address each major comment point by point below, providing the strongest honest responses possible based on the manuscript content. We will revise the paper to incorporate additional details and clarifications.

  1. Referee: [§4 (Experiments)] The claim that replay-based methods achieve the best overall balance relies on the selected representative methods for each category. The paper should provide more details on the specific algorithms chosen (e.g., which replay variants and how many), their hyperparameter tuning process, and why they are representative, as narrow selection could bias the category-level rankings and the 'still challenging' assessment.

    Authors: We agree that greater transparency on method selection is warranted to support the category-level conclusions. The manuscript already selects methods as standard representatives from each CL category based on their prevalence in the literature and suitability for segmentation (e.g., replay variants covering buffer-based and generative approaches). In the revision, we will expand §4 with: the complete list of chosen algorithms per category with citations, the hyperparameter tuning protocol (grid search over key parameters such as learning rate, buffer size, and regularization coefficients using validation splits from initial tasks), and explicit rationale for representativeness drawn from recent CL surveys. This addresses potential bias concerns without altering the experimental results or the assessment that balancing all requirements remains challenging. revision: yes

  2. Referee: [§3 (Evaluation Framework)] The newly proposed metrics (plasticity, forward generalizability, replay burden) are central to the conclusions, but their definitions and validation against clinical deployment needs are not sufficiently justified. For example, it is unclear how these proxies correlate with real-world clinical requirements, which could affect the interpretation that forward generalizability remains understudied.

    Authors: The metrics are defined in §3 to extend beyond forgetting and align with the three clinically motivated scenarios introduced in the paper. Plasticity captures adaptation to new domains/classes/organs, forward generalizability evaluates performance on future unseen data (critical for evolving clinical environments), and replay burden quantifies storage overhead relevant to deployment constraints. We will revise §3 to include more explicit linkages to clinical needs, supported by references to medical imaging deployment literature, and clarify limitations of these proxies. While a full empirical correlation study with real-world clinical outcomes exceeds the scope of this benchmark, the added discussion will better justify why forward generalizability is identified as understudied. revision: partial
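
The buffer-based replay and the storage overhead mentioned in the responses above can be illustrated with a small fixed-capacity buffer. The sketch below uses reservoir sampling, a standard technique that keeps a uniform random subset of everything seen so far. It is an assumption for illustration, not a description of the methods the paper benchmarks. The class name `ReservoirBuffer` and the `stored_fraction` helper are hypothetical, and the fraction is only a rough stand-in for the paper's replay burden metric, whose definition is not given on this page.

```python
import random


class ReservoirBuffer:
    """Fixed-size replay buffer filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of samples offered so far
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not yet full: always keep the sample.
            self.items.append(sample)
        else:
            # Draw a slot among all samples seen so far and overwrite a
            # stored item only if the draw lands inside the buffer, so
            # each seen sample is kept with probability capacity / seen.
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = sample

    def sample(self, n):
        """Draw up to n stored samples to mix into a training batch."""
        return self.rng.sample(self.items, min(n, len(self.items)))

    def stored_fraction(self):
        """Share of all seen samples currently held (hypothetical proxy)."""
        return len(self.items) / self.seen if self.seen else 0.0


# Stream 100 placeholder samples through a buffer of size 5.
replay = ReservoirBuffer(capacity=5)
for idx in range(100):
    replay.add(("image_id", idx))
replay_batch = replay.sample(3)
```

Buffer capacity is the knob that trades retention against storage: a larger buffer keeps more old data available for rehearsal and raises the stored fraction accordingly.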

Circularity Check

0 steps flagged

Empirical benchmark study with no circular derivations or self-referential predictions

The paper defines three clinically motivated scenarios (Domain-CL, Class-CL, Organ-CL), introduces an evaluation framework with metrics for general performance, forgetting, plasticity, forward generalizability, parameter efficiency, and replay burden, then reports results from experiments on representative CL methods. No mathematical derivations, equations, or predictions exist that reduce to fitted parameters or inputs by construction. Conclusions follow directly from the experimental comparisons against external method categories, rendering the work self-contained without load-bearing self-citations, ansatz smuggling, or renaming of known results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on empirical comparisons rather than new theoretical derivations; the main assumptions are about the representativeness of the scenarios and methods tested.

axioms (1)
  • domain assumption: The three defined scenarios (Domain-CL, Class-CL, Organ-CL) adequately represent real-world clinical continual learning challenges.
    Invoked when presenting the scenarios as clinically motivated to capture cross-center shift, incremental structures, and cross-organ segmentation.

pith-pipeline@v0.9.0 · 5565 in / 1464 out tokens · 68766 ms · 2026-05-08T13:43:58.793478+00:00 · methodology
