arxiv: 2605.06160 · v1 · submitted 2026-05-07 · 💻 cs.CV

Recognition: unknown

Beyond Forgetting in Continual Medical Image Segmentation: A Comprehensive Benchmark Study

Bomin Wang , Hangqi Zhou , Yibo Gao , Xiahai Zhuang


Pith reviewed 2026-05-08 13:43 UTC · model grok-4.3


keywords continual learning · medical image segmentation · benchmark · catastrophic forgetting · plasticity · domain shift · incremental learning · replay-based methods


The pith

Benchmark shows replay-based methods best balance stability and plasticity in continual medical image segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that continual learning for medical image segmentation must address more than forgetting to work in real clinics where domains, classes, and organs change over time. It introduces three scenarios that mirror clinical shifts and a set of metrics covering performance, forgetting, plasticity, forward generalization, and efficiency. Experiments with representative methods indicate that replay approaches come closest to balancing retention of old knowledge with acquisition of new capabilities. A reader would care because medical segmentation models deployed in hospitals need to adapt without full retraining or loss of prior accuracy on previous patients and tasks.

Core claim

The authors claim that developing a model satisfying all requirements simultaneously remains challenging. Their studies suggest replay-based methods achieve the best overall balance between stability and plasticity, parameter-isolation methods reduce forgetting effectively but increase model size, and forward generalizability is a significantly understudied aspect of the field.
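To make the replay idea concrete, the following is a minimal sketch of a fixed-size replay buffer filled by reservoir sampling, one common way to keep a bounded sample of past data. It is an illustration only, not the benchmark's implementation: the class name `ReservoirBuffer`, the capacity of 4, and the `(task_id, idx)` tuples standing in for image/mask pairs are all assumptions made for this example.

```python
import random


class ReservoirBuffer:
    """Fixed-capacity replay buffer filled by reservoir sampling (illustrative)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []   # stored samples from past tasks
        self.seen = 0     # total number of samples offered so far
        self.rng = random.Random(seed)

    def add(self, sample):
        """Offer one sample; every sample seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            # Replace a stored item with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def sample(self, k):
        """Draw up to k stored samples to mix into the current training batch."""
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


# Hypothetical stream: three tasks of ten samples each.
buffer = ReservoirBuffer(capacity=4)
for task_id in range(3):
    for idx in range(10):
        buffer.add((task_id, idx))

replayed = buffer.sample(2)
```

The buffer never grows past its capacity, which is why storage cost stays bounded as tasks accumulate. Parameter-isolation methods instead add model capacity per task, the trade-off the core claim describes.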

What carries the argument

The evaluation framework with three clinically motivated scenarios—Domain-CL for cross-center domain shift, Class-CL for incremental anatomical structures, and Organ-CL for cross-organ segmentation—plus metrics for general performance, forgetting, plasticity, forward generalizability, parameter efficiency, and replay burden.
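As a rough illustration of how such metrics are often computed, the sketch below derives four quantities from a score matrix `R`, where `R[i][j]` is the score (for example Dice) on task `j` after training through task `i`. The formulas follow conventions widely used in the general continual learning literature (final average performance, backward transfer, learning accuracy, forward transfer). They are assumptions for illustration and may differ from the paper's exact definitions. The function name `cl_metrics` and all numbers are hypothetical.

```python
def cl_metrics(R, baseline):
    """Compute common continual-learning metrics from a T x T score matrix.

    R[i][j]     : score on task j after training sequentially through task i.
    baseline[j] : score on task j of a reference model never trained on it.
    """
    T = len(R)
    if T < 2:
        raise ValueError("need at least two tasks")
    final = R[T - 1]

    # General performance: mean score over all tasks after the last task.
    avg_perf = sum(final) / T
    # Backward transfer: negative values indicate forgetting of earlier tasks.
    bwt = sum(final[j] - R[j][j] for j in range(T - 1)) / (T - 1)
    # Plasticity proxy: mean score on each task right after learning it.
    plasticity = sum(R[j][j] for j in range(T)) / T
    # Forward transfer: score on a task before training on it, versus baseline.
    fwt = sum(R[j - 1][j] - baseline[j] for j in range(1, T)) / (T - 1)

    return {"avg_perf": avg_perf, "bwt": bwt, "plasticity": plasticity, "fwt": fwt}


# Hypothetical three-task example (values invented for illustration).
R = [
    [0.90, 0.40, 0.30],
    [0.80, 0.85, 0.45],
    [0.75, 0.80, 0.88],
]
baseline = [0.10, 0.10, 0.10]
metrics = cl_metrics(R, baseline)
```

Reading the diagonal, the last row, and the entries above the diagonal separately is what lets a benchmark distinguish forgetting from plasticity and forward generalization instead of collapsing them into one number.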

Load-bearing premise

The selected representative continual learning methods adequately represent their categories and the metrics fully capture essential properties for clinical deployment.

What would settle it

A method that simultaneously delivers high general performance, low forgetting, high plasticity, strong forward generalizability, high parameter efficiency, and low replay burden across all three scenarios would show that satisfying every requirement at once is not as difficult as concluded.

Figures

Figures reproduced from arXiv: 2605.06160 by Bomin Wang, Hangqi Zhou, Xiahai Zhuang, Yibo Gao.

**Figure 1.** Overview of the proposed benchmark for continual medical image segmentation.

**Figure 2.** Task-order robustness performance of CL methods. Subfigures show (a) A-Dice and (b) BWTR results with

**Figure 3.** Additional analyses of replay-based methods and foundation-model behavior. (a) Effect of buffer size on A-Dice

**Figure 4.** CL and its connections with related machine learning paradigms.

Original abstract

Continual learning (CL) is essential for deploying medical image segmentation models in clinical environments where imaging domains, anatomical targets, and diagnostic tasks evolve over time. However, continual segmentation still faces three main challenges. First, the scenarios for this task remain insufficiently standardized for real-world clinical settings. Second, existing research has been primarily focused on mitigating forgetting, overlooking the other essential properties such as plasticity. Third, a benchmark work with comprehensive evaluation on existing methods is stll desirable. To address these gaps, we present such benchmark study of continual medical image segmentation. We first define three clinically motivated scenarios, namely Domain-CL, Class-CL, and Organ-CL, to respectively capture the cross-center domain shift, the incremental anatomical structure segmentation, and the cross-organ segmentation. We then introduce an evaluation framework that measures not only general performance and forgetting, but also plasticity, forward generalizability, parameter efficiency, and replay burden. The results, from extensive experiments with representative CL methods, showed that it was still challenging to develop a model that could satisfy all the requirements simultaneously. Nevertheless, these studies also suggested that the replay-based methods achieve the best overall balance between stability and plasticity, the parameter-isolation methods should be effective at reducing forgetting, though at the cost of increased model size, and the forward generalizability remain a significantly understudied aspect of this research field. Finally, we discuss related learning paradigms and outline future directions for continual medical image segmentation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript presents a benchmark study on continual learning (CL) for medical image segmentation. It defines three clinically motivated scenarios (Domain-CL, Class-CL, Organ-CL) to address standardization gaps. An evaluation framework is introduced that assesses not only forgetting but also plasticity, forward generalizability, parameter efficiency, and replay burden. Extensive experiments with representative CL methods lead to conclusions that balancing all requirements is challenging, replay-based methods offer the best stability-plasticity trade-off, parameter-isolation methods reduce forgetting at the expense of model size, and forward generalizability is understudied.

Significance. If the experimental results and category-level conclusions hold, this work provides a valuable comprehensive benchmark that shifts focus beyond forgetting to other clinically relevant properties in continual medical image segmentation. It highlights the difficulty in satisfying all desiderata simultaneously and identifies promising directions, such as the strengths of replay-based approaches. The introduction of new metrics and scenarios is a strength, offering a more holistic view than prior work focused narrowly on catastrophic forgetting.

major comments (2)

[§4 (Experiments)] The claim that replay-based methods achieve the best overall balance relies on the selected representative methods for each category. The paper should provide more details on the specific algorithms chosen (e.g., which replay variants and how many), their hyperparameter tuning process, and why they are representative, as narrow selection could bias the category-level rankings and the 'still challenging' assessment.
[§3 (Evaluation Framework)] The newly proposed metrics (plasticity, forward generalizability, replay burden) are central to the conclusions, but their definitions and validation against clinical deployment needs are not sufficiently justified. For example, it is unclear how these proxies correlate with real-world clinical requirements, which could affect the interpretation that forward generalizability remains understudied.

minor comments (3)

[Abstract] Typo: 'stll' should be 'still'.
[Abstract] The phrasing 'the parameter-isolation methods should be effective' is tentative; consider making it more definitive or explaining the basis if results support it.
[Throughout] Ensure all method names and acronyms are defined at first use for clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive overall assessment of our benchmark study. We address each major comment point by point below, providing the strongest honest responses possible based on the manuscript content. We will revise the paper to incorporate additional details and clarifications.


Referee: [§4 (Experiments)] The claim that replay-based methods achieve the best overall balance relies on the selected representative methods for each category. The paper should provide more details on the specific algorithms chosen (e.g., which replay variants and how many), their hyperparameter tuning process, and why they are representative, as narrow selection could bias the category-level rankings and the 'still challenging' assessment.

Authors: We agree that greater transparency on method selection is warranted to support the category-level conclusions. The manuscript already selects methods as standard representatives from each CL category based on their prevalence in the literature and suitability for segmentation (e.g., replay variants covering buffer-based and generative approaches). In the revision, we will expand §4 with: the complete list of chosen algorithms per category with citations, the hyperparameter tuning protocol (grid search over key parameters such as learning rate, buffer size, and regularization coefficients using validation splits from initial tasks), and explicit rationale for representativeness drawn from recent CL surveys. This addresses potential bias concerns without altering the experimental results or the assessment that balancing all requirements remains challenging.

Revision: yes

Referee: [§3 (Evaluation Framework)] The newly proposed metrics (plasticity, forward generalizability, replay burden) are central to the conclusions, but their definitions and validation against clinical deployment needs are not sufficiently justified. For example, it is unclear how these proxies correlate with real-world clinical requirements, which could affect the interpretation that forward generalizability remains understudied.

Authors: The metrics are defined in §3 to extend beyond forgetting and align with the three clinically motivated scenarios introduced in the paper. Plasticity captures adaptation to new domains/classes/organs, forward generalizability evaluates performance on future unseen data (critical for evolving clinical environments), and replay burden quantifies storage overhead relevant to deployment constraints. We will revise §3 to include more explicit linkages to clinical needs, supported by references to medical imaging deployment literature, and clarify limitations of these proxies. While a full empirical correlation study with real-world clinical outcomes exceeds the scope of this benchmark, the added discussion will better justify why forward generalizability is identified as understudied.

Revision: partial
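The tuning protocol named in the first response (grid search over learning rate, buffer size, and regularization coefficients) can be sketched as below. This is a generic exhaustive grid search, not the authors' code. The grid values are invented, and `toy_validation_score` is a hypothetical placeholder for "train on the initial tasks and score on their validation split".

```python
from itertools import product


def grid_search(grid, score_fn):
    """Try every combination in `grid` and return the best config and its score."""
    names = sorted(grid)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_validation_score(cfg):
    """Placeholder objective (assumption): peaks at lr=1e-3, buffer_size=200, reg_coef=0.1."""
    return (
        -abs(cfg["lr"] - 1e-3) * 100
        - abs(cfg["buffer_size"] - 200) / 1000
        - abs(cfg["reg_coef"] - 0.1)
    )


# Hypothetical search space; real ranges would come from the revised paper.
grid = {
    "lr": [1e-4, 1e-3, 1e-2],
    "buffer_size": [100, 200, 500],
    "reg_coef": [0.01, 0.1, 1.0],
}
best_cfg, best_score = grid_search(grid, toy_validation_score)
```

Selecting hyperparameters only on validation data from the initial tasks, as the response describes, avoids tuning on tasks the model is not supposed to have seen yet.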

Circularity Check

0 steps flagged

Empirical benchmark study with no circular derivations or self-referential predictions


The paper defines three clinically motivated scenarios (Domain-CL, Class-CL, Organ-CL), introduces an evaluation framework with metrics for general performance, forgetting, plasticity, forward generalizability, parameter efficiency, and replay burden, then reports results from experiments on representative CL methods. No mathematical derivations, equations, or predictions exist that reduce to fitted parameters or inputs by construction. Conclusions follow directly from the experimental comparisons against external method categories, rendering the work self-contained without load-bearing self-citations, ansatz smuggling, or renaming of known results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on empirical comparisons rather than new theoretical derivations; the main assumptions are about the representativeness of the scenarios and methods tested.

axioms (1)

Domain assumption: The three defined scenarios (Domain-CL, Class-CL, Organ-CL) adequately represent real-world clinical continual learning challenges.
Invoked when presenting the scenarios as clinically motivated to capture cross-center shift, incremental structures, and cross-organ segmentation.

pith-pipeline@v0.9.0 · 5565 in / 1464 out tokens · 68766 ms · 2026-05-08T13:43:58.793478+00:00 · methodology

