pith. machine review for the scientific record.

arxiv: 2604.02877 · v1 · submitted 2026-04-03 · 💻 cs.CV

Recognition: 1 theorem link

· Lean Theorem

Unlocking Positive Transfer in Incrementally Learning Surgical Instruments: A Self-reflection Hierarchical Prompt Framework

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 20:55 UTC · model grok-4.3

classification 💻 cs.CV
keywords class incremental learning · surgical instrument segmentation · prompt learning · knowledge transfer · catastrophic forgetting · hierarchical prompts · self-reflection · incremental segmentation

The pith

A hierarchical prompt framework on frozen models unlocks positive forward and backward knowledge transfer for incrementally learning surgical instruments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper aims to fix the oversight in prior incremental segmentation work by actively exploiting how past instrument knowledge can simplify learning new classes and how new classes can in turn sharpen old representations. It freezes a pre-trained backbone and grows a tree of prompts whose shared partitions let new instruments borrow reusable features while a graph-propagation step lets the model reflect on associations to polish everything already known. If the approach holds, models could keep expanding their instrument vocabulary over time without losing earlier skills, which matters for surgical video systems that must handle evolving tool sets in real procedures. The method reports gains above five percent on one benchmark and eleven percent on the other, and it works for both CNN and transformer backbones.

Core claim

The framework freezes a pre-trained model and adaptively appends instrument-aware prompts organized into a hierarchical parsing tree, with an instrument-shared root, n-part-shared intermediate nodes, and instrument-distinct leaves, so that new classes can draw on historical reusable knowledge for faster learning. It then performs self-reflection by propagating knowledge associations along a directed-weighted graph built from the tree, refining existing prompt representations to improve their quality without catastrophic forgetting of prior instruments.
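The root/intermediate/leaf organization described above can be sketched as a plain tree in which a new class inherits every shared partition on its path to the root. This is an illustrative reconstruction, not the paper's code: the `PromptNode` type, the `prompts_for` helper, and all node and prompt names are assumptions.

```python
# Toy sketch of a hierarchical prompt parsing tree: one instrument-shared
# root, n-part-shared intermediates, instrument-distinct leaves.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PromptNode:
    name: str
    prompts: List[str] = field(default_factory=list)  # stand-ins for prompt vectors
    parent: Optional["PromptNode"] = None

def prompts_for(leaf: PromptNode) -> List[str]:
    """Collect prompts along the leaf-to-root path: a new instrument
    reuses every shared partition above it and adds only its own
    instrument-distinct prompts."""
    out, node = [], leaf
    while node is not None:
        out = node.prompts + out  # root-first ordering
        node = node.parent
    return out

# Root shared by all instruments, one 2-part-shared intermediate,
# two instrument-distinct leaves.
root = PromptNode("instrument-shared", ["p_shared"])
two_part = PromptNode("2-part-shared", ["p_2part"], parent=root)
scissors = PromptNode("scissors", ["p_scissors"], parent=two_part)
grasper = PromptNode("grasper", ["p_grasper"], parent=two_part)

print(prompts_for(scissors))  # ['p_shared', 'p_2part', 'p_scissors']
```

Under this organization, adding a new leaf touches only its distinct partition; the shared partitions it inherits are exactly the "reusable historical knowledge" the claim refers to.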

What carries the argument

The hierarchical prompt parsing tree together with directed-weighted graph propagation for self-reflection, which organizes prompts by degree of sharing and updates them bidirectionally to support positive transfer.
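As a rough illustration of what the "self-reflection" step could look like, here is one propagation step over a toy directed-weighted association graph, where each old prompt is nudged toward a weighted mix of its in-neighbors. The update rule and the mixing coefficient `gamma` are assumptions; the paper's exact formula is not reproduced here.

```python
# One hedged propagation step over a directed-weighted association graph.
# values: {node: float} toy prompt representations.
# edges: {(src, dst): weight} directed association weights.

def propagate(values, edges, gamma=0.5):
    """Blend each node with the weighted average of its incoming messages;
    gamma controls how strongly old representations are refined."""
    incoming = {n: [] for n in values}
    for (src, dst), w in edges.items():
        incoming[dst].append(w * values[src])
    refined = {}
    for n, v in values.items():
        norm = sum(w for (s, d), w in edges.items() if d == n)
        msg = sum(incoming[n])
        refined[n] = (1 - gamma) * v + gamma * (msg / norm if norm else v)
    return refined

old = {"grasper": 1.0, "scissors": 3.0}
assoc = {("scissors", "grasper"): 1.0}  # a new class informs an old one
print(propagate(old, assoc))  # {'grasper': 2.0, 'scissors': 3.0}
```

The point of the sketch: the old class ("grasper") moves without its own weights being retrained, which is the mechanism by which backward transfer could avoid catastrophic forgetting.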

If this is right

  • New instrument classes learn more efficiently by accessing reusable historical knowledge through the shared prompt partitions.
  • Representations of previously learned instruments are refined and improved when new classes are added.
  • Catastrophic forgetting of old instruments is avoided while the model expands its capabilities.
  • The same prompt-tree mechanism delivers measurable gains on both CNN-based and transformer-based models.
  • Performance exceeds competing incremental methods by more than five percent on one public benchmark and eleven percent on the other.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same tree-plus-graph structure could be tested on incremental segmentation tasks outside surgery, such as autonomous driving scenes or robotic assembly.
  • If the graph stays sparse, the self-reflection step might continue to work across dozens of sequentially added classes without retraining the backbone.
  • Real-time operating-room systems could use the framework to incorporate novel tools observed during a procedure and immediately improve recognition of familiar ones.

Load-bearing premise

The hierarchical prompt parsing tree and directed-weighted graph propagation will reliably expose reusable knowledge and refine old representations without introducing negative transfer or instability when instrument classes are added sequentially.

What would settle it

Training the framework on a long sequence of instrument classes and observing that accuracy on earlier classes falls below a non-incremental baseline or that new-class accuracy shows no improvement over plain prompt tuning would falsify the positive-transfer claim.
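One concrete way to run that test is to track the standard backward-transfer (BWT) metric from the continual-learning literature, which the paper's supplement also adopts: a negative value signals the forgetting that would falsify the claim, a positive value the refinement the paper asserts. The accuracy matrix below uses toy numbers.

```python
# Backward transfer (BWT) in the style of Lopez-Paz & Ranzato (2017).
# acc[i][j] is accuracy on class-set j after finishing training episode i
# (toy values, not results from the paper).

def backward_transfer(acc):
    """BWT = mean over old tasks of (final accuracy - accuracy right after
    that task was learned). Negative = forgetting; positive = the backward
    transfer the paper claims."""
    T = len(acc)
    return sum(acc[T - 1][j] - acc[j][j] for j in range(T - 1)) / (T - 1)

acc = [
    [0.80, 0.00, 0.00],
    [0.82, 0.75, 0.00],
    [0.84, 0.77, 0.70],  # earlier classes improved after new ones were added
]
print(round(backward_transfer(acc), 3))  # 0.03
```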

Figures

Figures reproduced from arXiv: 2604.02877 by Kang Li, Pheng-Ann Heng, Yu Zhu, Zheng Li.

Figure 1. (a) Prior works learn each instrument prompt in …
Figure 2. Overview of our self-reflection hierarchical prompt framework. We progressively append instrument-aware prompts into a pre…
Figure 3. Visual comparisons of our approaches and highly competitive approaches. More comparison results are in the supplementary.
Figure 4. Effects of the decay factor γ and the teleport probability α on model performance.
Figure 2 (supplementary). Visualization comparison of segmentation results across all methods.
Figure 3 (supplementary). Automatic estimation of instrument part count.
read the original abstract

To continuously enhance model adaptability in surgical video scene parsing, recent studies incrementally update it to progressively learn to segment an increasing number of surgical instruments over time. However, prior works constantly overlooked the potential of positive forward knowledge transfer, i.e., how past knowledge could help learn new classes, and positive backward knowledge transfer, i.e., how learning new classes could help refine past knowledge. In this paper, we propose a self-reflection hierarchical prompt framework that unlocks the power of positive forward and backward knowledge transfer in class incremental segmentation, aiming to proficiently learn new instruments, improve existing skills of regular instruments, and avoid catastrophic forgetting of old instruments. Our framework is built on a frozen, pre-trained model that adaptively appends instrument-aware prompts for new classes throughout training episodes. To enable positive forward knowledge transfer, we organize instrument prompts into a hierarchical prompt parsing tree with the instrument-shared prompt partition as the root node, n-part-shared prompt partitions as intermediate nodes and instrument-distinct prompt partitions as leaf nodes, to expose the reusable historical knowledge for new classes to simplify their learning. Conversely, to encourage positive backward knowledge transfer, we conduct self-reflection refining on existing knowledge by directed-weighted graph propagation, examining the knowledge associations recorded in the tree to improve its representativeness without causing catastrophic forgetting. Our framework is applicable to both CNN-based models and advanced transformer-based foundation models, yielding more than 5% and 11% improvements over the competing methods on two public benchmarks respectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper claims to unlock positive forward and backward knowledge transfer in class-incremental surgical instrument segmentation via a self-reflection hierarchical prompt framework. It builds on a frozen pre-trained model that appends instrument-aware prompts organized in a hierarchical parsing tree (instrument-shared root, n-part-shared intermediates, distinct leaves) to enable forward transfer of reusable knowledge, and uses directed-weighted graph propagation for self-reflection to refine old representations for backward transfer without catastrophic forgetting. The approach is stated to apply to both CNN and transformer models, with reported gains exceeding 5% and 11% over competing methods on two public benchmarks.

Significance. If the empirical claims hold under detailed validation, the work would be significant for incremental learning in medical computer vision by explicitly targeting positive transfer in both directions, an aspect often neglected in prior incremental segmentation methods. Applicability to both CNN-based models and transformer foundation models is a concrete strength that could extend impact to surgical video analysis pipelines.

major comments (3)
  1. [Abstract] Abstract: the central claim of positive forward and backward transfer yielding >5% and >11% improvements rests on benchmark results, yet the abstract (and visible description) provides no details on experimental protocol, statistical tests, ablation studies, or exact baselines, making the positive-transfer assertion difficult to evaluate.
  2. [Method (hierarchical prompt parsing tree and directed-weighted graph propagation)] Hierarchical prompt parsing tree and directed-weighted graph propagation: the directed-weighted graph propagation is presented as enabling positive backward transfer by examining associations in the tree, but no quantitative isolation of its effect on old-class performance after new instruments are added is shown; without this, the risk of negative transfer remains unaddressed when instrument features overlap (e.g., graspers and scissors).
  3. [Method] Method: prompt partition granularity is a free parameter in the hierarchical tree construction; this undercuts the claim of fully adaptive, reusable knowledge exposure for new classes without additional tuning.
minor comments (1)
  1. [Method] The notation and propagation rule for the directed-weighted graph could be formalized with an equation or pseudocode to clarify how associations are updated without introducing instability.
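For illustration, one propagation rule consistent with the decay factor γ and teleport probability α that the paper sweeps in Figure 4 would be a personalized-PageRank-style update; the exact form used in the paper may differ.

```latex
% Hedged sketch, not the paper's stated equation: a personalized-
% PageRank-style propagation over the prompt association graph.
P^{(t+1)} = \alpha\, P^{(0)} + (1-\alpha)\,\gamma\, \tilde{A}^{\top} P^{(t)},
\qquad
\tilde{A}_{ij} = \frac{A_{ij}}{\sum_{k} A_{ik}}
```

Here P^(t) stacks the prompt representations and A is the directed-weighted adjacency recorded in the tree; with row-normalized A, α in (0, 1], and γ in (0, 1), the iteration is a contraction, which would address the referee's stability concern.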

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to improve clarity and provide additional evidence where needed.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim of positive forward and backward transfer yielding >5% and >11% improvements rests on benchmark results, yet the abstract (and visible description) provides no details on experimental protocol, statistical tests, ablation studies, or exact baselines, making the positive-transfer assertion difficult to evaluate.

    Authors: We agree the abstract should better contextualize the claims. In revision we will expand it to name the two benchmarks (EndoVis and CholecSeg8k), list the primary baselines, and note that gains are supported by multiple runs with statistical testing. Full protocols, exact numbers, and ablations will stay in the main text and supplement due to length limits. revision: yes

  2. Referee: [Method (hierarchical prompt parsing tree and directed-weighted graph propagation)] Hierarchical prompt parsing tree and directed-weighted graph propagation: the directed-weighted graph propagation is presented as enabling positive backward transfer by examining associations in the tree, but no quantitative isolation of its effect on old-class performance after new instruments are added is shown; without this, the risk of negative transfer remains unaddressed when instrument features overlap (e.g., graspers and scissors).

    Authors: We acknowledge the need for isolation. We will add an ablation in the revised Section 4.3 that reports old-class mIoU before/after new-class addition, with and without the graph-propagation module. This will quantify backward-transfer gains and show mitigation of negative transfer for overlapping instruments (graspers/scissors) via the hierarchical associations. revision: yes

  3. Referee: [Method] Method: prompt partition granularity is a free parameter in the hierarchical tree construction; this undercuts the claim of fully adaptive, reusable knowledge exposure for new classes without additional tuning.

    Authors: The granularity follows a fixed, predefined instrument taxonomy (root = all instruments, intermediates = shared parts such as shaft/tip, leaves = distinct tools) derived from standard surgical knowledge; it is not tuned per run or per incremental step. We will revise the method section to state this construction explicitly and add a sensitivity study in the supplement demonstrating robustness. revision: partial
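A fixed taxonomy of the kind the rebuttal describes could be expressed as plain data driving tree construction. The part and tool names below are illustrative placeholders, not the taxonomy the authors actually use.

```python
# Hypothetical fixed taxonomy: root = all instruments, intermediates =
# shared parts, leaves = distinct tools. Names are illustrative only.
TAXONOMY = {
    "instrument-shared": {
        "shaft-shared": {"grasper": {}, "scissors": {}},
        "tip-shared": {"needle-driver": {}},
    }
}

def leaves(tree):
    """Enumerate the instrument-distinct leaf nodes of the taxonomy."""
    out = []
    for name, sub in tree.items():
        if sub:
            out += leaves(sub)
        else:
            out.append(name)
    return out

print(sorted(leaves(TAXONOMY)))  # ['grasper', 'needle-driver', 'scissors']
```

Because the structure is declared up front rather than tuned, the partition granularity stops being a per-run free parameter, which is the substance of the authors' reply.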

Circularity Check

0 steps flagged

No circularity detected; framework is an independent architectural proposal

full rationale

The paper proposes a self-reflection hierarchical prompt framework consisting of a prompt parsing tree for forward transfer and directed-weighted graph propagation for backward transfer. These are presented as explicit design choices in the abstract and methods, with claimed gains validated empirically on public benchmarks rather than derived from equations that reduce outputs to fitted inputs or self-citations. No load-bearing step matches any enumerated circularity pattern; the derivation chain remains self-contained against external data.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The framework rests on a frozen pre-trained backbone and introduces two new conceptual structures whose effectiveness is asserted rather than derived from prior results.

free parameters (1)
  • prompt partition granularity
    Choice of how to group instruments into n-part-shared nodes is not derived from data and must be selected per dataset.
axioms (1)
  • domain assumption: A frozen pre-trained model remains effective for new classes when only prompts are added and updated.
    Explicitly stated as the foundation of the framework.
invented entities (2)
  • hierarchical prompt parsing tree (no independent evidence)
    purpose: Organize prompts to expose reusable historical knowledge for forward transfer
    Newly defined structure with root, intermediate, and leaf partitions.
  • directed-weighted graph propagation for self-reflection (no independent evidence)
    purpose: Refine existing knowledge associations for backward transfer
    New mechanism that examines tree-recorded associations.

pith-pipeline@v0.9.0 · 5570 in / 1350 out tokens · 48845 ms · 2026-05-13T20:55:02.870823+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

    Relation between the paper passage and the cited Recognition theorem.

    we organize instrument prompts into a hierarchical prompt parsing tree with the instrument-shared prompt partition as the root node, n-part-shared prompt partitions as intermediate nodes and instrument-distinct prompt partitions as leaf nodes... directed-weighted graph propagation

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · 1 internal anchor

  1. [1]

    2017 robotic instrument segmentation challenge

    Max Allan, Alex Shvets, Thomas Kurmann, Zichen Zhang, Rahul Duggal, Yun-Hsuan Su, Nicola Rieke, Iro Laina, Niveditha Kalavakonda, Sebastian Bodenstedt, et al. 2017 robotic instrument segmentation challenge. arXiv preprint arXiv:1902.06426, 2019.

  2. [2]

    2018 robotic scene segmentation challenge

    Max Allan, Satoshi Kondo, Sebastian Bodenstedt, Stefan Leger, Rahim Kadkhodamohammadi, Imanol Luengo, Felix Fuentes, Evangello Flouty, Ahmed Mohammed, Marius Pedersen, et al. 2018 robotic scene segmentation challenge. arXiv preprint arXiv:2001.11190, 2020.

  3. [3]

    Code-cl: Conceptor-based gradient projection for deep continual learning

    Marco PE Apolinario, Sakshi Choudhary, and Kaushik Roy. Code-cl: Conceptor-based gradient projection for deep continual learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 775–784.

  4. [4]

    Algebraic Perron-Frobenius theory

    GP Barker and Hans Schneider. Algebraic Perron-Frobenius theory. Linear Algebra and its Applications, 11(3):219–233.

  5. [5]

    Modeling the background for incremental learning in semantic segmentation

    Fabio Cermelli, Massimiliano Mancini, Samuel Rota Bulo, Elisa Ricci, and Barbara Caputo. Modeling the background for incremental learning in semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9233–9242, 2020.

  6. [6]

    Ma-sam: Modality-agnostic sam adaptation for 3d medical image segmentation

    Cheng Chen, Juzheng Miao, Dufan Wu, Aoxiao Zhong, Zhiling Yan, Sekeun Kim, Jiang Hu, Zhengliang Liu, Lichao Sun, Xiang Li, et al. Ma-sam: Modality-agnostic sam adaptation for 3d medical image segmentation. Medical Image Analysis, 98:103310, 2024.

  7. [7]

    Rethinking atrous convolution for semantic image segmentation

    Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587.

  8. [8]

    Don't forget, there is more than forgetting: new metrics for continual learning

    Natalia Díaz-Rodríguez, Vincenzo Lomonaco, David Filliat, and Davide Maltoni. Don't forget, there is more than forgetting: new metrics for continual learning. arXiv preprint arXiv:1810.13166, 2018.

  9. [9]

    Plop: Learning without forgetting for continual semantic segmentation

    Arthur Douillard, Yifu Chen, Arnaud Dapogny, and Matthieu Cord. Plop: Learning without forgetting for continual semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4040–4050, 2021.

  10. [10]

    Prompt tuning for parameter-efficient medical image segmentation

    Marc Fischer, Alexander Bartler, and Bin Yang. Prompt tuning for parameter-efficient medical image segmentation. Medical Image Analysis, 91:103024, 2024.

  11. [11]

    Multi-domain incremental learning for semantic segmentation

    Prachi Garg, Rohit Saluja, Vineeth N Balasubramanian, Chetan Arora, Anbumani Subramanian, and CV Jawahar. Multi-domain incremental learning for semantic segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 761–771, 2022.

  12. [12]

    Isinet: an instance-based approach for surgical instrument segmentation

    Cristina González, Laura Bravo-Sánchez, and Pablo Arbelaez. Isinet: an instance-based approach for surgical instrument segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 595–605. Springer, 2020.

  13. [13]

    Multiple prompt fusion for zero-shot lesion detection using vision-language models

    Miaotian Guo, Huahui Yi, Ziyuan Qin, Haiying Wang, Aidong Men, and Qicheng Lao. Multiple prompt fusion for zero-shot lesion detection using vision-language models. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 283–292. Springer.

  14. [14]

    Cholecseg8k: a semantic segmentation dataset for laparoscopic cholecystectomy based on cholec80

    W-Y Hong, C-L Kao, Y-H Kuo, J-R Wang, W-L Chang, and C-S Shih. Cholecseg8k: a semantic segmentation dataset for laparoscopic cholecystectomy based on cholec80. arXiv preprint arXiv:2012.12453, 2020.

  15. [15]

    Segment anything

    Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4015–4026, 2023.

  16. [16]

    Domain-incremental cardiac image segmentation with style-oriented replay and domain-sensitive feature whitening

    Kang Li, Lequan Yu, and Pheng-Ann Heng. Domain-incremental cardiac image segmentation with style-oriented replay and domain-sensitive feature whitening. IEEE Transactions on Medical Imaging, 42(3):570–581, 2022.

  17. [17]

    A dual enrichment synergistic strategy to handle data heterogeneity for domain incremental cardiac segmentation

    Kang Li, Yu Zhu, Lequan Yu, and Pheng-Ann Heng. A dual enrichment synergistic strategy to handle data heterogeneity for domain incremental cardiac segmentation. IEEE Transactions on Medical Imaging, 43(6):2279–2290, 2024.

  18. [18]

    Caprompt: Cyclic prompt aggregation for pre-trained model based class incremental learning

    Qiwei Li and Jiahuan Zhou. Caprompt: Cyclic prompt aggregation for pre-trained model based class incremental learning. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 18421–18429, 2025.

  19. [19]

    Learning without forgetting

    Zhizhong Li and Derek Hoiem. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12):2935–2947, 2017.

  20. [20]

    Learning incrementally to segment multiple organs in a ct image

    Pengbo Liu, Xia Wang, Mengsi Fan, Hongli Pan, Minmin Yin, Xiaohong Zhu, Dandan Du, Xiaoying Zhao, Li Xiao, Lian Ding, et al. Learning incrementally to segment multiple organs in a ct image. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 714–724. Springer, 2022.

  21. [21]

    Gradient episodic memory for continual learning

    David Lopez-Paz and Marc'Aurelio Ranzato. Gradient episodic memory for continual learning. Advances in Neural Information Processing Systems, 30, 2017.

  22. [22]

    m2caiseg: Semantic segmentation of laparoscopic images using convolutional neural networks

    Salman Maqbool, Aqsa Riaz, Hasan Sajid, and Osman Hasan. m2caiseg: Semantic segmentation of laparoscopic images using convolutional neural networks. arXiv preprint arXiv:2008.10134, 2020.

  23. [23]

    Incremental learning techniques for semantic segmentation

    Umberto Michieli and Pietro Zanuttigh. Incremental learning techniques for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2019.

  24. [24]

    Continual semantic segmentation via repulsion-attraction of sparse and disentangled latent representations

    Umberto Michieli and Pietro Zanuttigh. Continual semantic segmentation via repulsion-attraction of sparse and disentangled latent representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1114–1124, 2021.

  25. [25]

    The pagerank citation ranking: Bringing order to the web

    Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, 1999.

  26. [26]

    Pearl: Input-agnostic prompt enhancement with negative feedback regulation for class-incremental learning

    Yongchun Qin, Pengfei Fang, and Hui Xue. Pearl: Input-agnostic prompt enhancement with negative feedback regulation for class-incremental learning. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 20051–20059, 2025.

  27. [27]

    The surprising positive knowledge transfer in continual 3d object shape reconstruction

    Anh Thai, Stefan Stojanov, Zixuan Huang, and James M Rehg. The surprising positive knowledge transfer in continual 3d object shape reconstruction. In 2022 International Conference on 3D Vision (3DV), pages 209–218. IEEE.

  28. [28]

    Digraph inception convolutional networks

    Zekun Tong, Yuxuan Liang, Changsheng Sun, Xinke Li, David Rosenblum, and Andrew Lim. Digraph inception convolutional networks. Advances in Neural Information Processing Systems, 33:17907–17918, 2020.

  29. [29]

    Three types of incremental learning

    Gido M Van de Ven, Tinne Tuytelaars, and Andreas S Tolias. Three types of incremental learning. Nature Machine Intelligence, 4(12):1185–1197, 2022.

  30. [30]

    Emerging robotic platforms for minimally invasive surgery

    Valentina Vitiello, Su-Lin Lee, Thomas P Cundy, and Guang-Zhong Yang. Emerging robotic platforms for minimally invasive surgery. IEEE Reviews in Biomedical Engineering, 6:111–126, 2012.

  31. [31]

    Afec: Active forgetting of negative transfer in continual learning

    Liyuan Wang, Mingtian Zhang, Zhongfan Jia, Qian Li, Chenglong Bao, Kaisheng Ma, Jun Zhu, and Yi Zhong. Afec: Active forgetting of negative transfer in continual learning. Advances in Neural Information Processing Systems, 34:22379–22391, 2021.

  32. [32]

    S-prompts learning with pre-trained transformers: An occam's razor for domain incremental learning

    Yabin Wang, Zhiwu Huang, and Xiaopeng Hong. S-prompts learning with pre-trained transformers: An occam's razor for domain incremental learning. Advances in Neural Information Processing Systems, 35:5682–5695, 2022.

  33. [33]

    Dualprompt: Complementary prompting for rehearsal-free continual learning

    Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, et al. Dualprompt: Complementary prompting for rehearsal-free continual learning. In European Conference on Computer Vision, pages 631–648. Springer.

  34. [34]

    Learning to prompt for continual learning

    Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, and Tomas Pfister. Learning to prompt for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 139–149.

  35. [35]

    Endpoints weight fusion for class incremental semantic segmentation

    Jia-Wen Xiao, Chang-Bin Zhang, Jiekang Feng, Xialei Liu, Joost van de Weijer, and Ming-Ming Cheng. Endpoints weight fusion for class incremental semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7204–7213, 2023.

  36. [36]

    Privacy-preserving synthetic continual semantic segmentation for robotic surgery

    Mengya Xu, Mobarakol Islam, Long Bai, and Hongliang Ren. Privacy-preserving synthetic continual semantic segmentation for robotic surgery. IEEE Transactions on Medical Imaging, 2024.

  37. [37]

    Continual learning through synaptic intelligence

    Friedemann Zenke, Ben Poole, and Surya Ganguli. Continual learning through synaptic intelligence. In International Conference on Machine Learning, pages 3987–3995. PMLR.

  38. [38]

    Representation compensation networks for continual semantic segmentation

    Chang-Bin Zhang, Jia-Wen Xiao, Xialei Liu, Ying-Cong Chen, and Ming-Ming Cheng. Representation compensation networks for continual semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7053–7064, 2022.

  39. [39]

    Coinseg: Contrast inter- and intra-class representations for incremental segmentation

    Zekang Zhang, Guangyu Gao, Jianbo Jiao, Chi Harold Liu, and Yunchao Wei. Coinseg: Contrast inter- and intra-class representations for incremental segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 843–853, 2023.

  40. [40]

    Memory-efficient prompt tuning for incremental histopathology classification

    Yu Zhu, Kang Li, Lequan Yu, and Pheng Ann Heng. Memory-efficient prompt tuning for incremental histopathology classification. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 7802–7810, 2024.

  41. [41]

    How many parts does this instrument consist of?

    MLLM-based Part Count Estimation. In most surgical video datasets, instrument annotations are provided in a part-wise manner, i.e., each part of the instrument is labeled as a different category. Consequently, the number of parts for each instrument class can be directly obtained as a prior (all four datasets used in our paper follow this convention). ...

  42. [42]

    Based on the supplementary materials, along with the provided code and the open-source implementations of [6] and [28], our project can be fully reproduced

    Implementation details. We follow the preprocessing procedure described in [6] to preprocess the dataset. Based on the supplementary materials, along with the provided code and the open-source implementations of [6] and [28], our project can be fully reproduced. When using SAM, we adopt the SAM ViT-H [15] version as our encoder θ_Enc. In our practical im...

  43. [43]

    Illustration of HPPT construction

    Metrics for Class-Incremental Learning. To provide a deeper understanding of model performance in the class-incremental learning (CIL) setting, we adopt two widely-used evaluation metrics: Backward Transfer (BWT) ... [Figure 1: Illustration of HPPT construction.]

  44. [44]

    Training and Inference Scheme. In the first episode, we freeze the encoder of SAM and jointly train the decoder, segmentation head, adapter, and the prompt parsing tree. Since each image in the dataset typically contains more than one class of surgical instrument, we activate all instrument-aware prompts during the forward propagation, allowing each pixe...

  45. [45]

    Discussion. Although prompt-based approaches and foundation models have recently emerged in surgical instrument segmentation, none have addressed the Class-Incremental Learning (CIL) setting. To the best of our knowledge, this work is the first to integrate prompt tuning with the Segment Anything Model (SAM) specifically for surgical instrument class i...