DC-TTA: Divide-and-Conquer Framework for Test-Time Adaptation of Interactive Segmentation
Pith reviewed 2026-05-19 08:05 UTC · model grok-4.3
The pith
Partitioning user clicks into coherent subsets lets SAM adapt at test time without cue conflicts for better interactive segmentation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Instead of forcing a single model to incorporate all user clicks at once, DC-TTA partitions the clicks into more coherent subsets, each processed independently via test-time adaptation with a separated model; the adapted models are then merged to form a unified predictor that integrates the specialized knowledge from each subset.
What carries the argument
The divide-and-conquer partitioning of clicks into coherent subsets for independent per-subset test-time adaptation followed by model merging.
Load-bearing premise
Splitting the clicks into coherent subsets reduces conflicts among diverse cues and enables more localized updates that improve the final merged predictor.
What would settle it
A benchmark run in which merging separately adapted models on partitioned clicks produces equal or lower accuracy than a single model adapted on all clicks together.
Figures
read the original abstract
Interactive segmentation (IS) allows users to iteratively refine object boundaries with minimal cues, such as positive and negative clicks. While the Segment Anything Model (SAM) has garnered attention in the IS community for its promptable segmentation capabilities, it often struggles in specialized domains or when handling complex scenarios (e.g., camouflaged or multi-part objects). To overcome these challenges, we propose DC-TTA, a novel test-time adaptation (TTA) framework that adapts SAM on a per-sample basis by leveraging user interactions as supervision. Instead of forcing a single model to incorporate all user clicks at once, DC-TTA partitions the clicks into more coherent subsets, each processed independently via TTA with a separated model. This Divide-and-Conquer strategy reduces conflicts among diverse cues and enables more localized updates. Finally, we merge the adapted models to form a unified predictor that integrates the specialized knowledge from each subset. Experimental results across various benchmarks demonstrate that DC-TTA significantly outperforms SAM's zero-shot results and conventional TTA methods, effectively handling complex tasks such as camouflaged object segmentation with fewer interactions and improved accuracy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents DC-TTA, a divide-and-conquer framework for test-time adaptation in interactive segmentation. It partitions user clicks into coherent subsets, adapts separate SAM models on each subset using TTA, and merges the results to create a final predictor. This is claimed to handle conflicts in cues better than applying TTA to all clicks at once, leading to improved performance on standard benchmarks and complex tasks like camouflaged object segmentation compared to zero-shot SAM and conventional TTA methods.
Significance. Should the gains prove to stem from the coherence of the partitions rather than from the general benefits of adapting and merging multiple models, the approach could meaningfully advance interactive segmentation by allowing more effective use of user inputs in difficult scenarios. The framework is empirical and relies on the specific partitioning strategy, so its significance hinges on demonstrating that this strategy is superior to alternatives like random partitioning.
major comments (2)
- [§3] §3 (Method): The central claim that coherent partitioning specifically reduces cue conflicts and enables superior localized updates is not isolated experimentally. No control is described that holds the number of adapted models and total TTA optimization steps fixed while comparing coherent partitioning against random partitioning or a single-model baseline.
- [§4] §4 (Experiments): The reported outperformance on benchmarks lacks accompanying error bars, statistical tests, or ablations that directly vary only the partitioning criterion. This leaves open whether gains derive from ensembling effects rather than the coherence assumption highlighted in the abstract.
minor comments (2)
- [Abstract] The abstract could specify the exact benchmarks, metrics, and interaction counts used to support claims of 'fewer interactions and improved accuracy'.
- Notation for the final merging step would benefit from an explicit equation defining how the specialized predictors are combined into the unified model.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight important aspects for strengthening the empirical support of our claims. We address each major comment below and will incorporate revisions to provide the requested controls and statistical rigor.
read point-by-point responses
-
Referee: [§3] §3 (Method): The central claim that coherent partitioning specifically reduces cue conflicts and enables superior localized updates is not isolated experimentally. No control is described that holds the number of adapted models and total TTA optimization steps fixed while comparing coherent partitioning against random partitioning or a single-model baseline.
Authors: We agree that the current experiments do not fully isolate the contribution of coherent partitioning. While the manuscript includes a single-model TTA baseline, it lacks a matched control with random partitioning under fixed model count and optimization steps. We will add this ablation in the revised version to directly test whether coherence (rather than the mere presence of multiple adapted models) drives the observed benefits. revision: yes
-
Referee: [§4] §4 (Experiments): The reported outperformance on benchmarks lacks accompanying error bars, statistical tests, or ablations that directly vary only the partitioning criterion. This leaves open whether gains derive from ensembling effects rather than the coherence assumption highlighted in the abstract.
Authors: We acknowledge that the experimental section would benefit from error bars, statistical significance testing, and ablations that isolate the partitioning criterion. In the revision we will report error bars across multiple runs, include statistical tests, and add an ablation that varies only the partitioning strategy (coherent vs. random) while holding the number of models and total TTA steps constant, thereby clarifying that improvements are attributable to cue coherence rather than generic ensembling. revision: yes
Circularity Check
No circularity: empirical framework with independent experimental validation
full rationale
The paper introduces DC-TTA as a practical divide-and-conquer TTA method for interactive segmentation: user clicks are partitioned into coherent subsets, each subset drives independent per-sample adaptation of a separate SAM-based model, and the resulting models are merged into a final predictor. All performance claims rest on reported benchmark experiments comparing against SAM zero-shot and standard single-model TTA baselines. No equations, fitted parameters, or derivations appear in the provided text that would reduce the reported gains to quantities defined by construction within the paper itself. The central design choice (coherent partitioning) is motivated by an empirical hypothesis about cue conflict and is evaluated directly on external datasets rather than being justified by self-citation chains or self-referential definitions. Consequently the derivation chain is self-contained and the circularity score is 0.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption User clicks provide reliable supervision for test-time adaptation
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
DC-TTA partitions the clicks into more coherent subsets, each processed independently via TTA with a separated model... merge the adapted models to form a unified predictor
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Divide-and-Conquer strategy reduces conflicts among diverse cues and enables more localized updates
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Interactive graph cuts for optimal boundary & region segmentation of objects in nd images
Yuri Y Boykov and M-P Jolly. Interactive graph cuts for optimal boundary & region segmentation of objects in nd images. In Proceedings eighth IEEE international confer- ence on computer vision. ICCV 2001, pages 105–112. IEEE,
work page 2001
-
[2]
Contrastive test-time adaptation
Dian Chen, Dequan Wang, Trevor Darrell, and Sayna Ebrahimi. Contrastive test-time adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, pages 295–305, 2022. 2
work page 2022
-
[3]
Focalclick: Towards prac- tical interactive image segmentation
Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, and Hengshuang Zhao. Focalclick: Towards prac- tical interactive image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1300–1309, 2022. 2, 7, 8
work page 2022
- [4]
-
[5]
To adapt or not to adapt? real- time adaptation for semantic segmentation
Marc Botet Colomer, Pier Luigi Dovesi, Theodoros Pana- giotakopoulos, Joao Frederico Carvalho, Linus H ¨arenstam- Nielsen, Hossein Azizpour, Hedvig Kjellstr ¨om, Daniel Cre- mers, and Matteo Poggi. To adapt or not to adapt? real- time adaptation for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages ...
work page 2023
-
[6]
Adaptive stochastic weight averaging
Caglar Demir, Arnab Sharma, and Axel-Cyrille Ngonga Ngomo. Adaptive stochastic weight averaging. arXiv preprint arXiv:2406.19092, 2024. 3
-
[7]
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Syl- vain Gelly, et al. An image is worth 16x16 words: Trans- formers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020. 7
work page internal anchor Pith review Pith/arXiv arXiv 2010
-
[8]
The pascal visual object classes (voc) challenge
Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The pascal visual object classes (voc) challenge. International journal of computer vision, 88:303–338, 2010. 6, 7
work page 2010
-
[9]
Camouflaged object de- tection
Deng-Ping Fan, Ge-Peng Ji, Guolei Sun, Ming-Ming Cheng, Jianbing Shen, and Ling Shao. Camouflaged object de- tection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2777–2787,
-
[10]
Uncertainty reduction for model adaptation in semantic segmentation
Francois Fleuret et al. Uncertainty reduction for model adaptation in semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9613–9623, 2021. 2
work page 2021
-
[11]
Getting to 99% accuracy in interactive seg- mentation
Marco Forte, Brian Price, Scott Cohen, Ning Xu, and Franc ¸ois Piti´e. Getting to 99% accuracy in interactive seg- mentation. arXiv preprint arXiv:2003.07932, 2020. 2
-
[12]
Random walks for image segmentation
Leo Grady. Random walks for image segmentation. IEEE transactions on pattern analysis and machine intelligence , 28(11):1768–1783, 2006. 2
work page 2006
-
[13]
Geodesic star convexity for interactive image segmentation
Varun Gulshan, Carsten Rother, Antonio Criminisi, Andrew Blake, and Andrew Zisserman. Geodesic star convexity for interactive image segmentation. In 2010 IEEE Computer So- ciety Conference on Computer Vision and Pattern Recogni- tion, pages 3129–3136. IEEE, 2010. 2
work page 2010
-
[14]
Trash- can: A semantically-segmented dataset towards visual de- tection of marine debris
Jungseok Hong, Michael Fulton, and Junaed Sattar. Trash- can: A semantically-segmented dataset towards visual de- tection of marine debris. arXiv preprint arXiv:2007.08097,
-
[15]
Foc- sam: Delving deeply into focused objects in segmenting any- thing
You Huang, Zongyu Lan, Liujuan Cao, Xianming Lin, Shengchuan Zhang, Guannan Jiang, and Rongrong Ji. Foc- sam: Delving deeply into focused objects in segmenting any- thing. In Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition , pages 3120–3130,
-
[16]
Editing Models with Task Arithmetic
Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, and Ali Farhadi. Editing models with task arithmetic. arXiv preprint arXiv:2212.04089, 2022. 2, 3
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[17]
Interactive image seg- mentation via backpropagating refinement scheme
Won-Dong Jang and Chang-Su Kim. Interactive image seg- mentation via backpropagating refinement scheme. In Pro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5297–5306, 2019. 2
work page 2019
-
[18]
Fine-tuning linear layers only is a simple yet effective way for task arithmetic
Ruochen Jin, Bojian Hou, Jiancong Xiao, Weijie Su, and Li Shen. Fine-tuning linear layers only is a simple yet effective way for task arithmetic. arXiv preprint arXiv:2407.07089 ,
-
[19]
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer White- head, Alexander C Berg, Wan-Yen Lo, et al. Segment any- thing. In Proceedings of the IEEE/CVF International Con- ference on Computer Vision, pages 4015–4026, 2023. 1, 2, 3, 7, 8
work page 2023
-
[20]
Continuous adaptation for interactive ob- ject segmentation by learning from corrections
Theodora Kontogianni, Michael Gygli, Jasper Uijlings, and Vittorio Ferrari. Continuous adaptation for interactive ob- ject segmentation by learning from corrections. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVI 16 , pages 579–596. Springer, 2020. 2
work page 2020
-
[21]
Anabranch network for camouflaged object segmentation
Trung-Nghia Le, Tam V Nguyen, Zhongliang Nie, Minh- Triet Tran, and Akihiro Sugimoto. Anabranch network for camouflaged object segmentation. Computer vision and im- age understanding, 184:45–56, 2019. 6, 7, 8 9
work page 2019
-
[22]
Mfp: Mak- ing full use of probability maps for interactive image seg- mentation
Chaewon Lee, Seon-Ho Lee, and Chang-Su Kim. Mfp: Mak- ing full use of probability maps for interactive image seg- mentation. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pages 4051–4059. IEEE, 2024. 2, 8
work page 2024
-
[23]
Interactive learn- ing for semantic segmentation in earth observation
Gaston Lenczner, Adrien Chan-Hon-Tong, Nicola Luminari, Bertrand Le Saux, and Guy Le Besnerais. Interactive learn- ing for semantic segmentation in earth observation. arXiv preprint arXiv:2009.11250, 2020. 2
-
[24]
Interactive image segmentation with latent diversity
Zhuwen Li, Qifeng Chen, and Vladlen Koltun. Interactive image segmentation with latent diversity. In Proceedings of the IEEE conference on computer vision and pattern recog- nition, pages 577–585, 2018. 2, 7
work page 2018
-
[25]
Jian Liang, Dapeng Hu, and Jiashi Feng. Do we really need to access the source data? source hypothesis transfer for un- supervised domain adaptation. In International conference on machine learning, pages 6028–6039. PMLR, 2020. 2
work page 2020
-
[26]
Regional interactive image segmentation net- works
JunHao Liew, Yunchao Wei, Wei Xiong, Sim-Heng Ong, and Jiashi Feng. Regional interactive image segmentation net- works. In 2017 IEEE international conference on computer vision (ICCV), pages 2746–2754. IEEE, 2017. 2
work page 2017
-
[27]
Microsoft coco: Common objects in context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll´ar, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In Computer vision–ECCV 2014: 13th European conference, zurich, Switzerland, September 6-12, 2014, proceedings, part v 13, pages 740–755. Springer, 2014. 6
work page 2014
-
[28]
Interactive image segmentation with first click attention
Zheng Lin, Zhao Zhang, Lin-Zhuo Chen, Ming-Ming Cheng, and Shao-Ping Lu. Interactive image segmentation with first click attention. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 13339–13348, 2020. 2
work page 2020
-
[29]
Click prompt learning with optimal trans- port for interactive segmentation
Jie Liu, Haochen Wang, Wenzhe Yin, Jan-Jakob Sonke, and Efstratios Gavves. Click prompt learning with optimal trans- port for interactive segmentation. In European Conference on Computer Vision, pages 93–110. Springer, 2024. 2
work page 2024
-
[30]
Pseudoclick: Interactive image segmentation with click imi- tation
Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, and Ziyan Wu. Pseudoclick: Interactive image segmentation with click imi- tation. In European Conference on Computer Vision, pages 728–745. Springer, 2022. 2
work page 2022
-
[31]
Simpleclick: Interactive image segmentation with sim- ple vision transformers
Qin Liu, Zhenlin Xu, Gedas Bertasius, and Marc Nietham- mer. Simpleclick: Interactive image segmentation with sim- ple vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 22290– 22300, 2023. 7
work page 2023
-
[32]
Rethinking interactive image segmentation with low latency high quality and diverse prompts
Qin Liu, Jaemin Cho, Mohit Bansal, and Marc Nietham- mer. Rethinking interactive image segmentation with low latency high quality and diverse prompts. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3773–3782, 2024. 2
work page 2024
-
[33]
Tangent transformers for composition, privacy and removal
Tian Yu Liu, Aditya Golatkar, and Stefano Soatto. Tangent transformers for composition, privacy and removal. arXiv preprint arXiv:2307.08122, 2023. 3
-
[34]
Decoupled Weight Decay Regularization
Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017. 7
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[35]
David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings eighth IEEE international conference on computer vision. ICCV 2001 , pages 416–423. IEEE, 2001. 6, 7
work page 2001
-
[36]
Towards stable test-time adaptation in dynamic wild world
Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Zhiquan Wen, Yaofo Chen, Peilin Zhao, and Mingkui Tan. Towards stable test-time adaptation in dynamic wild world. In Internetional Conference on Learning Representations, 2023. 2
work page 2023
-
[37]
Task arithmetic in the tangent space: Improved editing of pre-trained models
Guillermo Ortiz-Jimenez, Alessandro Favero, and Pascal Frossard. Task arithmetic in the tangent space: Improved editing of pre-trained models. Advances in Neural Informa- tion Processing Systems, 36, 2024. 3
work page 2024
-
[38]
Hyojin Park, Alan Yessenbayev, Tushar Singhal, Navin Ku- mar Adhikari, Yizhe Zhang, Shubhankar Mangesh Borse, Hong Cai, Nilesh Prasad Pandey, Fei Yin, Frank Mayer, et al. Real-time, accurate, and consistent video semantic segmentation via unsupervised adaptation and cross-unit de- ployment on mobile device. InProceedings of the IEEE/CVF Conference on Comp...
work page 2022
-
[39]
A benchmark dataset and evaluation methodology for video object segmentation
Federico Perazzi, Jordi Pont-Tuset, Brian McWilliams, Luc Van Gool, Markus Gross, and Alexander Sorkine-Hornung. A benchmark dataset and evaluation methodology for video object segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 724–732,
-
[40]
” grabcut” interactive foreground extraction using iterated graph cuts
Carsten Rother, Vladimir Kolmogorov, and Andrew Blake. ” grabcut” interactive foreground extraction using iterated graph cuts. ACM transactions on graphics (TOG) , 23(3): 309–314, 2004. 2
work page 2004
-
[41]
Adapting the segment anything model during usage in novel situations
Robin Sch ¨on, Julian Lorenz, Katja Ludwig, and Rainer Lien- hart. Adapting the segment anything model during usage in novel situations. In Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 3616–3626, 2024. 1, 2, 7, 8
work page 2024
-
[42]
f-brs: Rethinking backpropagating refinement for interactive segmentation
Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, and Anton Konushin. f-brs: Rethinking backpropagating refinement for interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 8623–8632, 2020. 6, 7
work page 2020
-
[43]
f-brs: Rethinking backpropagating refinement for interactive segmentation
Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, and Anton Konushin. f-brs: Rethinking backpropagating refinement for interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 8623–8632, 2020. 2
work page 2020
-
[44]
Re- viving iterative training with mask guidance for interactive segmentation
Konstantin Sofiiuk, Ilya A Petrov, and Anton Konushin. Re- viving iterative training with mask guidance for interactive segmentation. In 2022 IEEE International Conference on Image Processing (ICIP), pages 3141–3145. IEEE, 2022. 2, 7, 8
work page 2022
-
[45]
Cfr-icl: Cascade-forward refinement with iterative click loss for interactive image segmentation
Shoukun Sun, Min Xian, Fei Xu, Luca Capriotti, and Tiankai Yao. Cfr-icl: Cascade-forward refinement with iterative click loss for interactive image segmentation. In Proceedings of the AAAI conference on artificial intelligence , pages 5017– 5024, 2024. 2
work page 2024
-
[46]
Parameter efficient multi- 10 task model fusion with partial linearization
Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, and Dacheng Tao. Parameter efficient multi- 10 task model fusion with partial linearization. arXiv preprint arXiv:2310.04742, 2023. 3
-
[47]
Tesla: Test-time self-learning with automatic adversarial augmentation
Devavrat Tomar, Guillaume Vray, Behzad Bozorgtabar, and Jean-Philippe Thiran. Tesla: Test-time self-learning with automatic adversarial augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20341–20350, 2023. 2
work page 2023
-
[48]
On the road to online adaptation for semantic image segmentation
Riccardo V olpi, Pau De Jorge, Diane Larlus, and Gabriela Csurka. On the road to online adaptation for semantic image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 19184– 19195, 2022
work page 2022
-
[49]
Tent: Fully Test-time Adaptation by Entropy Minimization
Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Ol- shausen, and Trevor Darrell. Tent: Fully test-time adaptation by entropy minimization. arXiv preprint arXiv:2006.10726,
work page internal anchor Pith review Pith/arXiv arXiv 2006
-
[50]
Jifeng Wang, Xiang Li, and Jian Yang. Stacked condi- tional generative adversarial networks for jointly learning shadow detection and shadow removal. In Proceedings of the IEEE conference on computer vision and pattern recog- nition, pages 1788–1797, 2018. 6, 7, 8
work page 2018
-
[51]
Continual test-time domain adaptation
Qin Wang, Olga Fink, Luc Van Gool, and Dengxin Dai. Continual test-time domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7201–7211, 2022. 2
work page 2022
-
[52]
Wei Wang, Zhun Zhong, Weijie Wang, Xi Chen, Charles Ling, Boyu Wang, and Nicu Sebe. Dynamically instance- guided adaptation: A backward-free approach for test-time domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24090–24099, 2023
work page 2023
-
[53]
Continual test-time domain adaptation via dynamic sample selection
Yanshuo Wang, Jie Hong, Ali Cheraghian, Shafin Rahman, David Ahmedt-Aristizabal, Lars Petersson, and Mehrtash Harandi. Continual test-time domain adaptation via dynamic sample selection. In Proceedings of the IEEE/CVF Win- ter Conference on Applications of Computer Vision , pages 1701–1710, 2024. 2
work page 2024
-
[54]
Focused and collaborative feedback integration for interactive image seg- mentation
Qiaoqiao Wei, Hui Zhang, and Jun-Hai Yong. Focused and collaborative feedback integration for interactive image seg- mentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 18643– 18652, 2023. 2
work page 2023
-
[55]
Mitchell Wortsman, Gabriel Ilharco, Samir Ya Gadre, Re- becca Roelofs, Raphael Gontijo-Lopes, Ari S Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon, Simon Ko- rnblith, et al. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing in- ference time. In International conference on machine learn- ing, pages 23965...
work page 2022
-
[56]
Robust fine-tuning of zero-shot models
Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gon- tijo Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, et al. Robust fine-tuning of zero-shot models. In Proceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 7959–7971, 2022. 3
work page 2022
-
[57]
Ties-merging: Resolving interference when merging models
Prateek Yadav, Derek Tam, Leshem Choshen, Colin A Raf- fel, and Mohit Bansal. Ties-merging: Resolving interference when merging models. Advances in Neural Information Pro- cessing Systems, 36:7093–7115, 2023. 3
work page 2023
-
[58]
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xi- aochun Cao, Jie Zhang, and Dacheng Tao. Model merging in llms, mllms, and beyond: Methods, theories, applications and opportunities. arXiv preprint arXiv:2408.07666, 2024. 3
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[59]
Open-vocabulary sam: Seg- ment and recognize twenty-thousand classes interactively
Haobo Yuan, Xiangtai Li, Chong Zhou, Yining Li, Kai Chen, and Chen Change Loy. Open-vocabulary sam: Seg- ment and recognize twenty-thousand classes interactively. In European Conference on Computer Vision, pages 419–437. Springer, 2024. 1, 2
work page 2024
-
[60]
Yizhe Zhang, Shubhankar Borse, Hong Cai, and Fatih Porikli. Auxadapt: Stable and efficient test-time adaptation for temporally consistent video semantic segmentation. In Proceedings of the IEEE/CVF Winter Conference on Appli- cations of Computer Vision, pages 2339–2348, 2022. 2
work page 2022
-
[61]
Graco: Granularity-controllable interactive segmentation
Yian Zhao, Kehan Li, Zesen Cheng, Pengchong Qiao, Xi- awu Zheng, Rongrong Ji, Chang Liu, Li Yuan, and Jie Chen. Graco: Granularity-controllable interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 3501–3510, 2024. 2
work page 2024
-
[62]
Interactive segmentation as gaussion process classification
Minghao Zhou, Hong Wang, Qian Zhao, Yuexiang Li, Yawen Huang, Deyu Meng, and Yefeng Zheng. Interactive segmentation as gaussion process classification. In Proceed- ings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19488–19497, 2023. 2 11
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.