DC-TTA: Divide-and-Conquer Framework for Test-Time Adaptation of Interactive Segmentation

arxiv: 2506.23104 · v2 · submitted 2025-06-29 · 💻 cs.CV

DC-TTA: Divide-and-Conquer Framework for Test-Time Adaptation of Interactive Segmentation

Jihun Kim , Hoyong Kwon , Hyeokjun Kweon , Wooseong Jeong , Kuk-Jin Yoon This is my paper

Pith reviewed 2026-05-19 08:05 UTC · model grok-4.3

classification 💻 cs.CV

keywords test-time adaptationinteractive segmentationclick partitioningSegment Anything Modelcamouflaged objectsmodel merging

0 comments p. Extension

The pith

Partitioning user clicks into coherent subsets lets SAM adapt at test time without cue conflicts for better interactive segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a test-time adaptation approach that breaks a set of user clicks into smaller, more consistent groups rather than feeding them all to one model at the same time. Each group drives its own independent adaptation of the Segment Anything Model, after which the specialized versions are combined into a single final predictor. This strategy is meant to cut down on contradictory signals from mixed positive and negative clicks and to support more focused updates that work especially well on hard cases like camouflaged or multi-part objects. A reader would care because it points to a practical way to make interactive segmentation more accurate while requiring fewer corrections from the user.

Core claim

Instead of forcing a single model to incorporate all user clicks at once, DC-TTA partitions the clicks into more coherent subsets, each processed independently via test-time adaptation with a separated model; the adapted models are then merged to form a unified predictor that integrates the specialized knowledge from each subset.

What carries the argument

The divide-and-conquer partitioning of clicks into coherent subsets for independent per-subset test-time adaptation followed by model merging.

Load-bearing premise

Splitting the clicks into coherent subsets reduces conflicts among diverse cues and enables more localized updates that improve the final merged predictor.

What would settle it

A benchmark run in which merging separately adapted models on partitioned clicks produces equal or lower accuracy than a single model adapted on all clicks together.

Figures

Figures reproduced from arXiv: 2506.23104 by Hoyong Kwon, Hyeokjun Kweon, Jihun Kim, Kuk-Jin Yoon, Wooseong Jeong.

**Figure 2.** Figure 2: Overview of the naive TTA approach. Each iteration, a [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Illustration of our DC strategy (without TTA). Given [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Overview of the proposed DC-TTA framework. First, user clicks are assigned into segmentation units (left). Each unit is then [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 5.** Figure 5: A qualitative comparison between the baseline method and our DC-TTA. Each subfigure shows the segmentation result after [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

**Figure 6.** Figure 6: Qualitative examples of challenging camouflaged objects from CAMO [ [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗

**Figure 7.** Figure 7: Segmentation units identified in our DC-TTA. User [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗

read the original abstract

Interactive segmentation (IS) allows users to iteratively refine object boundaries with minimal cues, such as positive and negative clicks. While the Segment Anything Model (SAM) has garnered attention in the IS community for its promptable segmentation capabilities, it often struggles in specialized domains or when handling complex scenarios (e.g., camouflaged or multi-part objects). To overcome these challenges, we propose DC-TTA, a novel test-time adaptation (TTA) framework that adapts SAM on a per-sample basis by leveraging user interactions as supervision. Instead of forcing a single model to incorporate all user clicks at once, DC-TTA partitions the clicks into more coherent subsets, each processed independently via TTA with a separated model. This Divide-and-Conquer strategy reduces conflicts among diverse cues and enables more localized updates. Finally, we merge the adapted models to form a unified predictor that integrates the specialized knowledge from each subset. Experimental results across various benchmarks demonstrate that DC-TTA significantly outperforms SAM's zero-shot results and conventional TTA methods, effectively handling complex tasks such as camouflaged object segmentation with fewer interactions and improved accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents DC-TTA, a divide-and-conquer framework for test-time adaptation in interactive segmentation. It partitions user clicks into coherent subsets, adapts separate SAM models on each subset using TTA, and merges the results to create a final predictor. This is claimed to handle conflicts in cues better than applying TTA to all clicks at once, leading to improved performance on standard benchmarks and complex tasks like camouflaged object segmentation compared to zero-shot SAM and conventional TTA methods.

Significance. Should the gains prove to stem from the coherence of the partitions rather than from the general benefits of adapting and merging multiple models, the approach could meaningfully advance interactive segmentation by allowing more effective use of user inputs in difficult scenarios. The framework is empirical and relies on the specific partitioning strategy, so its significance hinges on demonstrating that this strategy is superior to alternatives like random partitioning.

major comments (2)

[§3] §3 (Method): The central claim that coherent partitioning specifically reduces cue conflicts and enables superior localized updates is not isolated experimentally. No control is described that holds the number of adapted models and total TTA optimization steps fixed while comparing coherent partitioning against random partitioning or a single-model baseline.
[§4] §4 (Experiments): The reported outperformance on benchmarks lacks accompanying error bars, statistical tests, or ablations that directly vary only the partitioning criterion. This leaves open whether gains derive from ensembling effects rather than the coherence assumption highlighted in the abstract.

minor comments (2)

[Abstract] The abstract could specify the exact benchmarks, metrics, and interaction counts used to support claims of 'fewer interactions and improved accuracy'.
Notation for the final merging step would benefit from an explicit equation defining how the specialized predictors are combined into the unified model.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects for strengthening the empirical support of our claims. We address each major comment below and will incorporate revisions to provide the requested controls and statistical rigor.

read point-by-point responses

Referee: [§3] §3 (Method): The central claim that coherent partitioning specifically reduces cue conflicts and enables superior localized updates is not isolated experimentally. No control is described that holds the number of adapted models and total TTA optimization steps fixed while comparing coherent partitioning against random partitioning or a single-model baseline.

Authors: We agree that the current experiments do not fully isolate the contribution of coherent partitioning. While the manuscript includes a single-model TTA baseline, it lacks a matched control with random partitioning under fixed model count and optimization steps. We will add this ablation in the revised version to directly test whether coherence (rather than the mere presence of multiple adapted models) drives the observed benefits. revision: yes
Referee: [§4] §4 (Experiments): The reported outperformance on benchmarks lacks accompanying error bars, statistical tests, or ablations that directly vary only the partitioning criterion. This leaves open whether gains derive from ensembling effects rather than the coherence assumption highlighted in the abstract.

Authors: We acknowledge that the experimental section would benefit from error bars, statistical significance testing, and ablations that isolate the partitioning criterion. In the revision we will report error bars across multiple runs, include statistical tests, and add an ablation that varies only the partitioning strategy (coherent vs. random) while holding the number of models and total TTA steps constant, thereby clarifying that improvements are attributable to cue coherence rather than generic ensembling. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical framework with independent experimental validation

full rationale

The paper introduces DC-TTA as a practical divide-and-conquer TTA method for interactive segmentation: user clicks are partitioned into coherent subsets, each subset drives independent per-sample adaptation of a separate SAM-based model, and the resulting models are merged into a final predictor. All performance claims rest on reported benchmark experiments comparing against SAM zero-shot and standard single-model TTA baselines. No equations, fitted parameters, or derivations appear in the provided text that would reduce the reported gains to quantities defined by construction within the paper itself. The central design choice (coherent partitioning) is motivated by an empirical hypothesis about cue conflict and is evaluated directly on external datasets rather than being justified by self-citation chains or self-referential definitions. Consequently the derivation chain is self-contained and the circularity score is 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on the assumption that user-provided clicks serve as reliable per-sample supervision signals for adaptation; no new physical entities or large numbers of free parameters are introduced in the visible description.

axioms (1)

domain assumption User clicks provide reliable supervision for test-time adaptation
The method uses positive and negative clicks as direct supervision to adapt the model on each subset.

pith-pipeline@v0.9.0 · 5739 in / 1207 out tokens · 26542 ms · 2026-05-19T08:05:32.569182+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

DC-TTA partitions the clicks into more coherent subsets, each processed independently via TTA with a separated model... merge the adapted models to form a unified predictor
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Divide-and-Conquer strategy reduces conflicts among diverse cues and enables more localized updates

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · 5 internal anchors

[1]

Interactive graph cuts for optimal boundary & region segmentation of objects in nd images

Yuri Y Boykov and M-P Jolly. Interactive graph cuts for optimal boundary & region segmentation of objects in nd images. In Proceedings eighth IEEE international confer- ence on computer vision. ICCV 2001, pages 105–112. IEEE,

work page 2001
[2]

Contrastive test-time adaptation

Dian Chen, Dequan Wang, Trevor Darrell, and Sayna Ebrahimi. Contrastive test-time adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, pages 295–305, 2022. 2

work page 2022
[3]

Focalclick: Towards prac- tical interactive image segmentation

Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, and Hengshuang Zhao. Focalclick: Towards prac- tical interactive image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1300–1309, 2022. 2, 7, 8

work page 2022
[4]

Sam-med2d

Junlong Cheng, Jin Ye, Zhongying Deng, Jianpin Chen, Tianbin Li, Haoyu Wang, Yanzhou Su, Ziyan Huang, Ji- long Chen, Lei Jiang, et al. Sam-med2d. arXiv preprint arXiv:2308.16184, 2023. 1, 2

work page arXiv 2023
[5]

To adapt or not to adapt? real- time adaptation for semantic segmentation

Marc Botet Colomer, Pier Luigi Dovesi, Theodoros Pana- giotakopoulos, Joao Frederico Carvalho, Linus H ¨arenstam- Nielsen, Hossein Azizpour, Hedvig Kjellstr ¨om, Daniel Cre- mers, and Matteo Poggi. To adapt or not to adapt? real- time adaptation for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages ...

work page 2023
[6]

Adaptive stochastic weight averaging

Caglar Demir, Arnab Sharma, and Axel-Cyrille Ngonga Ngomo. Adaptive stochastic weight averaging. arXiv preprint arXiv:2406.19092, 2024. 3

work page arXiv 2024
[7]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Syl- vain Gelly, et al. An image is worth 16x16 words: Trans- formers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020. 7

work page internal anchor Pith review Pith/arXiv arXiv 2010
[8]

The pascal visual object classes (voc) challenge

Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The pascal visual object classes (voc) challenge. International journal of computer vision, 88:303–338, 2010. 6, 7

work page 2010
[9]

Camouflaged object de- tection

Deng-Ping Fan, Ge-Peng Ji, Guolei Sun, Ming-Ming Cheng, Jianbing Shen, and Ling Shao. Camouflaged object de- tection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2777–2787,

work page
[10]

Uncertainty reduction for model adaptation in semantic segmentation

Francois Fleuret et al. Uncertainty reduction for model adaptation in semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9613–9623, 2021. 2

work page 2021
[11]

Getting to 99% accuracy in interactive seg- mentation

Marco Forte, Brian Price, Scott Cohen, Ning Xu, and Franc ¸ois Piti´e. Getting to 99% accuracy in interactive seg- mentation. arXiv preprint arXiv:2003.07932, 2020. 2

work page arXiv 2003
[12]

Random walks for image segmentation

Leo Grady. Random walks for image segmentation. IEEE transactions on pattern analysis and machine intelligence , 28(11):1768–1783, 2006. 2

work page 2006
[13]

Geodesic star convexity for interactive image segmentation

Varun Gulshan, Carsten Rother, Antonio Criminisi, Andrew Blake, and Andrew Zisserman. Geodesic star convexity for interactive image segmentation. In 2010 IEEE Computer So- ciety Conference on Computer Vision and Pattern Recogni- tion, pages 3129–3136. IEEE, 2010. 2

work page 2010
[14]

Trash- can: A semantically-segmented dataset towards visual de- tection of marine debris

Jungseok Hong, Michael Fulton, and Junaed Sattar. Trash- can: A semantically-segmented dataset towards visual de- tection of marine debris. arXiv preprint arXiv:2007.08097,

work page arXiv 2007
[15]

Foc- sam: Delving deeply into focused objects in segmenting any- thing

You Huang, Zongyu Lan, Liujuan Cao, Xianming Lin, Shengchuan Zhang, Guannan Jiang, and Rongrong Ji. Foc- sam: Delving deeply into focused objects in segmenting any- thing. In Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition , pages 3120–3130,

work page
[16]

Editing Models with Task Arithmetic

Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, and Ali Farhadi. Editing models with task arithmetic. arXiv preprint arXiv:2212.04089, 2022. 2, 3

work page internal anchor Pith review Pith/arXiv arXiv 2022
[17]

Interactive image seg- mentation via backpropagating refinement scheme

Won-Dong Jang and Chang-Su Kim. Interactive image seg- mentation via backpropagating refinement scheme. In Pro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5297–5306, 2019. 2

work page 2019
[18]

Fine-tuning linear layers only is a simple yet effective way for task arithmetic

Ruochen Jin, Bojian Hou, Jiancong Xiao, Weijie Su, and Li Shen. Fine-tuning linear layers only is a simple yet effective way for task arithmetic. arXiv preprint arXiv:2407.07089 ,

work page arXiv
[19]

Segment any- thing

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer White- head, Alexander C Berg, Wan-Yen Lo, et al. Segment any- thing. In Proceedings of the IEEE/CVF International Con- ference on Computer Vision, pages 4015–4026, 2023. 1, 2, 3, 7, 8

work page 2023
[20]

Continuous adaptation for interactive ob- ject segmentation by learning from corrections

Theodora Kontogianni, Michael Gygli, Jasper Uijlings, and Vittorio Ferrari. Continuous adaptation for interactive ob- ject segmentation by learning from corrections. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVI 16 , pages 579–596. Springer, 2020. 2

work page 2020
[21]

Anabranch network for camouflaged object segmentation

Trung-Nghia Le, Tam V Nguyen, Zhongliang Nie, Minh- Triet Tran, and Akihiro Sugimoto. Anabranch network for camouflaged object segmentation. Computer vision and im- age understanding, 184:45–56, 2019. 6, 7, 8 9

work page 2019
[22]

Mfp: Mak- ing full use of probability maps for interactive image seg- mentation

Chaewon Lee, Seon-Ho Lee, and Chang-Su Kim. Mfp: Mak- ing full use of probability maps for interactive image seg- mentation. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pages 4051–4059. IEEE, 2024. 2, 8

work page 2024
[23]

Interactive learn- ing for semantic segmentation in earth observation

Gaston Lenczner, Adrien Chan-Hon-Tong, Nicola Luminari, Bertrand Le Saux, and Guy Le Besnerais. Interactive learn- ing for semantic segmentation in earth observation. arXiv preprint arXiv:2009.11250, 2020. 2

work page arXiv 2009
[24]

Interactive image segmentation with latent diversity

Zhuwen Li, Qifeng Chen, and Vladlen Koltun. Interactive image segmentation with latent diversity. In Proceedings of the IEEE conference on computer vision and pattern recog- nition, pages 577–585, 2018. 2, 7

work page 2018
[25]

Do we really need to access the source data? source hypothesis transfer for un- supervised domain adaptation

Jian Liang, Dapeng Hu, and Jiashi Feng. Do we really need to access the source data? source hypothesis transfer for un- supervised domain adaptation. In International conference on machine learning, pages 6028–6039. PMLR, 2020. 2

work page 2020
[26]

Regional interactive image segmentation net- works

JunHao Liew, Yunchao Wei, Wei Xiong, Sim-Heng Ong, and Jiashi Feng. Regional interactive image segmentation net- works. In 2017 IEEE international conference on computer vision (ICCV), pages 2746–2754. IEEE, 2017. 2

work page 2017
[27]

Microsoft coco: Common objects in context

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll´ar, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In Computer vision–ECCV 2014: 13th European conference, zurich, Switzerland, September 6-12, 2014, proceedings, part v 13, pages 740–755. Springer, 2014. 6

work page 2014
[28]

Interactive image segmentation with first click attention

Zheng Lin, Zhao Zhang, Lin-Zhuo Chen, Ming-Ming Cheng, and Shao-Ping Lu. Interactive image segmentation with first click attention. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 13339–13348, 2020. 2

work page 2020
[29]

Click prompt learning with optimal trans- port for interactive segmentation

Jie Liu, Haochen Wang, Wenzhe Yin, Jan-Jakob Sonke, and Efstratios Gavves. Click prompt learning with optimal trans- port for interactive segmentation. In European Conference on Computer Vision, pages 93–110. Springer, 2024. 2

work page 2024
[30]

Pseudoclick: Interactive image segmentation with click imi- tation

Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, and Ziyan Wu. Pseudoclick: Interactive image segmentation with click imi- tation. In European Conference on Computer Vision, pages 728–745. Springer, 2022. 2

work page 2022
[31]

Simpleclick: Interactive image segmentation with sim- ple vision transformers

Qin Liu, Zhenlin Xu, Gedas Bertasius, and Marc Nietham- mer. Simpleclick: Interactive image segmentation with sim- ple vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 22290– 22300, 2023. 7

work page 2023
[32]

Rethinking interactive image segmentation with low latency high quality and diverse prompts

Qin Liu, Jaemin Cho, Mohit Bansal, and Marc Nietham- mer. Rethinking interactive image segmentation with low latency high quality and diverse prompts. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3773–3782, 2024. 2

work page 2024
[33]

Tangent transformers for composition, privacy and removal

Tian Yu Liu, Aditya Golatkar, and Stefano Soatto. Tangent transformers for composition, privacy and removal. arXiv preprint arXiv:2307.08122, 2023. 3

work page arXiv 2023
[34]

Decoupled Weight Decay Regularization

Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017. 7

work page internal anchor Pith review Pith/arXiv arXiv 2017
[35]

A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics

David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings eighth IEEE international conference on computer vision. ICCV 2001 , pages 416–423. IEEE, 2001. 6, 7

work page 2001
[36]

Towards stable test-time adaptation in dynamic wild world

Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Zhiquan Wen, Yaofo Chen, Peilin Zhao, and Mingkui Tan. Towards stable test-time adaptation in dynamic wild world. In Internetional Conference on Learning Representations, 2023. 2

work page 2023
[37]

Task arithmetic in the tangent space: Improved editing of pre-trained models

Guillermo Ortiz-Jimenez, Alessandro Favero, and Pascal Frossard. Task arithmetic in the tangent space: Improved editing of pre-trained models. Advances in Neural Informa- tion Processing Systems, 36, 2024. 3

work page 2024
[38]

Real-time, accurate, and consistent video semantic segmentation via unsupervised adaptation and cross-unit de- ployment on mobile device

Hyojin Park, Alan Yessenbayev, Tushar Singhal, Navin Ku- mar Adhikari, Yizhe Zhang, Shubhankar Mangesh Borse, Hong Cai, Nilesh Prasad Pandey, Fei Yin, Frank Mayer, et al. Real-time, accurate, and consistent video semantic segmentation via unsupervised adaptation and cross-unit de- ployment on mobile device. InProceedings of the IEEE/CVF Conference on Comp...

work page 2022
[39]

A benchmark dataset and evaluation methodology for video object segmentation

Federico Perazzi, Jordi Pont-Tuset, Brian McWilliams, Luc Van Gool, Markus Gross, and Alexander Sorkine-Hornung. A benchmark dataset and evaluation methodology for video object segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 724–732,

work page
[40]

” grabcut” interactive foreground extraction using iterated graph cuts

Carsten Rother, Vladimir Kolmogorov, and Andrew Blake. ” grabcut” interactive foreground extraction using iterated graph cuts. ACM transactions on graphics (TOG) , 23(3): 309–314, 2004. 2

work page 2004
[41]

Adapting the segment anything model during usage in novel situations

Robin Sch ¨on, Julian Lorenz, Katja Ludwig, and Rainer Lien- hart. Adapting the segment anything model during usage in novel situations. In Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 3616–3626, 2024. 1, 2, 7, 8

work page 2024
[42]

f-brs: Rethinking backpropagating refinement for interactive segmentation

Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, and Anton Konushin. f-brs: Rethinking backpropagating refinement for interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 8623–8632, 2020. 6, 7

work page 2020
[43]

f-brs: Rethinking backpropagating refinement for interactive segmentation

Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, and Anton Konushin. f-brs: Rethinking backpropagating refinement for interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 8623–8632, 2020. 2

work page 2020
[44]

Re- viving iterative training with mask guidance for interactive segmentation

Konstantin Sofiiuk, Ilya A Petrov, and Anton Konushin. Re- viving iterative training with mask guidance for interactive segmentation. In 2022 IEEE International Conference on Image Processing (ICIP), pages 3141–3145. IEEE, 2022. 2, 7, 8

work page 2022
[45]

Cfr-icl: Cascade-forward refinement with iterative click loss for interactive image segmentation

Shoukun Sun, Min Xian, Fei Xu, Luca Capriotti, and Tiankai Yao. Cfr-icl: Cascade-forward refinement with iterative click loss for interactive image segmentation. In Proceedings of the AAAI conference on artificial intelligence , pages 5017– 5024, 2024. 2

work page 2024
[46]

Parameter efficient multi- 10 task model fusion with partial linearization

Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, and Dacheng Tao. Parameter efficient multi- 10 task model fusion with partial linearization. arXiv preprint arXiv:2310.04742, 2023. 3

work page arXiv 2023
[47]

Tesla: Test-time self-learning with automatic adversarial augmentation

Devavrat Tomar, Guillaume Vray, Behzad Bozorgtabar, and Jean-Philippe Thiran. Tesla: Test-time self-learning with automatic adversarial augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20341–20350, 2023. 2

work page 2023
[48]

On the road to online adaptation for semantic image segmentation

Riccardo V olpi, Pau De Jorge, Diane Larlus, and Gabriela Csurka. On the road to online adaptation for semantic image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 19184– 19195, 2022

work page 2022
[49]

Tent: Fully Test-time Adaptation by Entropy Minimization

Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Ol- shausen, and Trevor Darrell. Tent: Fully test-time adaptation by entropy minimization. arXiv preprint arXiv:2006.10726,

work page internal anchor Pith review Pith/arXiv arXiv 2006
[50]

Stacked condi- tional generative adversarial networks for jointly learning shadow detection and shadow removal

Jifeng Wang, Xiang Li, and Jian Yang. Stacked condi- tional generative adversarial networks for jointly learning shadow detection and shadow removal. In Proceedings of the IEEE conference on computer vision and pattern recog- nition, pages 1788–1797, 2018. 6, 7, 8

work page 2018
[51]

Continual test-time domain adaptation

Qin Wang, Olga Fink, Luc Van Gool, and Dengxin Dai. Continual test-time domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7201–7211, 2022. 2

work page 2022
[52]

Dynamically instance- guided adaptation: A backward-free approach for test-time domain adaptive semantic segmentation

Wei Wang, Zhun Zhong, Weijie Wang, Xi Chen, Charles Ling, Boyu Wang, and Nicu Sebe. Dynamically instance- guided adaptation: A backward-free approach for test-time domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24090–24099, 2023

work page 2023
[53]

Continual test-time domain adaptation via dynamic sample selection

Yanshuo Wang, Jie Hong, Ali Cheraghian, Shafin Rahman, David Ahmedt-Aristizabal, Lars Petersson, and Mehrtash Harandi. Continual test-time domain adaptation via dynamic sample selection. In Proceedings of the IEEE/CVF Win- ter Conference on Applications of Computer Vision , pages 1701–1710, 2024. 2

work page 2024
[54]

Focused and collaborative feedback integration for interactive image seg- mentation

Qiaoqiao Wei, Hui Zhang, and Jun-Hai Yong. Focused and collaborative feedback integration for interactive image seg- mentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 18643– 18652, 2023. 2

work page 2023
[55]

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing in- ference time

Mitchell Wortsman, Gabriel Ilharco, Samir Ya Gadre, Re- becca Roelofs, Raphael Gontijo-Lopes, Ari S Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon, Simon Ko- rnblith, et al. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing in- ference time. In International conference on machine learn- ing, pages 23965...

work page 2022
[56]

Robust fine-tuning of zero-shot models

Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gon- tijo Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, et al. Robust fine-tuning of zero-shot models. In Proceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 7959–7971, 2022. 3

work page 2022
[57]

Ties-merging: Resolving interference when merging models

Prateek Yadav, Derek Tam, Leshem Choshen, Colin A Raf- fel, and Mohit Bansal. Ties-merging: Resolving interference when merging models. Advances in Neural Information Pro- cessing Systems, 36:7093–7115, 2023. 3

work page 2023
[58]

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xi- aochun Cao, Jie Zhang, and Dacheng Tao. Model merging in llms, mllms, and beyond: Methods, theories, applications and opportunities. arXiv preprint arXiv:2408.07666, 2024. 3

work page internal anchor Pith review Pith/arXiv arXiv 2024
[59]

Open-vocabulary sam: Seg- ment and recognize twenty-thousand classes interactively

Haobo Yuan, Xiangtai Li, Chong Zhou, Yining Li, Kai Chen, and Chen Change Loy. Open-vocabulary sam: Seg- ment and recognize twenty-thousand classes interactively. In European Conference on Computer Vision, pages 419–437. Springer, 2024. 1, 2

work page 2024
[60]

Auxadapt: Stable and efficient test-time adaptation for temporally consistent video semantic segmentation

Yizhe Zhang, Shubhankar Borse, Hong Cai, and Fatih Porikli. Auxadapt: Stable and efficient test-time adaptation for temporally consistent video semantic segmentation. In Proceedings of the IEEE/CVF Winter Conference on Appli- cations of Computer Vision, pages 2339–2348, 2022. 2

work page 2022
[61]

Graco: Granularity-controllable interactive segmentation

Yian Zhao, Kehan Li, Zesen Cheng, Pengchong Qiao, Xi- awu Zheng, Rongrong Ji, Chang Liu, Li Yuan, and Jie Chen. Graco: Granularity-controllable interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 3501–3510, 2024. 2

work page 2024
[62]

Interactive segmentation as gaussion process classification

Minghao Zhou, Hong Wang, Qian Zhao, Yuexiang Li, Yawen Huang, Deyu Meng, and Yefeng Zheng. Interactive segmentation as gaussion process classification. In Proceed- ings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19488–19497, 2023. 2 11

work page 2023

[1] [1]

Interactive graph cuts for optimal boundary & region segmentation of objects in nd images

Yuri Y Boykov and M-P Jolly. Interactive graph cuts for optimal boundary & region segmentation of objects in nd images. In Proceedings eighth IEEE international confer- ence on computer vision. ICCV 2001, pages 105–112. IEEE,

work page 2001

[2] [2]

Contrastive test-time adaptation

Dian Chen, Dequan Wang, Trevor Darrell, and Sayna Ebrahimi. Contrastive test-time adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, pages 295–305, 2022. 2

work page 2022

[3] [3]

Focalclick: Towards prac- tical interactive image segmentation

Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, and Hengshuang Zhao. Focalclick: Towards prac- tical interactive image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1300–1309, 2022. 2, 7, 8

work page 2022

[4] [4]

Sam-med2d

Junlong Cheng, Jin Ye, Zhongying Deng, Jianpin Chen, Tianbin Li, Haoyu Wang, Yanzhou Su, Ziyan Huang, Ji- long Chen, Lei Jiang, et al. Sam-med2d. arXiv preprint arXiv:2308.16184, 2023. 1, 2

work page arXiv 2023

[5] [5]

To adapt or not to adapt? real- time adaptation for semantic segmentation

Marc Botet Colomer, Pier Luigi Dovesi, Theodoros Pana- giotakopoulos, Joao Frederico Carvalho, Linus H ¨arenstam- Nielsen, Hossein Azizpour, Hedvig Kjellstr ¨om, Daniel Cre- mers, and Matteo Poggi. To adapt or not to adapt? real- time adaptation for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages ...

work page 2023

[6] [6]

Adaptive stochastic weight averaging

Caglar Demir, Arnab Sharma, and Axel-Cyrille Ngonga Ngomo. Adaptive stochastic weight averaging. arXiv preprint arXiv:2406.19092, 2024. 3

work page arXiv 2024

[7] [7]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Syl- vain Gelly, et al. An image is worth 16x16 words: Trans- formers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020. 7

work page internal anchor Pith review Pith/arXiv arXiv 2010

[8] [8]

The pascal visual object classes (voc) challenge

Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The pascal visual object classes (voc) challenge. International journal of computer vision, 88:303–338, 2010. 6, 7

work page 2010

[9] [9]

Camouflaged object de- tection

Deng-Ping Fan, Ge-Peng Ji, Guolei Sun, Ming-Ming Cheng, Jianbing Shen, and Ling Shao. Camouflaged object de- tection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2777–2787,

work page

[10] [10]

Uncertainty reduction for model adaptation in semantic segmentation

Francois Fleuret et al. Uncertainty reduction for model adaptation in semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9613–9623, 2021. 2

work page 2021

[11] [11]

Getting to 99% accuracy in interactive seg- mentation

Marco Forte, Brian Price, Scott Cohen, Ning Xu, and Franc ¸ois Piti´e. Getting to 99% accuracy in interactive seg- mentation. arXiv preprint arXiv:2003.07932, 2020. 2

work page arXiv 2003

[12] [12]

Random walks for image segmentation

Leo Grady. Random walks for image segmentation. IEEE transactions on pattern analysis and machine intelligence , 28(11):1768–1783, 2006. 2

work page 2006

[13] [13]

Geodesic star convexity for interactive image segmentation

Varun Gulshan, Carsten Rother, Antonio Criminisi, Andrew Blake, and Andrew Zisserman. Geodesic star convexity for interactive image segmentation. In 2010 IEEE Computer So- ciety Conference on Computer Vision and Pattern Recogni- tion, pages 3129–3136. IEEE, 2010. 2

work page 2010

[14] [14]

Trash- can: A semantically-segmented dataset towards visual de- tection of marine debris

Jungseok Hong, Michael Fulton, and Junaed Sattar. Trash- can: A semantically-segmented dataset towards visual de- tection of marine debris. arXiv preprint arXiv:2007.08097,

work page arXiv 2007

[15] [15]

Foc- sam: Delving deeply into focused objects in segmenting any- thing

You Huang, Zongyu Lan, Liujuan Cao, Xianming Lin, Shengchuan Zhang, Guannan Jiang, and Rongrong Ji. Foc- sam: Delving deeply into focused objects in segmenting any- thing. In Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition , pages 3120–3130,

work page

[16] [16]

Editing Models with Task Arithmetic

Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, and Ali Farhadi. Editing models with task arithmetic. arXiv preprint arXiv:2212.04089, 2022. 2, 3

work page internal anchor Pith review Pith/arXiv arXiv 2022

[17] [17]

Interactive image seg- mentation via backpropagating refinement scheme

Won-Dong Jang and Chang-Su Kim. Interactive image seg- mentation via backpropagating refinement scheme. In Pro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5297–5306, 2019. 2

work page 2019

[18] [18]

Fine-tuning linear layers only is a simple yet effective way for task arithmetic

Ruochen Jin, Bojian Hou, Jiancong Xiao, Weijie Su, and Li Shen. Fine-tuning linear layers only is a simple yet effective way for task arithmetic. arXiv preprint arXiv:2407.07089 ,

work page arXiv

[19] [19]

Segment any- thing

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer White- head, Alexander C Berg, Wan-Yen Lo, et al. Segment any- thing. In Proceedings of the IEEE/CVF International Con- ference on Computer Vision, pages 4015–4026, 2023. 1, 2, 3, 7, 8

work page 2023

[20] [20]

Continuous adaptation for interactive ob- ject segmentation by learning from corrections

Theodora Kontogianni, Michael Gygli, Jasper Uijlings, and Vittorio Ferrari. Continuous adaptation for interactive ob- ject segmentation by learning from corrections. InComputer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVI 16 , pages 579–596. Springer, 2020. 2

work page 2020

[21] [21]

Anabranch network for camouflaged object segmentation

Trung-Nghia Le, Tam V Nguyen, Zhongliang Nie, Minh- Triet Tran, and Akihiro Sugimoto. Anabranch network for camouflaged object segmentation. Computer vision and im- age understanding, 184:45–56, 2019. 6, 7, 8 9

work page 2019

[22] [22]

Mfp: Mak- ing full use of probability maps for interactive image seg- mentation

Chaewon Lee, Seon-Ho Lee, and Chang-Su Kim. Mfp: Mak- ing full use of probability maps for interactive image seg- mentation. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pages 4051–4059. IEEE, 2024. 2, 8

work page 2024

[23] [23]

Interactive learn- ing for semantic segmentation in earth observation

Gaston Lenczner, Adrien Chan-Hon-Tong, Nicola Luminari, Bertrand Le Saux, and Guy Le Besnerais. Interactive learn- ing for semantic segmentation in earth observation. arXiv preprint arXiv:2009.11250, 2020. 2

work page arXiv 2009

[24] [24]

Interactive image segmentation with latent diversity

Zhuwen Li, Qifeng Chen, and Vladlen Koltun. Interactive image segmentation with latent diversity. In Proceedings of the IEEE conference on computer vision and pattern recog- nition, pages 577–585, 2018. 2, 7

work page 2018

[25] [25]

Do we really need to access the source data? source hypothesis transfer for un- supervised domain adaptation

Jian Liang, Dapeng Hu, and Jiashi Feng. Do we really need to access the source data? source hypothesis transfer for un- supervised domain adaptation. In International conference on machine learning, pages 6028–6039. PMLR, 2020. 2

work page 2020

[26] [26]

Regional interactive image segmentation net- works

JunHao Liew, Yunchao Wei, Wei Xiong, Sim-Heng Ong, and Jiashi Feng. Regional interactive image segmentation net- works. In 2017 IEEE international conference on computer vision (ICCV), pages 2746–2754. IEEE, 2017. 2

work page 2017

[27] [27]

Microsoft coco: Common objects in context

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll´ar, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In Computer vision–ECCV 2014: 13th European conference, zurich, Switzerland, September 6-12, 2014, proceedings, part v 13, pages 740–755. Springer, 2014. 6

work page 2014

[28] [28]

Interactive image segmentation with first click attention

Zheng Lin, Zhao Zhang, Lin-Zhuo Chen, Ming-Ming Cheng, and Shao-Ping Lu. Interactive image segmentation with first click attention. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 13339–13348, 2020. 2

work page 2020

[29] [29]

Click prompt learning with optimal trans- port for interactive segmentation

Jie Liu, Haochen Wang, Wenzhe Yin, Jan-Jakob Sonke, and Efstratios Gavves. Click prompt learning with optimal trans- port for interactive segmentation. In European Conference on Computer Vision, pages 93–110. Springer, 2024. 2

work page 2024

[30] [30]

Pseudoclick: Interactive image segmentation with click imi- tation

Qin Liu, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, Marc Niethammer, and Ziyan Wu. Pseudoclick: Interactive image segmentation with click imi- tation. In European Conference on Computer Vision, pages 728–745. Springer, 2022. 2

work page 2022

[31] [31]

Simpleclick: Interactive image segmentation with sim- ple vision transformers

Qin Liu, Zhenlin Xu, Gedas Bertasius, and Marc Nietham- mer. Simpleclick: Interactive image segmentation with sim- ple vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 22290– 22300, 2023. 7

work page 2023

[32] [32]

Rethinking interactive image segmentation with low latency high quality and diverse prompts

Qin Liu, Jaemin Cho, Mohit Bansal, and Marc Nietham- mer. Rethinking interactive image segmentation with low latency high quality and diverse prompts. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3773–3782, 2024. 2

work page 2024

[33] [33]

Tangent transformers for composition, privacy and removal

Tian Yu Liu, Aditya Golatkar, and Stefano Soatto. Tangent transformers for composition, privacy and removal. arXiv preprint arXiv:2307.08122, 2023. 3

work page arXiv 2023

[34] [34]

Decoupled Weight Decay Regularization

Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017. 7

work page internal anchor Pith review Pith/arXiv arXiv 2017

[35] [35]

A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics

David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings eighth IEEE international conference on computer vision. ICCV 2001 , pages 416–423. IEEE, 2001. 6, 7

work page 2001

[36] [36]

Towards stable test-time adaptation in dynamic wild world

Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Zhiquan Wen, Yaofo Chen, Peilin Zhao, and Mingkui Tan. Towards stable test-time adaptation in dynamic wild world. In Internetional Conference on Learning Representations, 2023. 2

work page 2023

[37] [37]

Task arithmetic in the tangent space: Improved editing of pre-trained models

Guillermo Ortiz-Jimenez, Alessandro Favero, and Pascal Frossard. Task arithmetic in the tangent space: Improved editing of pre-trained models. Advances in Neural Informa- tion Processing Systems, 36, 2024. 3

work page 2024

[38] [38]

Real-time, accurate, and consistent video semantic segmentation via unsupervised adaptation and cross-unit de- ployment on mobile device

Hyojin Park, Alan Yessenbayev, Tushar Singhal, Navin Ku- mar Adhikari, Yizhe Zhang, Shubhankar Mangesh Borse, Hong Cai, Nilesh Prasad Pandey, Fei Yin, Frank Mayer, et al. Real-time, accurate, and consistent video semantic segmentation via unsupervised adaptation and cross-unit de- ployment on mobile device. InProceedings of the IEEE/CVF Conference on Comp...

work page 2022

[39] [39]

A benchmark dataset and evaluation methodology for video object segmentation

Federico Perazzi, Jordi Pont-Tuset, Brian McWilliams, Luc Van Gool, Markus Gross, and Alexander Sorkine-Hornung. A benchmark dataset and evaluation methodology for video object segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 724–732,

work page

[40] [40]

” grabcut” interactive foreground extraction using iterated graph cuts

Carsten Rother, Vladimir Kolmogorov, and Andrew Blake. ” grabcut” interactive foreground extraction using iterated graph cuts. ACM transactions on graphics (TOG) , 23(3): 309–314, 2004. 2

work page 2004

[41] [41]

Adapting the segment anything model during usage in novel situations

Robin Sch ¨on, Julian Lorenz, Katja Ludwig, and Rainer Lien- hart. Adapting the segment anything model during usage in novel situations. In Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 3616–3626, 2024. 1, 2, 7, 8

work page 2024

[42] [42]

f-brs: Rethinking backpropagating refinement for interactive segmentation

Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, and Anton Konushin. f-brs: Rethinking backpropagating refinement for interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 8623–8632, 2020. 6, 7

work page 2020

[43] [43]

f-brs: Rethinking backpropagating refinement for interactive segmentation

Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, and Anton Konushin. f-brs: Rethinking backpropagating refinement for interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 8623–8632, 2020. 2

work page 2020

[44] [44]

Re- viving iterative training with mask guidance for interactive segmentation

Konstantin Sofiiuk, Ilya A Petrov, and Anton Konushin. Re- viving iterative training with mask guidance for interactive segmentation. In 2022 IEEE International Conference on Image Processing (ICIP), pages 3141–3145. IEEE, 2022. 2, 7, 8

work page 2022

[45] [45]

Cfr-icl: Cascade-forward refinement with iterative click loss for interactive image segmentation

Shoukun Sun, Min Xian, Fei Xu, Luca Capriotti, and Tiankai Yao. Cfr-icl: Cascade-forward refinement with iterative click loss for interactive image segmentation. In Proceedings of the AAAI conference on artificial intelligence , pages 5017– 5024, 2024. 2

work page 2024

[46] [46]

Parameter efficient multi- 10 task model fusion with partial linearization

Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, and Dacheng Tao. Parameter efficient multi- 10 task model fusion with partial linearization. arXiv preprint arXiv:2310.04742, 2023. 3

work page arXiv 2023

[47] [47]

Tesla: Test-time self-learning with automatic adversarial augmentation

Devavrat Tomar, Guillaume Vray, Behzad Bozorgtabar, and Jean-Philippe Thiran. Tesla: Test-time self-learning with automatic adversarial augmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20341–20350, 2023. 2

work page 2023

[48] [48]

On the road to online adaptation for semantic image segmentation

Riccardo V olpi, Pau De Jorge, Diane Larlus, and Gabriela Csurka. On the road to online adaptation for semantic image segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 19184– 19195, 2022

work page 2022

[49] [49]

Tent: Fully Test-time Adaptation by Entropy Minimization

Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Ol- shausen, and Trevor Darrell. Tent: Fully test-time adaptation by entropy minimization. arXiv preprint arXiv:2006.10726,

work page internal anchor Pith review Pith/arXiv arXiv 2006

[50] [50]

Stacked condi- tional generative adversarial networks for jointly learning shadow detection and shadow removal

Jifeng Wang, Xiang Li, and Jian Yang. Stacked condi- tional generative adversarial networks for jointly learning shadow detection and shadow removal. In Proceedings of the IEEE conference on computer vision and pattern recog- nition, pages 1788–1797, 2018. 6, 7, 8

work page 2018

[51] [51]

Continual test-time domain adaptation

Qin Wang, Olga Fink, Luc Van Gool, and Dengxin Dai. Continual test-time domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7201–7211, 2022. 2

work page 2022

[52] [52]

Dynamically instance- guided adaptation: A backward-free approach for test-time domain adaptive semantic segmentation

Wei Wang, Zhun Zhong, Weijie Wang, Xi Chen, Charles Ling, Boyu Wang, and Nicu Sebe. Dynamically instance- guided adaptation: A backward-free approach for test-time domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24090–24099, 2023

work page 2023

[53] [53]

Continual test-time domain adaptation via dynamic sample selection

Yanshuo Wang, Jie Hong, Ali Cheraghian, Shafin Rahman, David Ahmedt-Aristizabal, Lars Petersson, and Mehrtash Harandi. Continual test-time domain adaptation via dynamic sample selection. In Proceedings of the IEEE/CVF Win- ter Conference on Applications of Computer Vision , pages 1701–1710, 2024. 2

work page 2024

[54] [54]

Focused and collaborative feedback integration for interactive image seg- mentation

Qiaoqiao Wei, Hui Zhang, and Jun-Hai Yong. Focused and collaborative feedback integration for interactive image seg- mentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages 18643– 18652, 2023. 2

work page 2023

[55] [55]

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing in- ference time

Mitchell Wortsman, Gabriel Ilharco, Samir Ya Gadre, Re- becca Roelofs, Raphael Gontijo-Lopes, Ari S Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon, Simon Ko- rnblith, et al. Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing in- ference time. In International conference on machine learn- ing, pages 23965...

work page 2022

[56] [56]

Robust fine-tuning of zero-shot models

Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gon- tijo Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, et al. Robust fine-tuning of zero-shot models. In Proceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 7959–7971, 2022. 3

work page 2022

[57] [57]

Ties-merging: Resolving interference when merging models

Prateek Yadav, Derek Tam, Leshem Choshen, Colin A Raf- fel, and Mohit Bansal. Ties-merging: Resolving interference when merging models. Advances in Neural Information Pro- cessing Systems, 36:7093–7115, 2023. 3

work page 2023

[58] [58]

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xi- aochun Cao, Jie Zhang, and Dacheng Tao. Model merging in llms, mllms, and beyond: Methods, theories, applications and opportunities. arXiv preprint arXiv:2408.07666, 2024. 3

work page internal anchor Pith review Pith/arXiv arXiv 2024

[59] [59]

Open-vocabulary sam: Seg- ment and recognize twenty-thousand classes interactively

Haobo Yuan, Xiangtai Li, Chong Zhou, Yining Li, Kai Chen, and Chen Change Loy. Open-vocabulary sam: Seg- ment and recognize twenty-thousand classes interactively. In European Conference on Computer Vision, pages 419–437. Springer, 2024. 1, 2

work page 2024

[60] [60]

Auxadapt: Stable and efficient test-time adaptation for temporally consistent video semantic segmentation

Yizhe Zhang, Shubhankar Borse, Hong Cai, and Fatih Porikli. Auxadapt: Stable and efficient test-time adaptation for temporally consistent video semantic segmentation. In Proceedings of the IEEE/CVF Winter Conference on Appli- cations of Computer Vision, pages 2339–2348, 2022. 2

work page 2022

[61] [61]

Graco: Granularity-controllable interactive segmentation

Yian Zhao, Kehan Li, Zesen Cheng, Pengchong Qiao, Xi- awu Zheng, Rongrong Ji, Chang Liu, Li Yuan, and Jie Chen. Graco: Granularity-controllable interactive segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 3501–3510, 2024. 2

work page 2024

[62] [62]

Interactive segmentation as gaussion process classification

Minghao Zhou, Hong Wang, Qian Zhao, Yuexiang Li, Yawen Huang, Deyu Meng, and Yefeng Zheng. Interactive segmentation as gaussion process classification. In Proceed- ings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19488–19497, 2023. 2 11

work page 2023