
arXiv: 2605.23411 · v1 · pith:577MO3TKnew · submitted 2026-05-22 · 💻 cs.LG · cs.CR · cs.CV

Sample-wise Targeted Adversarial Attacks on Test-time Adaptation

Pith reviewed 2026-05-25 04:57 UTC · model grok-4.3

classification 💻 cs.LG · cs.CR · cs.CV
keywords test-time adaptation · adversarial attacks · sample-wise attacks · targeted attacks · meta-learning · gradient alignment · distributional stealth

The pith

Sample-wise targeted attacks on test-time adaptation misclassify only triggered inputs while preserving the overall label distribution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Test-time adaptation (TTA) counters distribution shifts but can be manipulated through the unlabeled test stream. Class-wise targeted attacks are impractical for stealthy use: because TTA operates on batches, forcing a subset of samples toward one label pulls similar benign samples along and creates an obvious spike in that label's frequency. The paper replaces that with a sample-wise approach that activates only on inputs carrying an attacker-chosen trigger and preserves the global label distribution of benign queries. The method combines meta-learning with a priority-aware gradient alignment, cast as an ellipsoidal trust-region problem, to handle the conflict between attack success and distributional stealth. If the approach holds, TTA systems are exposed to stealthy manipulation that detectors based on label-frequency monitoring would likely miss.

Core claim

We introduce a sample-wise targeted adversarial attack on TTA that misclassifies only inputs carrying an attacker-chosen trigger while preserving the global label distribution of benign queries; the attack is realized by a meta-learning procedure whose priority-aware gradient alignment is formulated as an ellipsoidal trust-region problem that supplies theoretical guarantees even when attack success and distributional stealth pull the gradient in opposite directions.

What carries the argument

Priority-aware gradient alignment strategy that formulates the gradient update as an ellipsoidal trust-region problem to prioritize attack success while enforcing distributional stealth.
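
To make the trust-region idea concrete, the sketch below solves the textbook problem of maximizing a linear objective g·d over an ellipsoid, which has a closed-form solution. It is a generic illustration and not the paper's formulation: the diagonal ellipsoid `a_diag`, the radius, the toy gradients, and the function names `cosine` and `ellipsoid_step` are all assumptions made here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ellipsoid_step(g, a_diag, radius):
    """Maximize g.d subject to sum(a_i * d_i**2) <= radius**2, with all a_i > 0.

    Closed form: d = radius * A^{-1} g / sqrt(g^T A^{-1} g), A = diag(a_diag).
    """
    scaled = [gi / ai for gi, ai in zip(g, a_diag)]  # A^{-1} g
    norm = math.sqrt(sum(gi * si for gi, si in zip(g, scaled)))
    if norm == 0.0:
        return [0.0] * len(g)
    return [radius * si / norm for si in scaled]


# Toy gradients (assumed values): the two objectives pull in conflicting directions.
g_attack = [1.0, 0.0]
g_stealth = [-1.0, 1.0]
print("cosine similarity:", cosine(g_attack, g_stealth))  # negative: misaligned

# Step that favors the attack gradient inside a hypothetical ellipsoid.
d = ellipsoid_step(g_attack, a_diag=[1.0, 4.0], radius=1.0)
print("step:", d)
```

With a Euclidean ball (all entries of `a_diag` equal) the step reduces to a scaled, normalized gradient; an ellipsoid makes movement along some coordinates cost more of the constraint budget than others. How the paper chooses its ellipsoid and encodes priority is not stated in the abstract.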

If this is right

  • The attack reaches high targeted success rates on CIFAR-10-C, CIFAR-100-C and ImageNet-C under multiple TTA protocols.
  • Label frequencies remain consistent with the no-attack baseline, which is meant to evade detection that relies on class-count monitoring.
  • The attack retains effectiveness against existing defenses that assume class-wise or batch-wise perturbations.
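
A class-count monitor of the kind these points refer to can be sketched as a distance between predicted-label histograms. The example below is a toy under stated assumptions: the prediction lists, the choice of total variation distance, the `THRESHOLD` value, and the helper names are hypothetical and do not come from the paper.

```python
from collections import Counter


def label_frequencies(preds, num_classes):
    """Empirical frequency of each predicted class."""
    counts = Counter(preds)
    return [counts.get(c, 0) / len(preds) for c in range(num_classes)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy no-attack baseline: 100 predictions, uniform over 4 classes.
baseline = [0, 1, 2, 3] * 25

# Class-wise style outcome (assumed): every class-1 prediction is pushed to class 0.
class_wise = [0 if p == 1 else p for p in baseline]

# Sample-wise style outcome (assumed): only two triggered inputs change label.
sample_wise = list(baseline)
sample_wise[1] = 0
sample_wise[2] = 0

reference = label_frequencies(baseline, 4)
tv_class = total_variation(reference, label_frequencies(class_wise, 4))
tv_sample = total_variation(reference, label_frequencies(sample_wise, 4))

THRESHOLD = 0.1  # hypothetical alarm level for the monitor
print("class-wise flagged:", tv_class > THRESHOLD)
print("sample-wise flagged:", tv_sample > THRESHOLD)
```

In this toy setup the class-wise outcome moves a quarter of the probability mass and trips the monitor, while the two-sample change stays well under the threshold.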

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • TTA pipelines may need to monitor for localized trigger patterns in addition to global statistics.
  • The meta-learning framing could be reused for other online adaptation settings that require both efficacy and statistical invisibility.
  • Trigger-specific robustness training might become a standard safeguard once sample-wise attacks are recognized as practical.

Load-bearing premise

The ellipsoidal trust-region formulation can still optimize the attack objective even when attack success and label-distribution preservation produce misaligned gradients.

What would settle it

An experiment in which the attack either fails to reach high targeted success on triggered samples or produces a label distribution whose deviation from the no-attack baseline exceeds what the paper reports would falsify the central claim.
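
The first half of that test, targeted success on triggered samples, reduces to a simple ratio. The sketch below shows one way to compute it; the function name, the toy predictions, and the target class are assumptions for illustration, and the paper may define its success rate differently.

```python
def targeted_success_rate(preds, triggered, target):
    """Fraction of triggered inputs that were classified as the target label."""
    hits = [p == target for p, t in zip(preds, triggered) if t]
    if not hits:
        raise ValueError("no triggered samples in the stream")
    return sum(hits) / len(hits)


# Toy stream (assumed): three triggered inputs, target class 7.
preds = [7, 7, 2, 5, 7, 1]
triggered = [True, True, False, False, False, True]

asr = targeted_success_rate(preds, triggered, target=7)
print(f"targeted success rate: {asr:.2f}")
```

Note that the benign input at index 4 is predicted as class 7 but does not count toward the rate, since only triggered inputs enter the denominator.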

Figures

Figures reproduced from arXiv: 2605.23411 by Phuc Duc Nguyen, Quang Duc Nguyen.

Figure 1: Illustration of distributional stealth. Un…
Figure 2: TTA attack setting. The attack is selec…
Figure 3: Overview of the attack generation using the proposed meta-learning workflow, where…
Figure 4: (a) Cosine similarity between gradients of L_cls and L_stl across attack optimization steps, where the vertical dashed line marks the practical horizon (500 steps). (b) Frequency distribution of cosine similarity values. Nearly all steps (99.9%) exhibit negative cosine similarity, indicating persistent gradient misalignment. In this section, we describe how the aligned direction d_k is constructed. For bre…
Figure 5: Comparison of label distribution shifts induced by RTTDP, FCA, and ours on CIFAR-10-C.
Figure 7: Trigger patterns on victim input…
Figure 9: ASR (%) of MedBN [33]. 5.1 Experimental Setting. Datasets and backbones. We evaluate the attack on standard benchmarks for TTA, namely CIFAR10-C, CIFAR100-C, and ImageNet-C. These datasets are constructed by applying a diverse set of synthetic corruptions to clean images [13], thereby simulating realistic distribution shifts. Following common practice [42, 36], for CIFAR10-C and CIFAR100-C, we adopt ResNet-…
Figure 10: Comparison of label distribution shifts induced by…
Figure 11: Comparison of label distribution shifts induced by…
Figure 12: Comparison of label distribution shifts induced by Euclidean trust-region formulation and…
Figure 13: Comparison of label distribution shifts induced by Euclidean trust-region formulation and…
Figure 14: Comparison of label distribution shifts induced by RTTDP […
Figure 15: Comparison of label distribution shifts induced by RTTDP […
Figure 16: Comparison of label distribution shifts induced by a class-wise attack (DIA […
Figure 17: Comparison of label distribution shifts induced by a class-wise attack (DIA […
Abstract

Test-time adaptation (TTA) effectively counters distribution shifts but exposes models to adversarial manipulation via the unlabeled test stream. Existing class-wise targeted attacks remain impractical for stealthy exploitation in this setting: since TTA operates on batches, forcing a subset of samples toward a target label unintentionally pulls similar benign samples along, resulting in a conspicuously high frequency of the target label that is easy to detect. To capture a more realistic threat, we introduce a sample-wise targeted attack. Unlike prior approaches, the attacker aims to misclassify only inputs carrying an attacker-chosen trigger, while preserving the global label distribution of benign queries to evade detection. To achieve this, we propose a meta-learning-based attack with a novel priority-aware gradient alignment strategy that explicitly prioritizes attack success. The strategy formulates the gradient update as an ellipsoidal trust-region problem, mitigating the misalignment between attack success and distributional stealth, while providing theoretical guarantees for effective optimization of the attack objective in the presence of gradient misalignment. Extensive experiments on CIFAR-10-C, CIFAR-100-C, and ImageNet-C across TTA protocols demonstrate that our method achieves high targeted success rates while maintaining a label distribution that is consistent with the no-attack baseline, making it difficult to detect in unlabeled TTA deployment scenarios. Furthermore, we demonstrate that our attack shows strong robustness against existing defenses.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims to introduce a sample-wise targeted adversarial attack on test-time adaptation (TTA) that misclassifies only trigger-carrying inputs while preserving the global label distribution of benign samples to evade detection in unlabeled streams. It uses a meta-learning framework with a novel priority-aware gradient alignment strategy formulated as an ellipsoidal trust-region problem to resolve misalignment between attack success and distributional stealth, accompanied by theoretical guarantees. Experiments on CIFAR-10-C, CIFAR-100-C, and ImageNet-C across TTA protocols report high targeted success rates, label distributions consistent with no-attack baselines, and robustness to existing defenses.

Significance. If the central claims hold, the work identifies a realistic and stealthy threat model for TTA deployments that prior class-wise attacks do not capture, potentially affecting the security of adaptive models in production. The meta-learning approach with trust-region optimization for balancing conflicting objectives may have value beyond this setting. The reported empirical results on standard corruption benchmarks provide a concrete starting point for evaluating such attacks.

major comments (3)
  1. [§3 (meta-learning attack formulation)] The priority-aware gradient alignment (ellipsoidal trust-region formulation) is presented as mitigating misalignment between attack success and distributional stealth, but the manuscript does not model or constrain the downstream effect of batch-level TTA parameter updates on non-triggered samples. This coupling is load-bearing for the sample-wise isolation and label-distribution stealth claims.
  2. [§4 (experiments on CIFAR/ImageNet-C)] Experiments report label distribution consistency with the no-attack baseline, but without explicit measurements or ablations of indirect prediction shifts on benign samples sharing a batch with triggered inputs, it remains unclear whether the reported frequencies would hold under actual TTA batch updates.
  3. [§3.2 (theoretical analysis)] The theoretical guarantees are stated for effective optimization of the attack objective under gradient misalignment, yet they appear limited to the meta-optimization step and do not extend to end-to-end preservation of batch label statistics after adaptation.
minor comments (2)
  1. [§4.1] Clarify the exact batch sizes and TTA update rules used in the experimental protocols, as these directly affect the coupling concern.
  2. [Related work] Add a reference to prior work on batch-level statistics in TTA if not already present, to contextualize the sample-wise distinction.
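
The coupling concern is easiest to see with normalization statistics. The toy sketch below assumes a TTA method that recomputes a normalization layer's mean and variance from each test batch; whether and how the evaluated protocols do this is what the first minor comment asks the authors to state. The single-feature setup, the values, and the `batch_normalize` helper are illustrative assumptions, not taken from the paper.

```python
import math


def batch_normalize(values, eps=1e-5):
    """Normalize one feature using the mean and variance of the current batch."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Three benign activations for a single feature (assumed toy values).
benign = [0.9, 1.0, 1.1]

# Same benign inputs, batched with an ordinary sample vs. an outlier
# standing in for a triggered input.
clean_batch = batch_normalize(benign + [1.0])
attacked_batch = batch_normalize(benign + [5.0])

# Change in the normalized benign activations caused only by the co-batched sample.
shift = [a - c for a, c in zip(attacked_batch[:3], clean_batch[:3])]
print("shift in benign activations:", shift)
```

The benign inputs are identical in both batches, yet their normalized activations change because one other sample moved the batch mean and variance. Smaller batches amplify the effect, which is why batch size matters for the stealth claim.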

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and detailed comments, which help clarify key aspects of our threat model and evaluation. We address each major comment point by point below, indicating where revisions will be made.

Point-by-point responses
  1. Referee: [§3 (meta-learning attack formulation)] The priority-aware gradient alignment (ellipsoidal trust-region formulation) is presented as mitigating misalignment between attack success and distributional stealth, but the manuscript does not model or constrain the downstream effect of batch-level TTA parameter updates on non-triggered samples. This coupling is load-bearing for the sample-wise isolation and label-distribution stealth claims.

    Authors: We agree that the current formulation in §3 optimizes the per-sample attack objective via meta-learning and the ellipsoidal trust-region but does not explicitly simulate or constrain how TTA parameter updates propagate to non-triggered samples in the same batch. This interaction is indeed central to the stealth claims. We will revise §3 to add a dedicated paragraph discussing this batch-level coupling under the assumed threat model (unlabeled streams with mixed triggered and benign inputs) and how the priority-aware alignment indirectly supports isolation by prioritizing distributional consistency at the attack-generation stage. revision: partial

  2. Referee: [§4 (experiments on CIFAR/ImageNet-C)] Experiments report label distribution consistency with the no-attack baseline, but without explicit measurements or ablations of indirect prediction shifts on benign samples sharing a batch with triggered inputs, it remains unclear whether the reported frequencies would hold under actual TTA batch updates.

    Authors: The reported label distributions are measured after full TTA runs on streams containing the attacked batches, and they remain consistent with the no-attack baseline. However, we acknowledge the value of isolating indirect effects on co-batch benign samples. We will add new ablations in §4 that track per-sample prediction changes for benign inputs when processed alongside triggered samples under each TTA protocol, providing direct evidence on whether distributional stealth is preserved after the adaptation step. revision: yes

  3. Referee: [§3.2 (theoretical analysis)] The theoretical guarantees are stated for effective optimization of the attack objective under gradient misalignment, yet they appear limited to the meta-optimization step and do not extend to end-to-end preservation of batch label statistics after adaptation.

    Authors: The analysis in §3.2 establishes convergence guarantees for the meta-optimization under the priority-aware alignment when gradients are misaligned. We agree these guarantees do not extend to a formal proof of post-adaptation label-statistic preservation, which would require additional assumptions on the TTA update rule and batch statistics. We will revise the text to explicitly state the scope of the theoretical results and emphasize that end-to-end distributional stealth is demonstrated empirically across the evaluated protocols and datasets. revision: partial
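
The ablation promised in the second response, tracking prediction changes on benign inputs that share a stream with triggered ones, could be summarized by a flip rate such as the one sketched here. This is a hypothetical metric, not the authors' protocol: the function name and the toy prediction lists are assumptions, and the two lists stand for predictions from a no-attack run and an attacked run over the same inputs.

```python
def benign_flip_rate(clean_preds, attacked_preds, triggered):
    """Fraction of non-triggered inputs whose prediction differs between runs."""
    pairs = [
        (c, a)
        for c, a, t in zip(clean_preds, attacked_preds, triggered)
        if not t
    ]
    if not pairs:
        raise ValueError("no benign samples in the stream")
    return sum(c != a for c, a in pairs) / len(pairs)


# Toy runs over the same six inputs (assumed values); index 4 carries the trigger.
clean_run = [0, 1, 2, 3, 1, 2]
attacked_run = [0, 1, 2, 3, 3, 3]
triggered = [False, False, False, False, True, False]

rate = benign_flip_rate(clean_run, attacked_run, triggered)
print(f"benign flip rate: {rate:.2f}")
```

A flip rate near zero alongside an unchanged label histogram would support sample-wise isolation; an unchanged histogram with a high flip rate would indicate that benign predictions are being reshuffled in ways class counts alone cannot reveal.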

Circularity Check

0 steps flagged

No circularity; derivation relies on independent meta-optimization without reduction to inputs

full rationale

The paper introduces a sample-wise targeted attack via meta-learning and a priority-aware gradient alignment cast as an ellipsoidal trust-region problem, with claims of theoretical guarantees for optimization under misalignment. No equations, fitted parameters, or self-citations are shown that reduce the attack success or stealth properties to self-defined quantities or prior results by construction. The central claims rest on the proposed formulation and experimental validation rather than tautological re-labeling of inputs, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no free parameters, axioms, or invented entities are described or can be extracted.

pith-pipeline@v0.9.0 · 5767 in / 1054 out tokens · 13678 ms · 2026-05-25T04:57:57.657052+00:00 · methodology


Reference graph

Works this paper leans on

52 extracted references · 52 canonical work pages · 4 internal anchors

  [1] Mauro Barni, Kassem Kallas, and Benedetta Tondi. A new backdoor attack in CNNs by training set corruption without label poisoning. In 2019 IEEE International Conference on Image Processing (ICIP), pages 101–105. IEEE, 2019.

  [2] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge University Press, 2004.

  [3] Dhanajit Brahma and Piyush Rai. A probabilistic framework for lifelong test-time adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3582–3591, 2023.

  [4] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526, 2017.

  [5] Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, and Andrew Rabinovich. Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In International conference on machine learning, pages 794–803. PMLR, 2018.

  [6] Tianshuo Cong, Xinlei He, Yun Shen, and Yang Zhang. Test-time poisoning attacks against test-time adaptation models. In 2024 IEEE Symposium on Security and Privacy (SP), pages 1306–1324. IEEE, 2024.

  [7] Bao Gia Doan, Ehsan Abbasnejad, and Damith C Ranasinghe. Februus: Input purification defense against trojan attacks on deep neural network systems. In Proceedings of the 36th Annual Computer Security Applications Conference, pages 897–912, 2020.

  [8] Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1625–1634, 2018.

  [9] Taesik Gong, Yewon Kim, Taeckyung Lee, Sorn Chottananurak, and Sung-Ju Lee. Sotta: Robust test-time adaptation on noisy data streams. Advances in Neural Information Processing Systems, 36:14070–14093, 2023.

  [10] Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. Badnets: Identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733, 2017.

  [11] Jisu Han, Jaemin Na, and Wonjun Hwang. Ranked entropy minimization for continual test-time adaptation. arXiv preprint arXiv:2505.16441, 2025.

  [12] Dan Hendrycks and Thomas Dietterich. Benchmarking neural network robustness to common corruptions and perturbations. arXiv preprint arXiv:1903.12261, 2019.

  [13] Dan Hendrycks and Thomas Dietterich. Benchmarking neural network robustness to common corruptions and perturbations. Proceedings of the International Conference on Learning Representations, 2019.

  [14] Zixuan Hu, Yichun Hu, Xiaotong Li, Shixiang Tang, and Ling-Yu Duan. Beyond entropy: Region confidence proxy for wild test-time adaptation. arXiv preprint arXiv:2505.20704, 2025.

  [15] Minguk Jang, Sae-Young Chung, and Hye Won Chung. Test-time adaptation via self-training with nearest neighbor information. arXiv preprint arXiv:2207.10792, 2022.

  [16] Alex Kendall, Yarin Gal, and Roberto Cipolla. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7482–7491, 2018.

  [17] Jae-Hong Lee and Joon-Hyuk Chang. Stationary latent weight inference for unreliable observations from online test-time adaptation. In Forty-first International Conference on Machine Learning, 2024.

  [18] Jonghyun Lee, Dahuin Jung, Saehyung Lee, Junsung Park, Juhyeon Shin, Uiwon Hwang, and Sungroh Yoon. Entropy is not enough for test-time adaptation: From the perspective of disentangled factors. arXiv preprint arXiv:2403.07366, 2024.

  [19] Jungsoo Lee, Debasmit Das, Jaegul Choo, and Sungha Choi. Towards open-set test-time adaptation utilizing the wisdom of crowds in entropy minimization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 16380–16389, 2023.

  [20] Yoonho Lee, Annie S Chen, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, and Chelsea Finn. Surgical fine-tuning improves adaptation to distribution shifts. arXiv preprint arXiv:2210.11466, 2022.

  [21] Jingjing Li, Zhiqi Yu, Zhekai Du, Lei Zhu, and Heng Tao Shen. A comprehensive survey on source-free domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(8):5743–5762, 2024.

  [22] Yiming Li, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. Backdoor learning: A survey. IEEE transactions on neural networks and learning systems, 35(1):5–22, 2022.

  [23] Yuezun Li, Yiming Li, Baoyuan Wu, Longkang Li, Ran He, and Siwei Lyu. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision, pages 16463–16472, 2021.

  [24] Jian Liang, Ran He, and Tieniu Tan. A comprehensive survey on test-time adaptation under distribution shifts. International Journal of Computer Vision, 133(1):31–64, 2025.

  [25] Zachary Lipton, Yu-Xiang Wang, and Alexander Smola. Detecting and correcting for label shift with black box predictors. In International conference on machine learning, pages 3122–3130. PMLR, 2018.

  [26] Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, and Qiang Liu. Conflict-averse gradient descent for multi-task learning. Advances in Neural Information Processing Systems, 34:18878–18890, 2021.

  [27] Fan Lyu, Kaile Du, Yuyang Li, Hanyu Zhao, Zhang Zhang, Guangcan Liu, and Liang Wang. Variational continual test-time adaptation. arXiv preprint arXiv:2402.08182, 2024.

  [28] Jing Ma. Improved self-training for test-time adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23701–23710, 2024.

  [29] Yurii Nesterov. Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2013.

  [30] Anh Nguyen and Anh Tran. Wanet–imperceptible warping-based backdoor attack. arXiv preprint arXiv:2102.10369, 2021.

  [31] Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yaofo Chen, Shijian Zheng, Peilin Zhao, and Mingkui Tan. Efficient test-time model adaptation without forgetting. In International conference on machine learning, pages 16888–16905. PMLR, 2022.

  [32] Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Zhiquan Wen, Yaofo Chen, Peilin Zhao, and Mingkui Tan. Towards stable test-time adaptation in dynamic wild world. arXiv preprint arXiv:2302.12400, 2023.

  [33] Hyejin Park, Jeongyeon Hwang, Sunung Mun, Sangdon Park, and Jungseul Ok. Medbn: Robust test-time adaptation against malicious test samples. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5997–6007, 2024.

  [34] Zhuang Qian, Kaizhu Huang, Qiu-Feng Wang, and Xu-Yao Zhang. A survey of robust adversarial training in pattern recognition: Fundamental, theory, and methodologies. Pattern Recognition, 131:108889, 2022.

  [35] Stephan Rabanser, Stephan Günnemann, and Zachary Lipton. Failing loudly: An empirical study of methods for detecting dataset shift. Advances in Neural Information Processing Systems, 32, 2019.

  [36] Shahriar Rifat, Jonathan Ashdown, Michael J De Lucia, Ananthram Swami, and Francesco Restuccia. On the adversarial vulnerability of label-free test-time adaptation. In The Thirteenth International Conference on Learning Representations, 2025.

  [37] Evgenia Rusak, Steffen Schneider, George Pachitariu, Luisa Eck, Peter Gehler, Oliver Bringmann, Wieland Brendel, and Matthias Bethge. If your data distribution shifts, use self-learning. arXiv preprint arXiv:2104.12928, 2021.

  [38] Aniruddha Saha, Akshayvarun Subramanya, and Hamed Pirsiavash. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 11957–11965, 2020.

  [39] Ozan Sener and Vladlen Koltun. Multi-task learning as multi-objective optimization. Advances in neural information processing systems, 31, 2018.

  [40] Samarth Sinha, Peter Gehler, Francesco Locatello, and Bernt Schiele. Test: Test-time self-training under distribution shift. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 2759–2769, 2023.

  [41] Maurice Sion. On general minimax theorems. Pacific Journal of Mathematics, 8(1):171–176, 1958.

  [42] Yongyi Su, Yushu Li, Nanqing Liu, Kui Jia, Xulei Yang, Chuan-Sheng Foo, and Xun Xu. On the adversarial risk of test time adaptation: An investigation into realistic test-time data poisoning. arXiv preprint arXiv:2410.04682, 2024.

  [43] Yongyi Su, Xun Xu, and Kui Jia. Revisiting realistic test-time training: Sequential inference and adaptation by anchored clustering. Advances in Neural Information Processing Systems, 35:17543–17555, 2022.

  [44] Jiaxu Tian and Fan Lyu. Parameter-selective continual test-time adaptation. In Proceedings of the Asian Conference on Computer Vision, pages 1384–1400, 2024.

  [45] Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Olshausen, and Trevor Darrell. Tent: Fully test-time adaptation by entropy minimization. arXiv preprint arXiv:2006.10726, 2020.

  [46] Qin Wang, Olga Fink, Luc Van Gool, and Dengxin Dai. Continual test-time domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7201–7211, 2022.

  [47] Emily Wenger, Josephine Passananti, Arjun Nitin Bhagoji, Yuanshun Yao, Haitao Zheng, and Ben Y Zhao. Backdoor attacks against deep learning systems in the physical world. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6206–6215, 2021.

  [48] Tong Wu, Feiran Jia, Xiangyu Qi, Jiachen T Wang, Vikash Sehwag, Saeed Mahloujifar, and Prateek Mittal. Uncovering adversarial risks of test-time adaptation. arXiv preprint arXiv:2301.12576, 2023.

  [49] Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. Gradient surgery for multi-task learning. Advances in neural information processing systems, 33:5824–5836, 2020.

  [50] Qingyang Zhang, Yatao Bian, Xinke Kong, Peilin Zhao, and Changqing Zhang. Come: Test-time adaptation by conservatively minimizing entropy. arXiv preprint arXiv:2410.10894, 2024.

  [51] Yu Zhang and Qiang Yang. A survey on multi-task learning. IEEE transactions on knowledge and data engineering, 34(12):5586–5609, 2021.

A Extended Related Works

Test-time Adaptation. TTA updates model parameters at inference time using unlabeled test data to handle distribution shift [24]. Unlike training-time domain adaptation [21], it assumes no access ...
