Backpropagation-Free Test-Time Adaptation via Probabilistic Gaussian Alignment

Hongyeob Kim; Huiling Liu; Sungeun Hong; Youjia Zhang; Youngeun Kim; Young-Geun Choi

arxiv: 2508.15568 · v8 · submitted 2025-08-21 · 💻 cs.CV · cs.LG

Youjia Zhang, Youngeun Kim, Young-Geun Choi, Hongyeob Kim, Huiling Liu, Sungeun Hong

Pith reviewed 2026-05-18 21:23 UTC · model grok-4.3

classification 💻 cs.CV cs.LG

keywords test-time adaptation · distribution shift · Gaussian modeling · backpropagation-free · probabilistic inference · computer vision · zero-shot robustness

The pith

ADAPT reframes test-time adaptation as closed-form Gaussian inference over online-updated class means and a shared covariance, removing gradient steps and the need for source data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper's central claim is that test-time adaptation under distribution shift can be performed by modeling unlabeled target features as draws from class-conditional Gaussians whose means are updated gradually and whose covariance is shared across classes. This model yields closed-form likelihoods, and hence predictions that the authors describe as calibrated, without backpropagation, iterative optimization, or access to source data. The motivation is that most prior TTA methods either require gradients, which hinders real-time use, or lack explicit class-conditional modeling, which leaves decision boundaries unreliable under shift. The approach also adds lightweight CLIP-guided regularization and a historical knowledge bank to correct likelihood bias, and it supports both online (streaming) and transductive (batch) target settings.

Core claim

We reframe TTA as a Gaussian probabilistic inference task by modeling class-conditional likelihoods using gradually updated class means and a shared covariance matrix. This enables closed-form, training-free inference. To correct potential likelihood bias, we introduce lightweight regularization guided by CLIP priors and a historical knowledge bank. ADAPT requires no source data, no gradient updates, and no full access to target data, supporting both online and transductive settings.
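For orientation, the textbook form of Gaussian discriminant analysis with a shared covariance is sketched below. The symbols ($\mathbf{x}$ for a test feature, $\boldsymbol{\mu}_k$ for the class means, $\boldsymbol{\Sigma}$ for the shared covariance, $\pi_k$ for the class prior) are assumptions made here for illustration. The paper's own notation and its CLIP-guided regularization terms are not reproduced.

```latex
\begin{aligned}
p(\mathbf{x} \mid y = k) &= \mathcal{N}\!\left(\mathbf{x};\ \boldsymbol{\mu}_k,\ \boldsymbol{\Sigma}\right), \\
p(y = k \mid \mathbf{x})
  &= \frac{\pi_k\, \mathcal{N}(\mathbf{x};\ \boldsymbol{\mu}_k, \boldsymbol{\Sigma})}
          {\sum_{j} \pi_j\, \mathcal{N}(\mathbf{x};\ \boldsymbol{\mu}_j, \boldsymbol{\Sigma})}
   = \frac{\exp\!\left(\mathbf{w}_k^{\top}\mathbf{x} + b_k\right)}
          {\sum_{j} \exp\!\left(\mathbf{w}_j^{\top}\mathbf{x} + b_j\right)}, \\
\mathbf{w}_k &= \boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_k,
\qquad
b_k = -\tfrac{1}{2}\,\boldsymbol{\mu}_k^{\top}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_k + \log \pi_k .
\end{aligned}
```

Because $\boldsymbol{\Sigma}$ is shared, the quadratic term $-\tfrac{1}{2}\mathbf{x}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{x}$ and the normalizing constant are identical for every class and cancel in the ratio. The posterior is then a softmax over linear scores, which is why no iterative optimization is needed once the means and covariance are available.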

What carries the argument

Gaussian probabilistic inference that computes class likelihoods from online-updated per-class means and one shared covariance matrix estimated directly from unlabeled target features.
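A minimal NumPy sketch of this kind of closed-form inference is given below, assuming the class means and the shared covariance have already been estimated. The function name `gaussian_posteriors`, the uniform default prior, and the toy data are hypothetical choices for illustration; this is standard shared-covariance discriminant analysis, not the authors' implementation, and it omits the CLIP prior and the knowledge bank.

```python
import numpy as np

def gaussian_posteriors(features, means, shared_cov, log_priors=None):
    """Class posteriors under N(mu_k, Sigma) with one Sigma shared by all classes.

    features: (n, d) array, means: (K, d) array, shared_cov: (d, d) array.
    """
    num_classes = means.shape[0]
    if log_priors is None:
        # Assumption for this sketch: uniform class prior.
        log_priors = np.full(num_classes, -np.log(num_classes))
    # Solve Sigma W = means^T, so column k of W is Sigma^{-1} mu_k.
    weights = np.linalg.solve(shared_cov, means.T)                   # (d, K)
    # b_k = -0.5 * mu_k^T Sigma^{-1} mu_k + log pi_k
    biases = -0.5 * np.sum(means.T * weights, axis=0) + log_priors   # (K,)
    logits = features @ weights + biases                             # (n, K)
    logits = logits - logits.max(axis=1, keepdims=True)              # stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Toy data: three well-separated classes drawn with the same covariance.
rng = np.random.default_rng(0)
means = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 3.0]])
shared_cov = np.array([[1.0, 0.3], [0.3, 1.0]])
labels = rng.integers(0, 3, size=300)
noise = rng.multivariate_normal(np.zeros(2), shared_cov, size=300)
features = means[labels] + noise

posteriors = gaussian_posteriors(features, means, shared_cov)
accuracy = float((posteriors.argmax(axis=1) == labels).mean())
print(f"toy accuracy: {accuracy:.3f}")
```

Using `np.linalg.solve` rather than an explicit matrix inverse is a numerical-stability choice of this sketch; the whole prediction step is one linear solve and one matrix product.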

If this is right

Real-time inference becomes feasible on edge devices because no gradients or iterative optimization are required.
Both online streaming and transductive batch adaptation are supported with only partial or full unlabeled target batches.
Calibrated predictions improve under a wide range of distribution shifts without retraining or source replay.
Scalability increases because the method avoids storing or accessing the full source dataset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same online Gaussian update pattern could be applied to other probabilistic heads such as normalizing flows or mixture models to relax the single-covariance assumption.
Connecting the historical knowledge bank to Bayesian updating would allow explicit uncertainty quantification over the running means.
The CLIP prior regularization suggests a broader pattern: using large-scale vision-language models as cheap, label-free anchors during test-time distribution alignment.
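To make the Bayesian-updating extension above concrete, the standard conjugate update for a Gaussian mean with known covariance is sketched below. The prior mean $\mathbf{m}_0$, the prior strength $\kappa_0$, and the assumption that the $n$ features assigned to class $k$ are correctly assigned are all illustrative; none of this is taken from the paper.

```latex
\begin{aligned}
\text{prior:}\quad & \boldsymbol{\mu}_k \sim \mathcal{N}\!\left(\mathbf{m}_0,\ \boldsymbol{\Sigma}/\kappa_0\right), \\
\text{likelihood:}\quad & \mathbf{x}_1,\dots,\mathbf{x}_n \mid \boldsymbol{\mu}_k \ \overset{\text{iid}}{\sim}\ \mathcal{N}\!\left(\boldsymbol{\mu}_k,\ \boldsymbol{\Sigma}\right), \\
\text{posterior:}\quad & \boldsymbol{\mu}_k \mid \mathbf{x}_{1:n} \sim \mathcal{N}\!\left(\mathbf{m}_n,\ \boldsymbol{\Sigma}/\kappa_n\right), \\
& \kappa_n = \kappa_0 + n, \qquad
  \mathbf{m}_n = \frac{\kappa_0\,\mathbf{m}_0 + n\,\bar{\mathbf{x}}}{\kappa_0 + n},
  \qquad \bar{\mathbf{x}} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i .
\end{aligned}
```

The posterior mean is a weighted average of the prior mean and the sample mean, so it behaves like a running-mean update whose step size $n/(\kappa_0+n)$ is set by the data. The posterior covariance $\boldsymbol{\Sigma}/\kappa_n$ shrinks as evidence accumulates, which is the uncertainty estimate the extension refers to.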

Load-bearing premise

Class-conditional feature distributions in the target domain can be adequately captured by gradually updated class means and a single shared covariance matrix estimated without any labels or source data.
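One plausible reading of "gradually updated class means" is an exponential moving average driven by pseudo-labels, sketched below. The function name `update_class_means`, the `momentum` and `min_confidence` parameters, and the confidence weighting are assumptions of this sketch; the source text does not state the paper's exact update rule.

```python
import numpy as np

def update_class_means(means, features, probs, momentum=0.9, min_confidence=0.5):
    """One EMA step toward the confidence-weighted mean of pseudo-labelled features.

    means: (K, d) current class means, features: (n, d) batch features,
    probs: (n, K) current class posteriors for the batch.
    """
    new_means = means.copy()
    pseudo_labels = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    for k in range(means.shape[0]):
        mask = (pseudo_labels == k) & (confidence >= min_confidence)
        if not mask.any():
            continue  # no confident sample for this class: keep the old mean
        weights = confidence[mask] / confidence[mask].sum()
        batch_mean = weights @ features[mask]
        new_means[k] = momentum * means[k] + (1.0 - momentum) * batch_mean
    return new_means

# Tiny worked example: two classes, three samples, one low-confidence sample.
means = np.zeros((2, 2))
features = np.array([[1.0, 1.0], [3.0, 1.0], [-2.0, 0.0]])
probs = np.array([[0.9, 0.1], [0.9, 0.1], [0.45, 0.55]])
updated = update_class_means(means, features, probs, momentum=0.9, min_confidence=0.6)
print(updated)
```

In the example, class 0 moves one tenth of the way toward the batch mean `[2, 1]`, while class 1 is left unchanged because its only candidate sample falls below the confidence threshold. The threshold and the momentum are the two knobs that trade adaptation speed against the error-accumulation risk from wrong pseudo-labels.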

What would settle it

Run the method on a benchmark whose test features exhibit strong non-Gaussian structure or class-conditional covariance differences; if accuracy or calibration falls below optimization-based TTA baselines, the modeling premise fails.
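As a complementary diagnostic, the shared-covariance premise can be probed directly on a labelled held-out set by measuring how far each class covariance sits from the pooled one. The sketch below uses a relative Frobenius distance; the function name `covariance_heterogeneity`, the synthetic data, and the use of labels (unavailable at test time, so this is an offline check only) are assumptions of this example rather than a procedure from the paper.

```python
import numpy as np

def covariance_heterogeneity(features, labels):
    """Relative Frobenius distance of each class covariance from the pooled covariance."""
    classes = np.unique(labels)
    centered = np.empty_like(features, dtype=float)
    class_covs = {}
    for k in classes:
        mask = labels == k
        centered[mask] = features[mask] - features[mask].mean(axis=0)
        class_covs[k] = np.cov(features[mask], rowvar=False)
    # Pooled within-class covariance with N - K degrees of freedom.
    pooled = centered.T @ centered / (len(features) - len(classes))
    pooled_norm = np.linalg.norm(pooled, ord="fro")
    return {
        int(k): float(np.linalg.norm(class_covs[k] - pooled, ord="fro") / pooled_norm)
        for k in classes
    }

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 500)
# Case 1: both classes share the same spread. Case 2: class 1 is three times wider.
same = np.vstack([rng.normal(0.0, 1.0, (500, 4)), rng.normal(5.0, 1.0, (500, 4))])
diff = np.vstack([rng.normal(0.0, 1.0, (500, 4)), rng.normal(5.0, 3.0, (500, 4))])

homogeneous = covariance_heterogeneity(same, labels)
heterogeneous = covariance_heterogeneity(diff, labels)
print(homogeneous, heterogeneous)
```

Values near zero indicate that a single covariance is a reasonable summary; large values for some classes point to the class-conditional covariance differences under which the modeling premise would be expected to strain.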

Figures

Figures reproduced from arXiv: 2508.15568 by Hongyeob Kim, Huiling Liu, Sungeun Hong, Youjia Zhang, Youngeun Kim, Young-Geun Choi.

**Figure 2.** Hyperparameter analysis.

**Figure 3.** Visualization of decision boundaries on ImageNet-A. The colors indicate different classes.

**Figure 4.** Comparison of covariance properties on ImageNet: (Left) shows Frobenius distance...

**Figure 5.** Results of few-shot classification across 30 datasets. We evaluate our method under both...

**Figure 6.** Performance comparison of proposed ADAPT on different VLMs.

Original abstract

Test-time adaptation (TTA) enhances the zero-shot robustness under distribution shifts by leveraging unlabeled test data during inference. Despite notable advances, several challenges still limit its broader applicability. First, most methods rely on backpropagation or iterative optimization, which limits scalability and hinders real-time deployment. Second, they lack explicit modeling of class-conditional feature distributions. This modeling is crucial for producing reliable decision boundaries and calibrated predictions, but it remains underexplored due to the lack of both source data and supervision at test time. In this paper, we propose ADAPT, an Advanced Distribution-Aware and backPropagation-free Test-time adaptation method. We reframe TTA as a Gaussian probabilistic inference task by modeling class-conditional likelihoods using gradually updated class means and a shared covariance matrix. This enables closed-form, training-free inference. To correct potential likelihood bias, we introduce lightweight regularization guided by CLIP priors and a historical knowledge bank. ADAPT requires no source data, no gradient updates, and no full access to target data, supporting both online and transductive settings. Extensive experiments across diverse benchmarks demonstrate that our method achieves state-of-the-art performance under a wide range of distribution shifts with superior scalability and robustness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes ADAPT, a backpropagation-free test-time adaptation method for improving zero-shot robustness under distribution shifts. It reframes TTA as closed-form Gaussian probabilistic inference by modeling class-conditional likelihoods with gradually updated class means and a single shared covariance matrix, both estimated from unlabeled target data without source data or gradients. Lightweight regularization via CLIP priors and a historical knowledge bank is introduced to correct likelihood bias. The method supports online and transductive settings and claims state-of-the-art performance with superior scalability across diverse benchmarks.

Significance. If the central claims hold, ADAPT would represent a meaningful advance in efficient TTA by eliminating optimization and backpropagation while providing explicit probabilistic modeling of class-conditional distributions. This could enable real-time deployment in resource-limited settings and improve calibration under shifts, provided the unsupervised mean updates and shared-covariance assumption prove robust.

major comments (3)

[§3] §3 (Method), around the class-mean update rule: the unsupervised assignment of samples to classes for updating means relies on the model's own (potentially biased) predictions under distribution shift. This creates a risk of error accumulation that directly undermines the robustness and SOTA claims, yet no analysis or mitigation beyond the CLIP regularization is detailed to demonstrate stability from unreliable initial predictions.
[Abstract and §4] Abstract and §4 (Experiments): the SOTA performance claim is asserted without reported error bars, ablation studies on the shared covariance assumption, or comparisons isolating the effect of the historical knowledge bank. This is load-bearing because the central contribution is the closed-form Gaussian inference under the stated assumptions.
[§3.2] §3.2, covariance estimation: the single shared covariance matrix is estimated without labels or source data, but the paper does not address how class-specific scale differences (often amplified by shifts) are handled or why this does not degrade decision boundaries relative to per-class covariances.

minor comments (2)

[§3] Notation for the Gaussian parameters (means and covariance) should be introduced with explicit equations early in the method section to clarify the closed-form inference steps.
[§4] Figure captions and experimental tables would benefit from clearer indication of online vs. transductive settings and the exact benchmarks used for each result.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below with clarifications on our approach and indicate planned revisions to strengthen the presentation and analysis.

Referee: [§3] §3 (Method), around the class-mean update rule: the unsupervised assignment of samples to classes for updating means relies on the model's own (potentially biased) predictions under distribution shift. This creates a risk of error accumulation that directly undermines the robustness and SOTA claims, yet no analysis or mitigation beyond the CLIP regularization is detailed to demonstrate stability from unreliable initial predictions.

Authors: We agree that relying on the model's initial predictions for unsupervised class-mean updates introduces a risk of error accumulation under distribution shift. Our design mitigates this through gradual momentum-based updates, CLIP priors that serve as an external anchor to correct biased likelihoods, and the historical knowledge bank that accumulates and reuses more reliable statistics over time. These components are intended to limit drift even from imperfect early assignments. To make this robustness explicit, we will add a dedicated stability analysis in the revised manuscript, including plots of prediction consistency across update steps and sensitivity experiments under varying initial conditions. revision: yes
Referee: [Abstract and §4] Abstract and §4 (Experiments): the SOTA performance claim is asserted without reported error bars, ablation studies on the shared covariance assumption, or comparisons isolating the effect of the historical knowledge bank. This is load-bearing because the central contribution is the closed-form Gaussian inference under the stated assumptions.

Authors: We acknowledge that stronger statistical reporting and component-wise ablations would better substantiate the SOTA claims and the contribution of the closed-form Gaussian inference. In the revised version we will report all main results with error bars computed over multiple random seeds. We will also add an ablation comparing the shared covariance to alternatives (such as diagonal or limited per-class estimates) and a controlled study isolating the historical knowledge bank by removing or varying its contribution. These additions will directly address the load-bearing nature of the assumptions. revision: yes
Referee: [§3.2] §3.2, covariance estimation: the single shared covariance matrix is estimated without labels or source data, but the paper does not address how class-specific scale differences (often amplified by shifts) are handled or why this does not degrade decision boundaries relative to per-class covariances.

Authors: The shared covariance is chosen because, in the TTA regime, the number of samples per class is typically too small for stable per-class covariance estimation, which would lead to noisy or singular matrices. Pooling across classes provides a more reliable estimate of the overall feature distribution while the class means are updated individually. Although class-specific scales can differ under shifts, the combination of mean alignment and the shared covariance still produces effective probabilistic decision boundaries, as shown by our consistent outperformance of baselines. We will expand §3.2 with an explicit discussion of this design choice, its limitations, and supporting empirical evidence. revision: yes
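The small-sample argument in the last response can be illustrated numerically. The sketch below uses synthetic features with fewer samples per class than feature dimensions and compares the rank of a per-class covariance, a pooled within-class covariance, and a pooled estimate shrunk toward a scaled identity. The shrinkage step, the weight `alpha`, and all sizes are assumptions of this example (a standard regularizer swapped in for illustration), not the estimator described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, num_classes, per_class = 64, 10, 8   # fewer samples per class than dimensions

features = rng.normal(size=(num_classes * per_class, dim))
labels = np.repeat(np.arange(num_classes), per_class)

# Per-class estimate: 8 centered samples span at most 7 directions in 64 dimensions.
class0_cov = np.cov(features[labels == 0], rowvar=False)
class0_rank = int(np.linalg.matrix_rank(class0_cov))

# Pooled within-class estimate: rank is at most N - K = 70, enough for 64 dimensions here.
centered = features.copy()
for k in range(num_classes):
    mask = labels == k
    centered[mask] -= centered[mask].mean(axis=0)
pooled_cov = centered.T @ centered / (len(features) - num_classes)
pooled_rank = int(np.linalg.matrix_rank(pooled_cov))

# Shrinkage toward a scaled identity keeps the estimate well conditioned
# even when N - K is smaller than the feature dimension.
alpha = 0.1  # hypothetical shrinkage weight
target = np.trace(pooled_cov) / dim * np.eye(dim)
shrunk_cov = (1.0 - alpha) * pooled_cov + alpha * target
condition_number = float(np.linalg.cond(shrunk_cov))

print(class0_rank, pooled_rank, round(condition_number, 1))
```

The per-class matrix is singular and cannot be inverted for a likelihood, while the pooled matrix is full rank in this setting. Pooling alone does not help when the total sample count is also small, which is where a regularizer such as shrinkage would be needed.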

Circularity Check

0 steps flagged

Derivation chain is self-contained; no reductions to inputs by construction

full rationale

The paper reframes TTA as closed-form Gaussian probabilistic inference using gradually updated class means and a shared covariance, with lightweight regularization from external CLIP priors and a historical bank. These steps rely on explicit modeling assumptions and iterative updates from unlabeled target data rather than fitting parameters to a subset and renaming the output as a prediction. No self-citations, uniqueness theorems, or ansatz smuggling are invoked to justify core choices. The derivation does not reduce to its inputs by definition; the probabilistic alignment produces new decision boundaries from the estimated distributions. This is the common honest outcome for a method whose central claim rests on modeling choices that remain falsifiable against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the unverified assumption that target-domain features are well-modeled by per-class Gaussians with shared covariance; no free parameters are explicitly named in the abstract, and no new entities are postulated.

axioms (1)

Domain assumption: Class-conditional distributions in the target domain are approximately Gaussian and can be tracked via running means and a shared covariance without labels.
Invoked in the description of modeling class-conditional likelihoods using gradually updated class means and a shared covariance matrix.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Multi-modal Test-time Adaptation via Adaptive Probabilistic Gaussian Calibration
cs.CV · 2026-04 · unverdicted · novelty 6.0

A probabilistic Gaussian model with adaptive contrastive asymmetry rectification improves multi-modal test-time adaptation by modeling category distributions and correcting modality asymmetry for better predictions un...

Reference graph

Works this paper leans on

73 extracted references · 73 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1]

Food-101–Mining Discriminative Components with Random Forests

Lukas Bossard, Matthieu Guillaumin, and Luc Van Gool. Food-101–Mining Discriminative Components with Random Forests. InECCV, 2014

work page 2014
[2]

Information maximization for few-shot learning

Malik Boudiaf, Imtiaz Ziko, Jérôme Rony, José Dolz, Pablo Piantanida, and Ismail Ben Ayed. Information maximization for few-shot learning. InNeurIPS, 2020

work page 2020
[3]

Describing Textures in the Wild

Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing Textures in the Wild. InCVPR, 2014

work page 2014
[4]

Imagenet: A Large-Scale Hierarchical Image Database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A Large-Scale Hierarchical Image Database. InCVPR, 2009

work page 2009
[5]

A normality test for multivariate dependent samples.Signal Processing, 201:108705, 2022

Sara El Bouch, Olivier Michel, and Pierre Comon. A normality test for multivariate dependent samples.Signal Processing, 201:108705, 2022

work page 2022
[6]

Joint normality test via two-dimensional projection

Sara ElBouch, Olivier JJ Michel, and Pierre Comon. Joint normality test via two-dimensional projection. InICASSP, 2022

work page 2022
[7]

Frus- tratingly easy test-time adaptation of vision-language models

Matteo Farina, Gianni Franchi, Giovanni Iacca, Massimiliano Mancini, and Elisa Ricci. Frus- tratingly easy test-time adaptation of vision-language models. InNeurIPS, 2024

work page 2024
[8]

Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories

Li Fei-Fei, Rob Fergus, and Pietro Perona. Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories. In CVPRW, 2004

work page 2004
[9]

Diverse data augmenta- tion with diffusions for effective test-time prompt tuning

Chun-Mei Feng, Kai Yu, Yong Liu, Salman Khan, and Wangmeng Zuo. Diverse data augmenta- tion with diffusions for effective test-time prompt tuning. InICCV, 2023

work page 2023
[10]

Online gaussian test-time adaptation of vision-language models.arXiv preprint arXiv:2501.04352, 2025

Clément Fuchs, Maxime Zanella, and Christophe De Vleeschouwer. Online gaussian test-time adaptation of vision-language models.arXiv preprint arXiv:2501.04352, 2025

work page arXiv 2025
[11]

Clip-adapter: Better vision-language models with feature adapters.IJCV, 132(2), 2024

Peng Gao, Shijie Geng, Renrui Zhang, Teli Ma, Rongyao Fang, Yongfeng Zhang, Hongsheng Li, and Yu Qiao. Clip-adapter: Better vision-language models with feature adapters.IJCV, 132(2), 2024

work page 2024
[12]

Dota: Distributional test-time adaptation of vision-language models.arXiv preprint arXiv:2409.19375, 2024

Zongbo Han, Jialong Yang, Junfan Li, Qinghua Hu, Qianli Xu, Mike Zheng Shou, and Changqing Zhang. Dota: Distributional test-time adaptation of vision-language models.arXiv preprint arXiv:2409.19375, 2024

work page arXiv 2024
[13]

Discriminant analysis by gaussian mixtures.Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(1):155–176, 1996

Trevor Hastie and Robert Tibshirani. Discriminant analysis by gaussian mixtures.Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(1):155–176, 1996

work page 1996
[14]

Patrick Helber, Benjamin Bischke, Andreas Dengel, and Damian Borth. EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(7):2217– 2226, 2019

work page 2019
[15]

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization. InCVPR, 2021. 10

work page 2021
[16]

Benchmarking Neural Network Robustness to Common Corruptions and Perturbations

Dan Hendrycks and Thomas Dietterich. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations. InICLR, 2019

work page 2019
[17]

Natural Adversarial Examples

Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, and Dawn Song. Natural Adversarial Examples. InCVPR, 2021

work page 2021
[18]

A class of invariant consistent tests for multivariate normality

Norbert Henze and Bernd Zirkler. A class of invariant consistent tests for multivariate normality. Communications in statistics-Theory and Methods, 19(10):3595–3617, 1990

work page 1990
[19]

Transductive inference for text classification using support vector machines

Thorsten Joachims. Transductive inference for text classification using support vector machines. InICML, 1999

work page 1999
[20]

Label propagation for zero-shot classification with vision-language models

Yannis Kalantidis, Giorgos Tolias, et al. Label propagation for zero-shot classification with vision-language models. InCVPR, 2024

work page 2024
[21]

Efficient test-time adaptation of vision-language models

Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, and Eric Xing. Efficient test-time adaptation of vision-language models. InCVPR, 2024

work page 2024
[22]

3D Object Representations for Fine-Grained Categorization

Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei. 3D Object Representations for Fine-Grained Categorization. InCVPRW, 2013

work page 2013
[23]

Estimation of the precision matrix of a singular wishart distribution and its application in high-dimensional data

Tatsuya Kubokawa and Muni S Srivastava. Estimation of the precision matrix of a singular wishart distribution and its application in high-dimensional data. 99(9):1906–1928, 2008

work page 1906
[24]

Ra-tta: Retrieval-augmented test-time adaptation for vision-language models

Youngjun Lee, Doyoung Kim, Junhyeok Kang, Jihwan Bang, Hwanjun Song, and Jae-Gil Lee. Ra-tta: Retrieval-augmented test-time adaptation for vision-language models. InICLR, 2025

work page 2025
[25]

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. InICML, 2022

work page 2022
[26]

Align Before Fuse: Vision and Language Representation Learning with Momentum Distillation

Junnan Li, Ramprasaath Selvaraju, Akhilesh Gotmare, Shafiq Joty, Caiming Xiong, and Steven Chu Hong Hoi. Align Before Fuse: Vision and Language Representation Learning with Momentum Distillation. InNeurIPS, 2021

work page 2021
[27]

Using discriminant analysis for multi-class classification: an experimental investigation.Knowledge and information systems, 10:453–472, 2006

Tao Li, Shenghuo Zhu, and Mitsunori Ogihara. Using discriminant analysis for multi-class classification: an experimental investigation.Knowledge and information systems, 10:453–472, 2006

work page 2006
[28]

Text and image are mutually beneficial: Enhancing training-free few-shot classification with clip

Yayuan Li, Jintao Guo, Lei Qi, Wenbin Li, and Yinghuan Shi. Text and image are mutually beneficial: Enhancing training-free few-shot classification with clip. InAAAI, 2025

work page 2025
[29]

Efficient and context-aware label propagation for zero-/few-shot training-free adaptation of vision-language model

Yushu Li, Yongyi Su, Adam Goodge, Kui Jia, and Xun Xu. Efficient and context-aware label propagation for zero-/few-shot training-free adaptation of vision-language model. InICLR, 2025

work page 2025
[30]

Learning to propagate labels: Transductive propagation network for few-shot learning

Yanbin Liu, Juho Lee, Minseop Park, Saehoon Kim, Eunho Yang, Sung Ju Hwang, and Yi Yang. Learning to propagate labels: Transductive propagation network for few-shot learning. InICLR, 2019

work page 2019
[31]

Swapprompt: Test-time prompt adaptation for vision-language models

Xiaosong Ma, Jie Zhang, Song Guo, and Wenchao Xu. Swapprompt: Test-time prompt adaptation for vision-language models. InNeurIPS, 2023

work page 2023
[32]

Fine-Grained Visual Classification of Aircraft

Subhransu Maji, Esa Rahtu, Juho Kannala, Matthew Blaschko, and Andrea Vedaldi. Fine- Grained Visual Classification of Aircraft.arXiv preprint arXiv:1306.5151, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013
[33]

Test-time prompt tuning for zero-shot generalization in vision-language models

Shu Manli, Nie Weili, Huang De-An, Yu Zhiding, Goldstein Tom, Anandkumar Anima, and Xiao Chaowei. Test-time prompt tuning for zero-shot generalization in vision-language models. InNeurIPS, 2022

work page 2022
[34]

Black-box test-time prompt tuning for vision-language models

Fan’an Meng, Chaoran Cui, Hongjun Dai, and Shuai Gong. Black-box test-time prompt tuning for vision-language models. InAAAI, 2025

work page 2025
[35]

A random-projection based test of gaussianity for stationary processes.Computational Statistics & Data Analysis, 75:124–141, 2014

Alicia Nieto-Reyes, Juan Antonio Cuesta-Albertos, and Fabrice Gamboa. A random-projection based test of gaussianity for stationary processes.Computational Statistics & Data Analysis, 75:124–141, 2014. 11

work page 2014
[36]

Automated Flower Classification over a Large Number of Classes

Maria-Elena Nilsback and Andrew Zisserman. Automated Flower Classification over a Large Number of Classes. InICVGIP. IEEE, 2008

work page 2008
[37]

Cats and Dogs

Omkar M Parkhi, Andrea Vedaldi, Andrew Zisserman, and CV Jawahar. Cats and Dogs. In CVPR, 2012

work page 2012
[38]

The matrix cookbook.Technical University of Denmark, 7(15):510, 2008

Kaare Brandt Petersen, Michael Syskind Pedersen, et al. The matrix cookbook.Technical University of Denmark, 7(15):510, 2008

work page 2008
[39]

Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. InICML, 2021

work page 2021
[40]

Do imagenet classifiers generalize to imagenet? InICML, 2019

Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. Do imagenet classifiers generalize to imagenet? InICML, 2019

work page 2019
[41]

An extension of shapiro and wilk’s w test for normality to large samples

J Patrick Royston. An extension of shapiro and wilk’s w test for normality to large samples. Journal of the Royal Statistical Society: Series C (Applied Statistics), 31(2):115–124, 1982

work page 1982
[42]

Align your prompts: Test-time prompting with distribution alignment for zero-shot generalization

Jameel Hassan Abdul Samadh, Hanan Gani, Noor Hazim Hussein, Muhammad Uzair Khattak, Muzammal Naseer, Fahad Khan, and Salman Khan. Align your prompts: Test-time prompting with distribution alignment for zero-shot generalization. InNeurIPS, 2023

work page 2023
[43]

An analysis of variance test for normality.Biometrika, 52(3):591– 611, 1965

S Shaphiro and MBJB Wilk. An analysis of variance test for normality.Biometrika, 52(3):591– 611, 1965

work page 1965
[44]

High-dimensional linear discriminant analysis classifier for spiked covariance model.Journal of Machine Learning Research, 21(112):1–24, 2020

Houssem Sifaou, Abla Kammoun, and Mohamed-Slim Alouini. High-dimensional linear discriminant analysis classifier for spiked covariance model.Journal of Machine Learning Research, 21(112):1–24, 2020

work page 2020
[45]

UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

Khurram Soomro, Amir Roshan Zamir, and Mubarak Shah. UCF101: A Dataset of 101 Human Actions Classes from Videos in the Wild.arXiv preprint arXiv:1212.0402, 2012

work page internal anchor Pith review Pith/arXiv arXiv 2012
[46]

Just shift it: Test-time prototype shifting for zero-shot generalization with vision-language models

Elaine Sui, Xiaohan Wang, and Serena Yeung-Levy. Just shift it: Test-time prototype shifting for zero-shot generalization with vision-language models. InWACV. IEEE, 2025

work page 2025
[47]

Sus-x: Training-free name-only transfer of vision-language models

Vishaal Udandarao, Ankush Gupta, and Samuel Albanie. Sus-x: Training-free name-only transfer of vision-language models. InICCV, 2023

work page 2023
[48]

Discriminative gaussian process latent variable model for classification

Raquel Urtasun and Trevor Darrell. Discriminative gaussian process latent variable model for classification. InICML, 2007

work page 2007
[49]

Tent: Fully test-time adaptation by entropy minimization

Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Olshausen, and Trevor Darrell. Tent: Fully test-time adaptation by entropy minimization. InICLR, 2021

work page 2021
[50]

Learning Robust Global Representations by Penalizing Local Rredictive Power

Haohan Wang, Songwei Ge, Zachary Lipton, and Eric P Xing. Learning Robust Global Representations by Penalizing Local Rredictive Power. InNeurIPS, 2019

work page 2019
[51]

A hard-to-beat baseline for training-free clip-based adaptation

Zhengbo Wang, Jian Liang, Lijun Sheng, Ran He, Zilei Wang, and Tieniu Tan. A hard-to-beat baseline for training-free clip-based adaptation. InICLR, 2024

work page 2024
[52]

Is less more? exploring token condensation as training-free adaptation for clip

Zixin Wang, Dong Gong, Sen Wang, Zi Huang, and Yadan Luo. Is less more? exploring token condensation as training-free adaptation for clip. InICCV, 2025

work page 2025
[53]

Sun Database: Large-Scale Scene Recognition from Abbey to Zoo

Jianxiong Xiao, James Hays, Krista A Ehinger, Aude Oliva, and Antonio Torralba. Sun Database: Large-Scale Scene Recognition from Abbey to Zoo. InCVPR, 2010

work page 2010
[54]

Dynaprompt: Dynamic test-time prompt tuning

Zehao Xiao, Shilin Yan, Jack Hong, Jiayin Cai, Xiaolong Jiang, Yao Hu, Jiayi Shen, Qi Wang, and Cees GM Snoek. Dynaprompt: Dynamic test-time prompt tuning. InICLR, 2025

work page 2025
[55]

C-tpt: Calibrated test-time prompt tuning for vision-language models via text feature dispersion

Hee Suk Yoon, Eunseop Yoon, Joshua Tian Jin Tee, Mark Hasegawa-Johnson, Yingzhen Li, and Chang D Yoo. C-tpt: Calibrated test-time prompt tuning for vision-language models via text feature dispersion. InICLR, 2024. 12

work page 2024
[56]

Task residual for tuning vision- language models

Tao Yu, Zhihe Lu, Xin Jin, Zhibo Chen, and Xinchao Wang. Task residual for tuning vision- language models. InCVPR, 2023

work page 2023
[57]

On the test-time zero-shot generalization of vision- language models: Do we really need prompt learning? InCVPR, 2024

Maxime Zanella and Ismail Ben Ayed. On the test-time zero-shot generalization of vision- language models: Do we really need prompt learning? InCVPR, 2024

work page 2024
[58]

Realistic test-time adaptation of vision-language models

Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer, and Ismail Ben Ayed. Realistic test-time adaptation of vision-language models. InCVPR, 2025

work page 2025
[59]

Boosting vision-language models with transduction

Maxime Zanella, Benoît Gérin, and Ismail Ayed. Boosting vision-language models with transduction. InNeurIPS, 2024

work page 2024
[60]

Boosting vision-language models for histopathology classification: Predict all at once

Maxime Zanella, Fereshteh Shakeri, Yunshi Huang, Houda Bahig, and Ismail Ben Ayed. Boosting vision-language models for histopathology classification: Predict all at once. InJ. Multivar . Anal., 2024

work page 2024
[61]

Dual prototype evolving for test-time generalization of vision-language models

Ce Zhang, Simon Stepputtis, Katia Sycara, and Yaqi Xie. Dual prototype evolving for test-time generalization of vision-language models. InNeurIPS, 2024

work page 2024
[62]

Historical test-time prompt tuning for vision foundation models

Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Ling Shao, and Shijian Lu. Historical test-time prompt tuning for vision foundation models. InNeurIPS, 2024

work page 2024
[63]

Tip-adapter: Training-free adaption of clip for few-shot classification

Renrui Zhang, Wei Zhang, Rongyao Fang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, and Hongsheng Li. Tip-adapter: Training-free adaption of clip for few-shot classification. InECCV. Springer, 2022

work page 2022
[64]

Taolin Zhang, Jinpeng Wang, Hang Guo, Tao Dai, Bin Chen, and Shu-Tao Xia. Boostadapter: Improving vision-language test-time adaptation via regional bootstrapping. In NeurIPS, 2024
[65]

Yabin Zhang, Wenjie Zhu, Hui Tang, Zhiyuan Ma, Kaiyang Zhou, and Lei Zhang. Dual memory networks: A versatile adaptation approach for vision-language models. In CVPR, 2024
[66]

Dengyong Zhou, Olivier Bousquet, Thomas Lal, Jason Weston, and Bernhard Schölkopf. Learning with local and global consistency. In NeurIPS, 2003
[67]

Lihua Zhou, Mao Ye, Shuaifeng Li, Nianxin Li, Xiatian Zhu, Lei Deng, Hongbin Liu, and Zhen Lei. Bayesian test-time adaptation for vision-language models. In CVPR, 2025
[68]

Xiangyang Zhu, Renrui Zhang, Bowei He, Aojun Zhou, Dong Wang, Bin Zhao, and Peng Gao. Not all features matter: Enhancing few-shot clip with adaptive prior refinement. In ICCV, 2023
[69]

Xingyu Zhu, Beier Zhu, Yi Tan, Shuo Wang, Yanbin Hao, and Hanwang Zhang. Enhancing zero-shot vision models by label-free prompt distribution learning and bias correcting. In NeurIPS, 2024
[70]

Yuhan Zhu, Yuyang Ji, Zhiyu Zhao, Gangshan Wu, and Limin Wang. Awt: Transferring vision-language models via augmentation, weighting, and transportation. In NeurIPS, 2024
[71]

Yuhan Zhu, Guozhen Zhang, Chen Xu, Haocheng Shen, Xiaoxin Chen, Gangshan Wu, and Limin Wang. Efficient test-time prompt tuning for vision-language models. arXiv preprint arXiv:2408.05775, 2024
[72]

Imtiaz Ziko, Jose Dolz, Eric Granger, and Ismail Ben Ayed. Laplacian regularized few-shot learning. In ICML, 2020

Technical Appendices and Supplementary Material

This appendix provides a detailed theoretical analysis of our method, along with additional experimental results. The contents are organized as follows:

• Appendix A: Theoretical Analysis A.1...
provides modest improvements (e.g., +2.43% on Task 1), likely due to better-aligned class centers. However, updating only Σ (Row 2) leads to substantial performance drops (e.g., down to 9.58% on Task 2), indicating that estimating covariance from noisy test-time predictions alone is highly unstable and unreliable. The lower block (Rows 5–8) introduce...
