arxiv: 2603.27139 · v2 · submitted 2026-03-28 · 💻 cs.CV

The Geometry of Robustness: Optimizing Loss Landscape Curvature and Feature Manifold Alignment for Robust Finetuning of Vision-Language Models

Pith reviewed 2026-05-14 22:14 UTC · model grok-4.3

classification 💻 cs.CV
keywords robust fine-tuning · vision-language models · loss landscape curvature · feature alignment · adversarial robustness · out-of-distribution generalization · CLIP models · PAC-Bayes

The pith

GRACE fine-tunes vision-language models by flattening loss curvature and aligning features to gain ID and adversarial accuracy without losing OOD performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that fine-tuning vision-language models creates a three-way trade-off between in-distribution accuracy, out-of-distribution generalization, and adversarial robustness. This trade-off arises because optimization lands in sharp anisotropic minima and produces feature representations that distort under small changes. GRACE counters both problems at once with curvature-scaled weight perturbations that flatten the loss surface and a feature-alignment loss that keeps representations consistent across clean, adversarial, and shifted inputs. A sympathetic reader cares because existing methods improve at most two of the three goals, leaving deployed VLMs either inaccurate or brittle. On ImageNet fine-tuning of CLIP models the method raises ID accuracy 10.8 percent and adversarial accuracy 13.5 percent while holding OOD accuracy at 57.0 percent, nearly matching the zero-shot baseline.

Core claim

GRACE, grounded in Robust PAC-Bayes theory, jointly regularizes parameter-space curvature through adaptive weight perturbations scaled by local curvature estimates and enforces feature-space invariance with an alignment loss across clean, adversarial, and OOD inputs. On ImageNet fine-tuning of CLIP models this produces 10.8 percent higher ID accuracy, 13.5 percent higher adversarial accuracy, and 57.0 percent OOD accuracy versus the 57.4 percent zero-shot baseline. Geometric analysis shows the resulting minima are flatter and the learned features remain undistorted across distribution shifts.

What carries the argument

GRACE framework that applies adaptive curvature-scaled perturbations to promote flat minima together with a Gram-aligned feature invariance loss.
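
The idea of a curvature-scaled perturbation can be illustrated with a generic sharpness-aware update in the spirit of SAM [4]. The sketch below is not GRACE's actual update rule: the toy quadratic loss, the known per-coordinate curvature `CURVATURE`, the function name `curvature_scaled_step`, and the values of `rho` and `lr` are all assumptions chosen for illustration.

```python
import math

# Toy anisotropic quadratic loss: L(w) = 0.5 * sum(h_i * w_i^2).
# CURVATURE holds the per-coordinate curvature h_i (assumed known here).
CURVATURE = [100.0, 1.0]

def loss(w):
    return 0.5 * sum(h * x * x for h, x in zip(CURVATURE, w))

def grad(w):
    return [h * x for h, x in zip(CURVATURE, w)]

def curvature_scaled_step(w, rho=0.05, lr=0.005):
    """One SAM-like update whose ascent perturbation is rescaled by curvature."""
    g = grad(w)
    # Weight the ascent direction by curvature so sharp directions are probed harder.
    scaled = [h * gi for h, gi in zip(CURVATURE, g)]
    norm = math.sqrt(sum(s * s for s in scaled)) or 1.0
    eps = [rho * s / norm for s in scaled]
    # The gradient at the perturbed weights drives the actual descent step.
    g_perturbed = grad([x + e for x, e in zip(w, eps)])
    return [x - lr * gp for x, gp in zip(w, g_perturbed)]

w = [1.0, 1.0]
initial_loss = loss(w)
for _ in range(200):
    w = curvature_scaled_step(w)
final_loss = loss(w)
```

In a real fine-tuning run the curvature would have to be estimated rather than read off, and the perturbation would be applied per layer to the model weights.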

If this is right

  • GRACE converges to flatter minima in the loss landscape.
  • Feature representations stay consistent without distortion under adversarial perturbations and distribution shifts.
  • ID accuracy, adversarial robustness, and OOD generalization improve simultaneously on CLIP ImageNet fine-tuning.
  • The approach supplies a principled route to generalized robustness in foundation VLMs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Curvature regularization may transfer to other multimodal or language-only foundation models facing similar optimization instabilities.
  • Tracking curvature during training could become a practical diagnostic for whether a fine-tuning run is likely to preserve OOD behavior.
  • The same alignment loss might stabilize representations in non-adversarial continual-learning or domain-adaptation settings.

Load-bearing premise

The three-way robustness trade-off stems from sharp anisotropic minima in parameter space and unstable feature representations that deform under perturbation.

What would settle it

Measuring the Hessian or curvature metrics of a GRACE-trained model and finding no reduction in sharpness relative to standard fine-tuning would falsify the claimed geometric mechanism.
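
One way to run such a check is to estimate the largest Hessian eigenvalue with power iteration on Hessian-vector products. The sketch below is a dependency-free illustration on a toy quadratic loss; the names `hvp` and `top_eigenvalue`, the finite-difference step size, and the toy gradient are assumptions for illustration, not the paper's measurement procedure.

```python
import math

def grad(w):
    # Gradient of the toy loss L(w) = 0.5 * (3*w0^2 + w1^2) + w0*w1.
    return [3.0 * w[0] + w[1], w[0] + w[1]]

def hvp(grad_fn, w, v, step=1e-4):
    """Hessian-vector product via central finite differences of the gradient."""
    plus = grad_fn([x + step * d for x, d in zip(w, v)])
    minus = grad_fn([x - step * d for x, d in zip(w, v)])
    return [(p - m) / (2.0 * step) for p, m in zip(plus, minus)]

def top_eigenvalue(grad_fn, w, iters=100):
    """Power iteration on the Hessian at w; estimates the dominant eigenvalue."""
    v = [1.0 / math.sqrt(len(w))] * len(w)
    lam = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        lam = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient for unit v
        norm = math.sqrt(sum(x * x for x in hv))
        v = [x / norm for x in hv]
    return lam

lam_max = top_eigenvalue(grad, [0.5, -0.2])
```

Power iteration returns the eigenvalue of largest magnitude, which equals λ_max only when the Hessian is positive semi-definite at the evaluated point. Comparing this quantity between a GRACE checkpoint and a standard fine-tuning checkpoint would be one instance of the test described above.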

Figures

Figures reproduced from arXiv: 2603.27139 by Brisa Maneechotesuwan, Chengyue Huang, Shaunak Halbe, Shivang Chopra, Zsolt Kira.

Figure 1
Figure 1: The VLM robustness three-way tradeoff. Existing robust fine-tuning strategies resolve at most two of {ID, OOD, adversarial} robustness simultaneously, leaving a gap in generalized robustness. GRACE is designed to close this gap.
… shifts (OOD), and (iii) resisting gradient-based adversarial attacks. Standard fine-tuning often collapses at least one of these axes [26, 28, 30, 38], making generalized robustne…
Figure 2
Figure 2: (a) Feature Distribution Analysis: 3D projection of image features for in-distribution (f_ID), OOD (f_OOD), and PGD adversarial inputs (f_Adv) of the same class. (b) Loss Landscape Analysis: 3D/2D loss slices around the converged solutions for each method, using shared perturbation directions.

Method         λ_max (×10^4)   ‖H‖_F / √d (×10^2)
FT             3.5             0.89
WiSE-FT (S1)   3.3             0.78
TeCoA (S2)     1.8             0.52
GRACE          1.6             0.43
Figure 4
Figure 4: LAR-AWP rank curriculum. A diagonal rank mask controls the effective perturbation rank per layer. Curvature estimates h_W (from mini-batch gradients) are used to assign higher perturbation ranks to sharper layers, focusing smoothing where the loss landscape is steepest.
5.3. Layer-Wise Adaptive Low-Rank AWP (LAR-AWP)
To control the expected robust sharpness term in Eq. (2), GRACE injects adversarial weight…
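
The rank curriculum in the Figure 4 caption can be pictured as a mapping from per-layer curvature estimates to integer ranks, each realized as a diagonal 0/1 mask. The sketch below is a hypothetical linear mapping: `allocate_ranks`, `rank_mask`, the bounds `r_min` and `r_max`, and the curvature values are illustrative assumptions, since the excerpt does not give the paper's actual allocation rule.

```python
def allocate_ranks(curvatures, r_min=1, r_max=8):
    """Map per-layer curvature estimates to integer perturbation ranks."""
    lo, hi = min(curvatures), max(curvatures)
    span = (hi - lo) or 1.0
    ranks = []
    for h in curvatures:
        frac = (h - lo) / span  # 0 for the flattest layer, 1 for the sharpest
        ranks.append(r_min + round(frac * (r_max - r_min)))
    return ranks

def rank_mask(rank, r_max=8):
    """Diagonal 0/1 mask that keeps the first `rank` of r_max directions."""
    return [1.0 if i < rank else 0.0 for i in range(r_max)]

layer_curvature = [0.2, 1.5, 0.9, 3.0]  # illustrative values, not from the paper
ranks = allocate_ranks(layer_curvature)
masks = [rank_mask(r) for r in ranks]
```

Sharper layers receive more active directions in their mask, which matches the caption's description of focusing smoothing where the loss landscape is steepest.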
Figure 5
Figure 5: Gram-volume feature alignment. For each input, GRACE compares clean, adversarial, and LAR-AWP-perturbed image embeddings via a small Gram matrix. The Gram-volume loss encourages these three vectors to remain close to each other (low volume) while preserving separation across different classes.
… where n_v is the validation mini-batch size. We maintain an exponential moving average of h_W for each layer and com…
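
The "volume" in the Figure 5 caption can be read as the volume of the parallelepiped spanned by the three embeddings, which is the square root of the determinant of their 3×3 Gram matrix. The sketch below computes that quantity for unit-normalized vectors; the function names and example embeddings are assumptions, and the paper's loss may add terms (such as the class-separation component) that are not shown here.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def gram_matrix(vectors):
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gram_volume(clean, adv, awp):
    """Volume spanned by three unit-normalized embeddings (0 when collinear)."""
    g = gram_matrix([normalize(clean), normalize(adv), normalize(awp)])
    return math.sqrt(max(det3(g), 0.0))  # clamp tiny negative round-off

# Nearly parallel embeddings give a small volume; orthogonal ones give volume 1.
aligned = gram_volume([1.0, 0.0, 0.0, 0.0], [0.99, 0.05, 0.0, 0.0], [0.98, 0.0, 0.06, 0.0])
spread = gram_volume([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0])
```

Minimizing this volume pulls the clean, adversarial, and weight-perturbed embeddings of one input toward a common direction.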
Figure 6
Figure 6: Pareto Curve. GRACE (Red) achieves +7 performance gain while being 1.4× faster than prior adversarial methods.
… achieves the best average, demonstrating the complementary effects of representation consistency and curvature regularization. These results confirm that both geometric components are critical for stabilizing representations and mitigating catastrophic forgetting during fine-tuning.
6.5. Compu…
Figure 7
Figure 7: Layerwise curvature anisotropy in CLIP. Normalized Hutchinson curvature κ_ℓ for each transformer block of CLIP ViT-B/32, ViT-B/16, and ViT-L/14. All models exhibit substantial variation in curvature across depth, indicating strong layerwise geometric heterogeneity.
B. Preliminaries
B.1. Image Classification with Vision–Language Models
In a K-class image classification problem with inputs x ∈ X and labels …
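
The Hutchinson curvature named in the Figure 7 caption presumably builds on Hutchinson's stochastic trace estimator, which approximates tr(H) by averaging zᵀHz over random Rademacher vectors z. The sketch below applies the estimator to a small explicit matrix standing in for a layer's Hessian. The matrix, probe count, and function names are assumptions, and the per-layer normalization used for κ_ℓ is not specified in the excerpt, so it is omitted.

```python
import random

def matvec(matrix, v):
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]

def hutchinson_trace(hvp, dim, num_probes=2000, seed=0):
    """Estimate tr(H) from Hessian-vector products with Rademacher probes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        total += sum(a * b for a, b in zip(z, hz))  # z^T H z
    return total / num_probes

# Small symmetric matrix standing in for one layer's Hessian (illustrative only).
H = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 2.0]]
estimate = hutchinson_trace(lambda v: matvec(H, v), dim=3)
exact = sum(H[i][i] for i in range(3))
```

For a neural network the `hvp` callable would come from automatic differentiation, so the full Hessian never needs to be formed.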
Figure 8
Figure 8: Schematic of LoRA weight updates. Weight updates are parameterized by low-rank matrices A and B of rank r, constraining deviations from the pre-trained weights to a low-dimensional subspace.
… reduces the number of trainable parameters, improves efficiency, and implicitly constrains the adaptation to remain close to the pre-trained solution, consistent with the KL term in our PAC-Bayesian analysis. The ove…
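
The low-rank parameterization in the Figure 8 caption follows LoRA [8], where the effective weight is the frozen pre-trained matrix plus a scaled product of two thin matrices. The sketch below is a dependency-free illustration; the helper names, the `alpha` scaling default, and the example matrices are assumptions rather than values from the paper.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def lora_weight(w0, b, a, alpha=1.0):
    """Effective weight W0 + (alpha / r) * B @ A for a rank-r update."""
    r = len(a)  # A has shape r x d_in, so its row count is the rank
    delta = matmul(b, a)
    scale = alpha / r
    return [[w0[i][j] + scale * delta[i][j] for j in range(len(w0[0]))]
            for i in range(len(w0))]

W0 = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]  # frozen pre-trained weight (2 x 3)
B = [[0.5], [-1.0]]                       # trainable, shape 2 x r with r = 1
A = [[1.0, 2.0, 0.0]]                     # trainable, shape r x 3
W = lora_weight(W0, B, A)
W_at_init = lora_weight(W0, [[0.0], [0.0]], A)  # B = 0 leaves W0 unchanged
```

Only `A` and `B` are trained, so the update to a d_out × d_in weight costs r · (d_out + d_in) parameters instead of d_out · d_in.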
Figure 9
Figure 9: 3D visualization of CLIP feature geometry under different shifts. Using normalized embeddings projected onto the unit sphere, we compare ID samples, natural adversarial (Nat Adv) variants, OOD samples, and features obtained under AWP perturbations. AWP produces feature displacements that closely follow the structure of real OOD and natural adversarial shifts, confirming that curvature-aligned weight per…
Figure 11
Figure 11: Evolution of Layerwise AWP Rank During Training. The heatmap shows the adaptive adversarial weight perturbation (AWP) rank assigned to each transformer block across training epochs.
To understand how GRACE allocates adversarial perturbation capacity across the network, we analyze the temporal progression of the learned AWP rank r_AWP^(ℓ) and relate it to the curvature structure of the CLIP ViT-B/32 …
Original abstract

Fine-tuning approaches for Vision-Language Models (VLMs) face a critical three-way trade-off between In-Distribution (ID) accuracy, Out-of-Distribution (OOD) generalization, and adversarial robustness. Existing robust fine-tuning strategies resolve at most two axes of this trade-off. Generalization-preserving methods retain ID/OOD performance but leave models vulnerable to adversarial attacks, while adversarial training improves robustness to targeted attacks but degrades ID/OOD accuracy. Our key insight is that the robustness trade-off stems from two geometric failures: sharp, anisotropic minima in parameter space and unstable feature representations that deform under perturbation. To address this, we propose GRACE (Gram-aligned Robustness via Adaptive Curvature Estimation), a unified fine-tuning framework that jointly regularizes the parameter-space curvature and feature-space invariance for VLMs. Grounded in Robust PAC-Bayes theory, GRACE employs adaptive weight perturbations scaled by local curvature to promote flatter minima, combined with a feature alignment loss that maintains representation consistency across clean, adversarial, and OOD inputs. On ImageNet fine-tuning of CLIP models, GRACE simultaneously improves ID accuracy by 10.8%, and adversarial accuracy by 13.5% while maintaining 57.0% OOD accuracy (vs. 57.4% zero-shot baseline). Geometric analysis confirms that GRACE converges to flatter minima without feature distortion across distribution shifts, providing a principled step toward generalized robustness in foundation VLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that the three-way trade-off in VLM fine-tuning (ID accuracy, OOD generalization, adversarial robustness) arises from sharp anisotropic minima in parameter space and unstable feature representations under perturbation. GRACE addresses this via a unified framework grounded in Robust PAC-Bayes theory: adaptive curvature-regularizing weight perturbations to promote flatter minima, combined with a feature alignment loss enforcing representation consistency across clean, adversarial, and OOD inputs. On ImageNet fine-tuning of CLIP models, it reports simultaneous gains of +10.8% ID accuracy and +13.5% adversarial accuracy while holding OOD accuracy at 57.0% (vs. 57.4% zero-shot baseline), with geometric analysis confirming flatter minima without feature distortion.

Significance. If the empirical gains and theoretical grounding hold under full verification, this would be a meaningful contribution to robust fine-tuning of foundation VLMs by providing a geometric diagnosis and joint regularization strategy that resolves the typical trade-off, moving beyond methods that improve at most two axes.

major comments (2)
  1. Abstract and experimental section: the central claims of +10.8% ID and +13.5% adversarial accuracy improvements (with OOD near baseline) are presented without error bars, statistical significance tests, ablation tables, or a complete experimental protocol (e.g., exact OOD datasets, perturbation budgets, hyperparameter ranges), which is load-bearing for assessing whether the three-way improvement is reproducible and not an artifact of selective reporting.
  2. Theory section (Robust PAC-Bayes grounding): without the explicit derivations of the adaptive weight perturbations scaled by local curvature, it remains unclear whether the regularization terms are parameter-free or reduce by construction to quantities fitted on the target data, raising a potential circularity risk for the claimed geometric benefits.
minor comments (1)
  1. Notation for the feature alignment loss and curvature estimator should be defined more explicitly with respect to the VLM components (e.g., vision encoder vs. text encoder) to improve clarity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and have revised the manuscript to strengthen the experimental reporting and theoretical derivations.

Point-by-point responses
  1. Referee: Abstract and experimental section: the central claims of +10.8% ID and +13.5% adversarial accuracy improvements (with OOD near baseline) are presented without error bars, statistical significance tests, ablation tables, or a complete experimental protocol (e.g., exact OOD datasets, perturbation budgets, hyperparameter ranges), which is load-bearing for assessing whether the three-way improvement is reproducible and not an artifact of selective reporting.

    Authors: We agree that comprehensive experimental details are essential for reproducibility. In the revised manuscript we have added error bars from five independent runs with different random seeds, included paired t-test results confirming statistical significance (p < 0.01) for the reported gains, expanded the ablation tables to cover each GRACE component, and provided a complete experimental protocol in the main text and appendix. This protocol specifies the exact OOD datasets (ImageNet-A, ImageNet-R, ImageNet-V2), perturbation budgets (PGD with ε = 8/255 and 10 steps), and hyperparameter ranges used for tuning. revision: yes

  2. Referee: Theory section (Robust PAC-Bayes grounding): without the explicit derivations of the adaptive weight perturbations scaled by local curvature, it remains unclear whether the regularization terms are parameter-free or reduce by construction to quantities fitted on the target data, raising a potential circularity risk for the claimed geometric benefits.

    Authors: We thank the referee for this observation. The original theory section presented the high-level PAC-Bayes motivation but omitted the full derivations for space reasons. We have now inserted the explicit step-by-step derivations in Section 3, showing that the adaptive perturbations are obtained from an online local Hessian-trace approximation computed during training and are not fitted post-hoc on the target data. The resulting regularization terms follow directly from the Robust PAC-Bayes bound and remain parameter-free in their core formulation; only standard validation-based selection is used for the few scalar hyperparameters. revision: yes
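
The attack budget quoted in the first response (PGD with ε = 8/255 and 10 steps) can be made concrete with a small L∞ projected-gradient attack. The sketch below targets a toy linear classifier with logistic loss instead of CLIP; the step size `ALPHA`, the classifier, and the example input are assumptions, since the response states only ε and the step count.

```python
import math

EPS = 8.0 / 255.0    # L-infinity budget quoted in the rebuttal
STEPS = 10           # number of PGD steps quoted in the rebuttal
ALPHA = 2.0 / 255.0  # step size: an assumption, not stated in the source

def logistic_loss_grad(x, y, w, b):
    """Gradient of log(1 + exp(-y * (w.x + b))) with respect to the input x."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]

def sign(value):
    return (value > 0) - (value < 0)

def pgd_linf(x, y, w, b):
    """Iterated signed-gradient ascent on the loss, projected to the eps-ball."""
    x_adv = list(x)
    for _ in range(STEPS):
        g = logistic_loss_grad(x_adv, y, w, b)
        stepped = [xa + ALPHA * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project into the eps-ball around x, then into the pixel range [0, 1].
        x_adv = [min(max(s, xi - EPS, 0.0), xi + EPS, 1.0) for s, xi in zip(stepped, x)]
    return x_adv

x = [0.5, 0.2, 0.9]   # toy "image" with three pixels in [0, 1]
w = [2.0, -1.0, 0.5]  # toy linear classifier weights
b = -0.3
x_adv = pgd_linf(x, 1, w, b)
```

Each pixel moves by at most ε from its clean value, and the classifier's margin on the true label drops, which is the quantity adversarial accuracy measures at scale.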

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's central argument is an empirical claim: GRACE, motivated by geometric diagnosis of sharp minima and unstable features and grounded in external Robust PAC-Bayes theory, yields simultaneous gains in ID accuracy (+10.8%), adversarial accuracy (+13.5%), and near-baseline OOD accuracy on ImageNet-CLIP fine-tuning. No load-bearing derivation step is shown to reduce to its own inputs by construction. The abstract and skeptic summary present the method as jointly regularizing curvature (via adaptive perturbations) and feature alignment, with reported numbers as direct experimental outcomes rather than fitted predictions renamed as results. No self-citation chains, ansatzes smuggled via prior work, or self-definitional quantities appear in the provided text. The framework is therefore treated as self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only the abstract is available, so free parameters, axioms, and invented entities cannot be exhaustively audited; the method invokes Robust PAC-Bayes theory as grounding.

axioms (1)
  • domain assumption: Robust PAC-Bayes theory supplies valid bounds for robustness under perturbation
    Invoked to justify adaptive curvature regularization and feature alignment.

pith-pipeline@v0.9.0 · 5582 in / 1085 out tokens · 28976 ms · 2026-05-14T22:14:34.864229+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

  1. [1]

    Analysis of representations for domain adaptation

    Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira. Analysis of representations for domain adaptation. In Advances in Neural Information Processing Systems. MIT Press, 2006.

  2. [2]

    Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

    Francesco Croce and Matthias Hein. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. ICML, 2020.

  3. [3]

    Imagenet: A large-scale hierarchical image database

    Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009.

  4. [4]

    Sharpness-aware minimization for efficiently improving generalization

    Pierre Foret, Ariel Kleiner, Hossein Mobahi, and Behnam Neyshabur. Sharpness-aware minimization for efficiently improving generalization. In ICLR, 2021.

  5. [5]

    Finetune like you pretrain: Improved finetuning of zero-shot vision models

    Sachin Goyal, Ananya Kumar, Sankalp Garg, Zico Kolter, and Aditi Raghunathan. Finetune like you pretrain: Improved finetuning of zero-shot vision models. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 19338–19347, 2023.

  6. [6]

    The many faces of robustness: A critical analysis of out-of-distribution generalization

    Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Lixuan Zhu, Samyak Parajuli, Mike Guo, Dawn Xiaodong Song, Jacob Steinhardt, and Justin Gilmer. The many faces of robustness: A critical analysis of out-of-distribution generalization. 2021 IEEE/CVF International Conference on Computer Vision (IC...

  7. [7]

    Natural adversarial examples

    Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, and Dawn Song. Natural adversarial examples. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15262–15271,

  8. [8]

    LoRA: Low-rank adaptation of large language models

    Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In ICLR, 2022.

  9. [9]

    Directional gradient projection for robust fine-tuning of foundation models

    Chengyue Huang, Junjiao Tian, Brisa Maneechotesuwan, Shivang Chopra, and Zsolt Kira. Directional gradient projection for robust fine-tuning of foundation models. In The Thirteenth International Conference on Learning Representations, 2025.

  10. [10]

    Scaling up visual and vision-language representation learning with noisy text supervision

    Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc Le, Yun-Hsuan Sung, Zhen Li, and Tom Duerig. Scaling up visual and vision-language representation learning with noisy text supervision. In International Conference on Machine Learning, pages 4904–4916. PMLR,

  11. [11]

    Visualizing the loss landscape of neural nets

    Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, and Tom Goldstein. Visualizing the loss landscape of neural nets. In Neural Information Processing Systems, 2018.

  12. [12]

    Rethinking natural adversarial examples for classification models

    Xiao Li, Jianmin Li, Ting Dai, Jie Shi, Jun Zhu, and Xiaolin Hu. Rethinking natural adversarial examples for classification models, 2021.

  13. [13]

    Language-driven anchors for zero-shot adversarial robustness

    Xiao Li, Wei Zhang, Yining Liu, Zhanhao Hu, Bo Zhang, and Xiaolin Hu. Language-driven anchors for zero-shot adversarial robustness. In CVPR, 2024.

  14. [14]

    Sophia: A scalable stochastic second-order optimizer for language model pre-training

    Hong Liu, Zhiyuan Li, David Leo Wright Hall, Percy Liang, and Tengyu Ma. Sophia: A scalable stochastic second-order optimizer for language model pre-training. In The Twelfth International Conference on Learning Representations, 2024.

  15. [15]

    Characterizing adversarial subspaces using local intrinsic dimensionality

    Xingjun Ma, Bo Li, Yisen Wang, Sarah M. Erfani, Sudanthi Wijewickrema, Grant Schoenebeck, Dawn Song, Michael E. Houle, and James Bailey. Characterizing adversarial subspaces using local intrinsic dimensionality, 2018.

  16. [16]

    Towards deep learning models resistant to adversarial attacks

    Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.

  17. [17]

    Understanding zero-shot adversarial robustness for large-scale models

    Chengzhi Mao, Scott Geng, Junfeng Yang, Xin Wang, and Carl Vondrick. Understanding zero-shot adversarial robustness for large-scale models. In The Eleventh International Conference on Learning Representations, 2023.

  18. [18]

    A PAC-Bayesian tutorial with a dropout bound

    David McAllester. A PAC-Bayesian tutorial with a dropout bound, 2013.

  19. [19]

    Lipsum-FT: Robust fine-tuning of zero-shot models using random text guidance

    Giung Nam, Byeongho Heo, and Juho Lee. Lipsum-FT: Robust fine-tuning of zero-shot models using random text guidance. In The Twelfth International Conference on Learning Representations, 2024.

  20. [20]

    Exploring generalization in deep learning

    Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester, and Nathan Srebro. Exploring generalization in deep learning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 5949–5958, Red Hook, NY, USA, 2017. Curran Associates Inc.

  21. [21]

    A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks

    Behnam Neyshabur, Srinadh Bhojanapalli, and Nathan Srebro. A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks, 2018.

  22. [22]

    PyTorch: an imperative style, high-perfo...

    Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: an imperative style, high-perfo...

  23. [23]

    Combined scaling for zero-shot transfer learning

    Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, and Quoc V. Le. Combined scaling for zero-shot transfer learning. Neurocomput., 555(C), 2023.

  24. [24]

    Learning transferable visual models from natural language supervision

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In Proceedings of the 38th International Conference on Machine Learning, pages 8748–8763. PMLR, 2021....

  25. [25]

    Do ImageNet classifiers generalize to ImageNet?

    Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. Do ImageNet classifiers generalize to ImageNet? In Proceedings of the 36th International Conference on Machine Learning, pages 5389–5400. PMLR, 2019.

  26. [26]

    Robust CLIP: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models

    Christian Schlarmann, Naman Deep Singh, Francesco Croce, and Matthias Hein. Robust CLIP: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models. In Proceedings of the 41st International Conference on Machine Learning, pages 43685–43704. PMLR, 2024.

  27. [27]

    Improving the generalization of adversarial training with domain adaptation

    Chuanbiao Song, Kun He, Liwei Wang, and John E. Hopcroft. Improving the generalization of adversarial training with domain adaptation. In International Conference on Learning Representations, 2019.

  28. [28]

    Trainable projected gradient method for robust fine-tuning

    Junjiao Tian, Xiaoliang Dai, Chih-Yao Ma, Zecheng He, Yen-Cheng Liu, and Zsolt Kira. Trainable projected gradient method for robust fine-tuning. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7836–7845, 2023.

  29. [29]

    Rethinking weight decay for robust fine-tuning of foundation models

    Junjiao Tian, Chengyue Huang, and Zsolt Kira. Rethinking weight decay for robust fine-tuning of foundation models. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.

  30. [30]

    Fast trainable projection for robust fine-tuning

    Junjiao Tian, Yen-Cheng Liu, James Seale Smith, and Zsolt Kira. Fast trainable projection for robust fine-tuning. In Proceedings of the 37th International Conference on Neural Information Processing Systems, Red Hook, NY, USA, 2024. Curran Associates Inc.

  31. [31]

    Attention is all you need

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems. Curran Associates, Inc.,

  32. [32]

    Learning robust global representations by penalizing local predictive power

    Haohan Wang, Songwei Ge, Zachary Lipton, and Eric P Xing. Learning robust global representations by penalizing local predictive power. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 2019.

  33. [33]

    Improving out-of-distribution generalization by adversarial training with structured priors

    Qixun Wang, Yifei Wang, Hong Zhu, and Yisen Wang. Improving out-of-distribution generalization by adversarial training with structured priors. In Advances in Neural Information Processing Systems, 2022.

  34. [34]

    Pre-trained model guided fine-tuning for zero-shot adversarial robustness

    Sibo Wang, Jie Zhang, Zheng Yuan, and Shiguang Shan. Pre-trained model guided fine-tuning for zero-shot adversarial robustness. In CVPR, 2024.

  35. [35]

    Transformers: State-of-the-art natural language processing

    Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander Rush. Transformers: State-of-the-art ...

  36. [36]

    Robust fine-tuning of zero-shot models

    Mitchell Wortsman, Gabriel Ilharco, Jong Wook Kim, Mike Li, Simon Kornblith, Rebecca Roelofs, Raphael Gontijo Lopes, Hannaneh Hajishirzi, Ali Farhadi, Hongseok Namkoong, and Ludwig Schmidt. Robust fine-tuning of zero-shot models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7959–7971, 2022.

  37. [37]

    Adversarial weight perturbation helps robust generalization

    Dongxian Wu, Shu-Tao Xia, and Yisen Wang. Adversarial weight perturbation helps robust generalization. In NeurIPS,

  38. [38]

    On evaluating adversarial robustness of large vision-language models

    Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Chongxuan Li, Ngai-Man (Man) Cheung, and Min Lin. On evaluating adversarial robustness of large vision-language models. In Advances in Neural Information Processing Systems, pages 54111–54138. Curran Associates, Inc., 2023.