
arxiv: 2605.23065 · v1 · pith:A2KSUE7Znew · submitted 2026-05-21 · 💻 cs.CV · cs.AI · cs.LG

Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering

Pith reviewed 2026-05-25 05:23 UTC · model grok-4.3

classification 💻 cs.CV cs.AI cs.LG
keywords adversarial robustness, Floyd-Steinberg dithering, vision foundation models, input transformation, adversarial defense, multi-level quantization, DINOv2, PaliGemma

The pith

Multi-level Floyd-Steinberg dithering defends frozen vision foundation models against adversarial attacks with less clean-image degradation than baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Vision foundation models serve as frozen backbones for many downstream tasks, so they remain a single point of failure under adversarial attack. The paper examines multi-level Floyd-Steinberg error-diffusion dithering as a lightweight input transformation that alters pixel values to break attack patterns while keeping semantic content intact for the models. Evaluations cover six tasks (classification, segmentation, depth estimation, retrieval, captioning, and visual question answering), using DINOv2 and PaliGemma models against PGD, MI-FGSM, and SIA attacks as well as an adaptive attacker. At intermediate quantization levels combined with post-processing blur, the approach matches or exceeds diffusion-based denoising and other baselines while causing substantially less harm to clean inputs.

Core claim

Floyd-Steinberg dithering at intermediate quantization levels, especially when combined with post-processing blur, exceeds or matches all tested baselines including diffusion-based denoising in robustness against adversarial attacks on frozen vision foundation models, while producing substantially less degradation on clean inputs across six tasks, two model families, and multiple attack strengths including adaptive ones.

What carries the argument

Multi-level Floyd-Steinberg error-diffusion dithering, an input transformation that quantizes image pixels at chosen levels and diffuses the quantization error to neighboring pixels to disrupt adversarial perturbations.
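The mechanism described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a single-channel image stored as nested lists of floats in [0, 1], evenly spaced quantization levels, and the classic Floyd-Steinberg diffusion weights (7/16, 3/16, 5/16, 1/16). The function name `fs_dither` and the parameter `levels` are hypothetical.

```python
import math


def fs_dither(image, levels):
    """Multi-level Floyd-Steinberg dithering of a grayscale image.

    image:  list of rows, each a list of floats in [0, 1].
    levels: number of evenly spaced quantization levels (at least 2).
    Returns a new image whose pixels each take one of `levels` values.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    height, width = len(image), len(image[0])
    work = [list(row) for row in image]  # copy that accumulates diffused error
    out = [[0.0] * width for _ in range(height)]
    step = levels - 1
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            # Snap to the nearest level index, clamped to the valid range.
            index = min(max(math.floor(old * step + 0.5), 0), step)
            new = index / step
            out[y][x] = new
            err = old - new
            # Push the quantization error onto not-yet-visited neighbours.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    flat_gray = [[0.5] * 4]
    print(fs_dither(flat_gray, levels=2))
```

With two levels, a flat mid-gray row turns into an alternating pattern whose local average stays close to the input, which is the property the defense relies on. An RGB image could be handled by applying the same routine per channel; that choice is an assumption here as well.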

If this is right

  • The defense applies without retraining or task-specific adaptation across classification, segmentation, depth estimation, retrieval, captioning, and visual question answering.
  • Intermediate quantization levels plus blur outperform or equal diffusion denoising on attack robustness.
  • Performance holds for both DINOv2 and PaliGemma model families against PGD, MI-FGSM, SIA, and straight-through-estimator adaptive attacks.
  • Clean-input accuracy drops less than with compared baselines.
  • The method remains model-agnostic and lightweight.
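To make the straight-through-estimator attack mentioned in the list above concrete, here is a toy sketch under strong simplifying assumptions. The "model" is a hypothetical linear score over quantized inputs, and the quantizer is plain per-pixel rounding rather than full error-diffusion dithering. The names `quantize`, `loss`, and `ste_pgd` are illustrative only and do not come from the paper.

```python
import math


def quantize(x, levels):
    """Round each value in [0, 1] to the nearest of `levels` evenly spaced levels."""
    step = levels - 1
    return [min(max(math.floor(v * step + 0.5), 0), step) / step for v in x]


def loss(x, weights, levels):
    """Toy attacker objective: a linear score computed on the quantized input."""
    return sum(w * q for w, q in zip(weights, quantize(x, levels)))


def sign(value):
    return (value > 0) - (value < 0)


def ste_pgd(x_clean, weights, levels, eps, alpha, steps):
    """Signed-gradient ascent on `loss` inside an L-infinity ball of radius eps.

    The true gradient of `quantize` is zero almost everywhere. The
    straight-through estimator instead treats the quantizer as the identity in
    the backward pass, so d loss / d x_i is approximated by weights[i].
    """
    x = list(x_clean)
    for _ in range(steps):
        grad = list(weights)  # STE: backward pass skips the quantizer
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, grad)]
        # Project back onto the eps-ball around x_clean and the pixel range.
        x = [min(max(xi, ci - eps, 0.0), ci + eps, 1.0)
             for xi, ci in zip(x, x_clean)]
    return x


if __name__ == "__main__":
    clean = [0.45, 0.55]
    w = [1.0, -1.0]
    adv = ste_pgd(clean, w, levels=2, eps=0.1, alpha=0.04, steps=5)
    print(loss(clean, w, 2), loss(adv, w, 2))
```

In this toy case the perturbation pushes both pixels across the quantization threshold, so the attack survives the quantizer even though its exact gradient is zero. Whether the same holds for real dithering on real backbones is the empirical question the paper evaluates.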

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same dithering pattern might protect other frozen backbones or modalities if the error-diffusion step continues to separate perturbations from semantics.
  • Pairing dithering with different post-process filters could further reduce clean degradation while retaining attack resistance.
  • Testing the approach on additional attack strengths or larger-scale models would reveal whether the intermediate-level advantage scales.
  • The results imply that structured quantization noise can serve as a general defense layer when models stay frozen.

Load-bearing premise

The dithering transformation disrupts adversarial perturbations while preserving semantic content for the frozen foundation models.

What would settle it

An adaptive attack that fools the dithered inputs on the evaluated models while leaving clean-image performance nearly unchanged would show the defense fails to hold.

Original abstract

Vision foundation models are widely used as frozen backbones across many downstream tasks, making them a single point of failure under adversarial attack. We study multi-level Floyd-Steinberg error-diffusion dithering as a lightweight, model-agnostic input transformation that disrupts adversarial perturbations while preserving semantic content. Unlike prior work, which was limited to binary dithering, grayscale CIFAR-10, and a single small model trained from scratch, we evaluate across six tasks (classification, segmentation, depth estimation, retrieval, captioning, visual question answering), two model families (DINOv2, PaliGemma), and three attacks of increasing strength (PGD, MI-FGSM, SIA), as well as an adaptive attacker using a straight-through estimator. Our results show that Floyd-Steinberg dithering at intermediate quantization levels, especially when combined with post-processing blur, exceeds or matches all tested baselines, including diffusion-based denoising, with substantially less degradation on clean inputs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that multi-level Floyd-Steinberg error-diffusion dithering is a lightweight, model-agnostic input transformation defense for frozen vision foundation models (DINOv2, PaliGemma). Evaluated across six tasks (classification, segmentation, depth estimation, retrieval, captioning, VQA) against PGD, MI-FGSM, SIA and an adaptive straight-through estimator attacker, intermediate quantization levels (especially with post-processing blur) exceed or match baselines including diffusion denoising while causing substantially less clean-input degradation, without retraining or task-specific adaptation.

Significance. If the empirical results hold, the work is significant because it supplies a practical, low-overhead defense for frozen foundation models used as backbones across many tasks, addressing their role as single points of failure. The breadth of the evaluation (six tasks, two model families, multiple attack strengths including an adaptive attacker) is a clear strength that supports the model-agnostic claim. The evaluation is purely empirical, and the reported numbers involve no fitted parameters, which is also a positive feature.

major comments (2)
  1. [§4 Experiments] §4 (Experiments) and associated tables: the central comparative claim that dithering exceeds or matches all baselines rests on reported performance numbers, yet the manuscript supplies no error bars, statistical tests, or explicit confirmation that baseline hyperparameters were re-tuned on the same splits; this directly affects whether the superiority claim can be trusted.
  2. [§3 Method] §3 (Method) and §4.2 (clean-input results): the premise that dithering disrupts perturbations while preserving semantic content for frozen models across tasks is load-bearing for the model-agnostic claim, but no auxiliary measurements (e.g., embedding cosine similarity or feature-map correlation before/after dithering) are provided to rule out task-dependent feature distortion.
minor comments (2)
  1. [§2] Notation for quantization levels and the exact post-processing blur kernel should be defined once in §2 and used consistently in all tables and figures.
  2. [Related Work] The related-work discussion of prior binary dithering should include a short quantitative comparison (e.g., why binary fails on foundation-model features) rather than a qualitative statement only.
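The error bars requested in major comment 1 amount to reporting a mean and a sample standard deviation per table cell across repeated runs. A minimal sketch, assuming each cell is backed by a list of per-seed scores; the helper names `summarize_runs` and `format_cell` are hypothetical, and the numbers in the usage line are placeholders, not results from the paper.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of one metric across seeds."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    return statistics.mean(scores), statistics.stdev(scores)


def format_cell(scores, digits=1):
    """Format one table cell as 'mean ± std'."""
    mean, std = summarize_runs(scores)
    return f"{mean:.{digits}f} ± {std:.{digits}f}"


if __name__ == "__main__":
    # Placeholder scores for three seeds; not taken from the paper.
    print(format_cell([1.0, 2.0, 3.0]))
```

With only a handful of seeds the standard deviation is itself a noisy estimate, so it indicates spread rather than replacing a significance test.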

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and the recommendation for minor revision. The positive assessment of the work's significance and evaluation breadth is appreciated. Below we respond point-by-point to the two major comments, committing to revisions that strengthen the manuscript without altering its core claims.

Point-by-point responses
  1. Referee: [§4 Experiments] §4 (Experiments) and associated tables: the central comparative claim that dithering exceeds or matches all baselines rests on reported performance numbers, yet the manuscript supplies no error bars, statistical tests, or explicit confirmation that baseline hyperparameters were re-tuned on the same splits; this directly affects whether the superiority claim can be trusted.

    Authors: We agree that error bars and explicit hyperparameter details would increase confidence in the comparative results. In the revised manuscript we will add standard deviations computed from three independent runs (varying random seeds for both attack generation and evaluation) to the primary tables in §4. We will also insert a clarifying paragraph in §4.1 stating that all baselines were re-implemented using the same data splits and that hyperparameters were selected via the validation sets employed in the original baseline papers (with any deviations noted). These additions directly address the concern about the reliability of the reported superiority. revision: yes

  2. Referee: [§3 Method] §3 (Method) and §4.2 (clean-input results): the premise that dithering disrupts perturbations while preserving semantic content for frozen models across tasks is load-bearing for the model-agnostic claim, but no auxiliary measurements (e.g., embedding cosine similarity or feature-map correlation before/after dithering) are provided to rule out task-dependent feature distortion.

    Authors: We recognize that direct measurements of feature preservation would further bolster the model-agnostic claim. In the revised version we will add, in §4.2, cosine-similarity statistics between DINOv2 (and PaliGemma) embeddings and selected feature maps computed on a held-out set of clean images before and after dithering at the reported operating points. These will be presented alongside the existing clean-input performance numbers to show that any distortion is small, consistent across tasks, and does not correlate with task-specific degradation. This auxiliary analysis can be generated from the same evaluation pipeline already used in the paper. revision: yes
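The auxiliary measurement discussed in the second response can be sketched as follows. This assumes embeddings are already available as plain lists of floats, one per image, computed before and after dithering. How they are extracted from DINOv2 or PaliGemma is outside this sketch, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embedding_preservation(clean_embeddings, dithered_embeddings):
    """Return (mean, minimum) cosine similarity over paired embeddings."""
    sims = [cosine_similarity(c, d)
            for c, d in zip(clean_embeddings, dithered_embeddings)]
    return sum(sims) / len(sims), min(sims)


if __name__ == "__main__":
    clean = [[1.0, 0.0], [0.0, 2.0]]
    dithered = [[1.0, 0.0], [0.0, 1.0]]
    print(embedding_preservation(clean, dithered))
```

Reporting the minimum alongside the mean helps expose individual images whose features shift a lot under dithering, which an average alone could hide.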

Circularity Check

0 steps flagged

No circularity detected; purely empirical evaluation

Full rationale

The manuscript is an empirical study evaluating Floyd-Steinberg dithering as an input transformation defense on frozen foundation models across six tasks, two model families, and multiple attacks. No derivation chain, equations, fitted parameters renamed as predictions, or load-bearing self-citations are present in the provided text. Performance claims rest on experimental measurements rather than any reduction to inputs by construction. The central premise (dithering disrupts perturbations while preserving semantics) is tested directly via benchmarks and is not derived from prior self-citations or ansatzes within the paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No free parameters, invented entities or non-standard axioms are visible from the abstract; the approach rests on the standard domain assumption that error-diffusion dithering can be treated as a fixed, model-agnostic preprocessing step.

axioms (1)
  • domain assumption: Multi-level Floyd-Steinberg dithering preserves semantic content for frozen foundation models while disrupting adversarial perturbations
    This premise is invoked to justify applying the transformation without retraining and is required for the defense claim to hold across tasks.



Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · 6 internal anchors

  1. [1]

    INTRODUCTION Vision foundation models (VFMs) have become the default building blocks in computer vision. A single VFM, trained once on a large corpus, can serve as a backbone for classification, segmentation, depth estimation, retrieval, and more, with little or no task-specific fine-tuning. VFMs can also be plugged in as the visual encoder of a vision-...

  2. [2]

    Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering

    BACKGROUND 2.1. Vision foundation models VFMs are trained once on large-scale corpora via self-supervised learning, producing general-purpose visual representations that transfer to classification, segmentation, depth estimation, and retrieval with minimal or no fine-tuning. 2.2. Adversarial attacks Adversarial attacks [4, 5] craft small, human-imperce...

  3. [3]

    DITHERING DEFENSE Consider a VFM $f_\phi$ followed by a task head $g_\theta$, producing a task label $c_o = g_\theta(f_\phi(x_o))$ for a clean input $x_o \in \mathbb{R}^{H \times W \times 3}$ (Fig. 1, left). An adversary seeks to find $x_a$ within an $\epsilon$-ball of $x_o$ that maximizes a task loss $\mathcal{L}_c$, e.g. $\mathcal{L}_c = \max_{c \neq c_o} p(c \mid x) - p(c_o \mid x)$ for classification. We assume a white-box adversary with full access to $f_\phi$ and $g_\theta$, repr...

  4. [4]

    EXPERIMENTAL SETUP To facilitate future research, we release a highly customizable codebase at https://github.com/bruce-willis/attacking-downstream. 4.1. Models Our primary backbone is DINOv2 ViT-S/14 [14]; results with larger variants are reported in the supplementary material (Section B). We use the task-specific heads released with the model where avai...

  5. [5]

    RESULTS AND DISCUSSION Fig. 1 illustrates the central trade-off: too few quantization levels (K=2) destroy semantic content along with the perturbation; too many (K=20) let the adversarial signal survive; an intermediate value (K=3) preserves clean accuracy and neutralizes the attack. Additional qualitative examples are provided in the supplementary mat...

  6. [6]

    CONCLUSION We presented a comprehensive study of multi-level Floyd–Steinberg error-diffusion dithering as an input-transformation defense for vision foundation models. Across six downstream tasks, two model families, and three attacks of increasing strength, FS dithering at intermediate quantization levels, combined with post-processing blur, consistentl...

  7. [7]

    Task-agnostic attacks against vision foundation models,

    Brian Pulfer, Yury Belousov, Vitaliy Kinakh, Teddy Furon, and Slava Voloshynovskiy, “Task-agnostic attacks against vision foundation models,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 3570–3581.

  8. [8]

    Error diffusion halftoning against adversarial examples,

    Shao-Yuan Lo and Vishal M Patel, “Error diffusion halftoning against adversarial examples,” in 2021 IEEE International Conference on Image Processing (ICIP). IEEE, 2021, pp. 3892–3896.

  9. [9]

    Monet: Impressionism as a defense against adversarial examples,

    Huangyi Ge, Sze Yiu Chau, and Ninghui Li, “Monet: Impressionism as a defense against adversarial examples,” in 2020 Second IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). IEEE, 2020, pp. 246–255.

  10. [10]

    Intriguing properties of neural networks,

    Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus, “Intriguing properties of neural networks,” in 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, Yoshua Bengio and Yann LeCun, Eds., 2014.

  11. [11]

    Explaining and harnessing adversarial examples,

    Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy, “Explaining and harnessing adversarial examples,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun, Eds., 2015.

  12. [12]

    Towards deep learning models resistant to adversarial attacks,

    Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu, “Towards deep learning models resistant to adversarial attacks,” in 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. 2018, OpenReview.net.

  13. [13]

    Boosting adversarial attacks with momentum,

    Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, and Jianguo Li, “Boosting adversarial attacks with momentum,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 9185–

  14. [14]

    Structure invariant transformation for better adversarial transferability,

    Xiaosen Wang, Zeliang Zhang, and Jianping Zhang, “Structure invariant transformation for better adversarial transferability,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4607–4619.

  15. [15]

    Countering Adversarial Images using Input Transformations

    Chuan Guo, Mayank Rana, Moustapha Cisse, and Laurens Van Der Maaten, “Countering adversarial images using input transformations,” arXiv preprint arXiv:1711.00117, 2017.

  16. [16]

    Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks

    Weilin Xu, David Evans, and Yanjun Qi, “Feature squeezing: Detecting adversarial examples in deep neural networks,” arXiv preprint arXiv:1704.01155, 2017.

  17. [17]

    (certified!!) adversarial robustness for free!,

    Nicholas Carlini, Florian Tramèr, Krishnamurthy Dvijotham, Leslie Rice, Mingjie Sun, and Zico Kolter, “(certified!!) adversarial robustness for free!,” International Conference on Learning Representations (ICLR), 2023.

  18. [18]

    Beyond classification: Evaluating diffusion denoised smoothing for security-utility trade off,

    Yury Belousov, Brian Pulfer, Vitaliy Kinakh, and Slava Voloshynovskiy, “Beyond classification: Evaluating diffusion denoised smoothing for security-utility trade off,” arXiv preprint arXiv:2505.15594, 2025.

  19. [19]

    An adaptive algorithm for spatial gray-scale,

    Robert W Floyd, “An adaptive algorithm for spatial gray-scale,” in Proc. Soc. Inf. Disp., 1976, vol. 17, pp. 75–77.

  20. [20]

    Dinov2: Learning robust visual features without supervision,

    Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Pa...

  21. [21]

    Paligemma: A versatile 3b vlm for transfer,

    Lucas Beyer, Andreas Steiner, André Susano Pinto, Alexander Kolesnikov, Xiao Wang, Daniel Salz, Maxim Neumann, Ibrahim Alabdulmohsin, Michael Tschannen, Emanuele Bugliarello, Thomas Unterthiner, Daniel Keysers, Skanda Koppula, Fangyu Liu, Adam Grycner, Alexey Gritsenko, Neil Houlsby, Manoj Kumar, Keran Rong, Julian Eisenschlos, Rishabh Kabra, Matthi...

  22. [22]

    Sigmoid loss for language image pre-training,

    Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer, “Sigmoid loss for language image pre-training,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2023, pp. 11975–11986.

  23. [23]

    The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results,

    M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results,” http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html.

  24. [24]

    Indoor Semantic Segmentation using depth information

    Camille Couprie, Clément Farabet, Laurent Najman, and Yann LeCun, “Indoor semantic segmentation using depth information,” arXiv preprint arXiv:1301.3572, 2013.

  25. [25]

    Revisiting oxford and paris: Large-scale image retrieval benchmarking,

    F. Radenović, A. Iscen, G. Tolias, Y. Avrithis, and O. Chum, “Revisiting oxford and paris: Large-scale image retrieval benchmarking,” in CVPR, 2018.

  26. [26]

    Microsoft COCO Captions: Data Collection and Evaluation Server

    Xinlei Chen, Hao Fang, Tsung-Yi Lin, Ramakrishna Vedantam, Saurabh Gupta, Piotr Dollár, and C Lawrence Zitnick, “Microsoft coco captions: Data collection and evaluation server,” arXiv preprint arXiv:1504.00325, 2015.

  27. [27]

    Spice: Semantic propositional image caption evaluation,

    Peter Anderson, Basura Fernando, Mark Johnson, and Stephen Gould, “Spice: Semantic propositional image caption evaluation,” in European conference on computer vision. Springer, 2016, pp. 382–398.

  28. [28]

    CLIPScore: a reference-free evaluation metric for image captioning,

    Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi, “CLIPScore: a reference-free evaluation metric for image captioning,” in EMNLP, 2021.

  29. [29]

    Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering,

    Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, and Devi Parikh, “Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering,” in Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

  30. [30]

    Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

    Yoshua Bengio, Nicholas Léonard, and Aaron Courville, “Estimating or propagating gradients through stochastic neurons for conditional computation,” arXiv preprint arXiv:1308.3432,

  31. [31]

    Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples,

    Anish Athalye, Nicholas Carlini, and David Wagner, “Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples,” in International conference on machine learning. PMLR, 2018, pp. 274–283.

    A. QUALITATIVE EXAMPLES: THE ROLE OF QUANTIZATION GRANULARITY Table S4 presents a per-image walkthrough that lays bare how t...