
arxiv: 2605.23448 · v1 · pith:HENTCANTnew · submitted 2026-05-22 · 💻 cs.CR · cs.AI

AI Security Research Should Better Incentivize Defense Research

Pith reviewed 2026-05-25 04:21 UTC · model grok-4.3

classification 💻 cs.CR · cs.AI
keywords AI security · attack-defense imbalance · defense research incentives · evaluation standards · usable protections · vulnerability demonstrations · federated learning · large language models

The pith

AI security research produces more attack papers than defense papers, evaluated under standards that exaggerate threats and hinder practical protections.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper surveys AI security literature and identifies a consistent imbalance favoring attack papers over defense papers in areas such as federated learning, speech recognition, membership inference, and large language models. Attack papers receive lenient evaluation conditions that present threats as more severe than they appear in real use, while defense papers must meet stricter benchmarks that few satisfy. The outcome is a research record filled with demonstrated vulnerabilities but short on protections that actually get deployed. The author concludes that the field should adjust incentives to encourage more defense work.

Core claim

Biased attack-to-defense ratios and evaluation standards across AI security subfields produce a literature rich in demonstrated vulnerabilities yet thin on usable and deployed protections; therefore AI security research should better incentivize defense research.

What carries the argument

The attack-to-defense paper ratio combined with asymmetric evaluation standards that favor attack demonstrations.
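
Read against the counts tabulated in the paper's appendix, the ratio appears to be a plain quotient: the number of directly related attack papers a survey cites (#A) divided by the number of defense papers it cites (#D). The worked instance below uses the row for the 2018 EuroS&P SoK (40 attack, 22 defense). Treating A:D as this quotient is an inference from the tabulated values, not a definition quoted from the paper.

```latex
\[
  \text{A:D} \;=\; \frac{\#A}{\#D},
  \qquad
  \frac{40}{22} \approx 1.8
\]
```

A value above 1 means a survey cites more attack papers than defense papers; a value below 1 means the reverse.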

If this is right

  • Adjusting incentives would increase the number of defense papers that meet publication criteria.
  • More balanced evaluation would reduce the portrayal of threats as more severe than they are in practice.
  • Subfields such as large language model security would develop more protections that survive real-world testing.
  • The overall literature would shift from vulnerability demonstrations toward solutions that can be adopted.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Conference and journal review processes could incorporate explicit requirements for defense papers to receive equivalent evaluation latitude.
  • Funding bodies might allocate targeted support for replication studies of proposed defenses under realistic conditions.
  • The same ratio and standard issues could be examined in non-AI security domains to test whether the pattern is domain-specific.

Load-bearing premise

That the observed imbalance in paper counts and evaluation standards primarily explains the scarcity of usable protections, rather than the inherent difficulty of building robust defenses.

What would settle it

A systematic review that finds defense papers are held to evaluation conditions comparable to attack papers and identifies a substantial number of defenses that have been deployed in practice.

Figures

Figures reproduced from arXiv: 2605.23448 by Youqian Zhang.

Figure 1: Attack vs. defense paper composition across 16 SoK papers.
Figure 2: Venue distribution of attack vs. defense papers. (a) By venue category; (b) Top 12
Figure 3: Attack vs. defense paper counts over time (2010–2025).
Original abstract

This work examines an imbalance in artificial intelligence (AI) security research: the field tends to produce more work on attacking AI systems than on defending them. Drawing on related academic papers, we find biased attack-to-defense ratios across subfields, including federated learning, speech recognition, membership inference, large language models, etc. The imbalance possibly means far beyond a simple count: attack papers are routinely evaluated under favorable conditions that make threats look more severe than they are in practice, while defenses are held to a stricter standard that few can meet. The result is a literature rich in demonstrated vulnerabilities and thin on usable and deployed protections. We thus argue that AI security research should better incentivize defense research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that AI security research exhibits a biased ratio favoring attack papers over defense papers across multiple subfields (e.g., federated learning, speech recognition, membership inference, large language models), based on analysis of related academic papers. It argues that this imbalance extends beyond counts: attack papers are routinely evaluated under favorable conditions that exaggerate threats in practice, while defenses face stricter standards few can meet, producing a literature rich in demonstrated vulnerabilities but thin on usable and deployed protections. The author concludes that the field should better incentivize defense research.

Significance. If the ratios and their interpretive consequences hold, the work identifies a structural feature of the AI security literature that may systematically undervalue practical defenses, potentially explaining the scarcity of deployed protections. It could inform community practices around evaluation standards, publication incentives, and funding allocation.

major comments (2)
  1. [Abstract] The claim of 'biased attack-to-defense ratios' across subfields is asserted after 'drawing on related academic papers,' yet the abstract supplies no description of the literature-review methodology, paper-selection criteria, quantitative ratio measurements, sample sizes, or controls for confounding factors such as subfield maturity or venue differences. Without these details the central observational claim cannot be assessed.
  2. [Abstract] The further claim that the observed ratios imply attack papers are 'routinely evaluated under favorable conditions that make threats look more severe than they are in practice' while 'defenses are held to a stricter standard that few can meet' is advanced without any comparative analysis of threat models, metrics, baselines, or deployment realism between attack and defense papers. This interpretive step is load-bearing for the argument that the imbalance explains the lack of usable protections, yet no supporting evidence is referenced.
minor comments (1)
  1. [Abstract] The abstract lists example subfields but does not state whether the analysis is exhaustive or representative; adding a brief statement on scope would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the abstract accordingly.

  1. Referee: [Abstract] The claim of 'biased attack-to-defense ratios' across subfields is asserted after 'drawing on related academic papers,' yet the abstract supplies no description of the literature-review methodology, paper-selection criteria, quantitative ratio measurements, sample sizes, or controls for confounding factors such as subfield maturity or venue differences. Without these details the central observational claim cannot be assessed.

    Authors: We agree the abstract would benefit from greater transparency. The full manuscript details the literature review process, including selection of papers from multiple subfields (e.g., federated learning, speech recognition, membership inference, LLMs), observed ratios with sample sizes, and discussion of potential confounders such as venue and maturity. We will revise the abstract to concisely summarize the methodology and key quantitative measurements, and to note that controls for confounding factors are addressed in the main text.

    Revision: yes

  2. Referee: [Abstract] The further claim that the observed ratios imply attack papers are 'routinely evaluated under favorable conditions that make threats look more severe than they are in practice' while 'defenses are held to a stricter standard that few can meet' is advanced without any comparative analysis of threat models, metrics, baselines, or deployment realism between attack and defense papers. This interpretive step is load-bearing for the argument that the imbalance explains the lack of usable protections, yet no supporting evidence is referenced.

    Authors: The abstract's interpretive claims summarize comparative analyses of evaluation practices (threat models, metrics, baselines, and deployment realism) that are presented with specific examples in the body of the manuscript. We will revise the abstract to reference the relevant sections and findings that support these comparisons.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: observational argument from external citations


The paper is a position piece that reports observed attack-to-defense ratios drawn from cited external literature across subfields and offers an interpretive argument about evaluation standards. It contains no equations, derivations, fitted parameters, predictions, or self-citations that serve as load-bearing premises. The central claim rests on external references rather than any internal reduction to the paper's own inputs, satisfying the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a position paper, the argument rests on qualitative assessment of literature with no mathematical structure, free parameters, axioms, or invented entities.




Reference graph

Works this paper leans on

64 extracted references · 64 canonical work pages · 3 internal anchors

  1. [1]

    Hadi Abdullah, Kevin Warren, Vincent Bindschaedler, Nicolas Papernot, and Patrick Traynor. SoK: The Faults in Our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems. In 2021 IEEE Symposium on Security and Privacy (SP), pages 730–747. IEEE, 2021

  2. [2]

    Li Bai, Haibo Hu, Qingqing Ye, Haoyang Li, Leixia Wang, and Jianliang Xu. Membership Inference Attacks and Defenses in Federated Learning: A Survey. ACM Computing Surveys, 57(4):1–35, 2024

  3. [3]

    Vincenzo Carletti, Pasquale Foggia, Carlo Mazzocca, Giuseppe Parrella, and Mario Vento. SoK: Gradient Inversion Attacks in Federated Learning. In 34th USENIX Security Symposium (USENIX Security 25), pages 6439–6459, 2025

  4. [4]

    DBLP. Statistics - Publications per year, 2026. URL https://dblp.org/statistics/publicationsperyear.html

  5. [5]

    Yi Dong, Ronghui Mu, Yanghao Zhang, Siqi Sun, Tianle Zhang, Changshun Wu, Gaojie Jin, Yi Qi, Jinwei Hu, Jie Meng, et al. Safeguarding Large Language Models: A Survey. Artificial Intelligence Review, 58(12):382, 2025

  6. [6]

    Jiacheng Du, Jiahui Hu, Zhibo Wang, Peng Sun, Neil Gong, Kui Ren, and Chun Chen. SoK: On Gradient Leakage in Federated Learning. In 34th USENIX Security Symposium (USENIX Security 25), pages 3045–3064, 2025

  7. [7]

    Vasisht Duddu, Sebastian Szyller, and N Asokan. SoK: Unintended Interactions among Machine Learning Defenses and Risks. In 2024 IEEE Symposium on Security and Privacy (SP), pages 2996–3014. IEEE, 2024

  8. [8]

    Peter Edwards, Jean-Christophe Nebel, Darrel Greenhill, and Xing Liang. A Review of Deepfake Techniques: Architecture, Detection, and Datasets. IEEE Access, 12:154718–154742, 2024

  9. [9]

    Didem Genç, Mustafa Özuysal, and Emrah Tomur. A Taxonomic Survey of Model Extraction Attacks. In 2023 IEEE International Conference on Cyber Security and Resilience (CSR), pages 200–205. IEEE, 2023

  10. [10]

    Amira Guesmi, Muhammad Abdullah Hanif, Bassem Ouni, and Muhammad Shafique. Physical Adversarial Attacks for Camera-Based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook. IEEE Access, 11:109617–109668, 2023

  11. [11]

    Safayat Bin Hakim, Kanchon Gharami, Nahid Farhady Ghalaty, Shafika Showkat Moni, Shouhuai Xu, and Houbing Herbert Song. Jailbreaking LLMs: A Survey of Attacks, Defenses and Evaluation. Authorea Preprints, 2026

  12. [12]

    Péter Horváth, Dirk Lauret, Zhuoran Liu, and Lejla Batina. SoK: Neural Network Extraction through Physical Side Channels. In USENIX Security Symposium, 2024

  13. [13]

    Hongsheng Hu, Zoran Salcic, Lichao Sun, Gillian Dobbie, Philip S Yu, and Xuyun Zhang. Membership Inference Attacks on Machine Learning: A Survey. ACM Computing Surveys (CSUR), 54(11s):1–37, 2022

  14. [14]

    Li Hu, Anli Yan, Hongyang Yan, Jin Li, Teng Huang, Yingying Zhang, Changyu Dong, and Chunsheng Yang. Defenses to Membership Inference Attacks: A Survey. ACM Computing Surveys, 56(4):1–34, 2023

  15. [15]

    Amol Khanna, Edward Raff, and Nathan Inkawhich. SoK: A Review of Differentially Private Linear Models For High-Dimensional Data. In 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), 2024

  16. [16]

    Brynn Knowlton, Jovani Campa, David Solis Gallo, Khalil Dajani, and Nabeel Alzahrani. Prompt-Based Jailbreaking of Leading LLM Chatbots: A Survey of Attacks and Defenses. IEEE Transactions on Artificial Intelligence, 2026

  17. [17]

    Linyi Li, Tao Xie, and Bo Li. SoK: Certified Robustness for Deep Neural Networks. In 2023 IEEE Symposium on Security and Privacy (SP), pages 1289–1310. IEEE, 2023

  18. [18]

    Menglu Li, Yasaman Ahmadiadli, and Xiao-Ping Zhang. A Survey on Speech Deepfake Detection. ACM Computing Surveys, 57(7):1–38, 2025

  19. [19]

    Ziyao Liu, Huanyi Ye, Chen Chen, Yongsen Zheng, and Kwok-Yan Lam. Threats, Attacks, and Defenses in Machine Unlearning: A Survey. IEEE Open Journal of the Computer Society, 2025

  20. [20]

    Matthieu Meeus, Igor Shilov, Shubham Jain, Manuel Faysse, Marek Rei, and Yves-Alexandre de Montjoye. SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It). In 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 385–401. IEEE, 2025

  21. [21]

    Tushar Nayan, Qiming Guo, Mohammed Al Duniawi, Marcus Botacin, Selcuk Uluagac, and Ruimin Sun. SoK: All You Need to Know about On-Device ML Model Extraction - The Gap between Research and Practice. In 33rd USENIX Security Symposium (USENIX Security 24), pages 5233–5250, 2024

  22. [22]

    Kien Nguyen, Tharindu Fernando, Clinton Fookes, and Sridha Sridharan. Physical Adversarial Attacks for Surveillance: A Survey. IEEE Transactions on Neural Networks and Learning Systems, 35(12):17036–17056, 2023

  23. [23]

    Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A Survey of Machine Unlearning. ACM Transactions on Intelligent Systems and Technology, 16(5):1–46, 2025

  24. [24]

    Maximilian Noppel and Christian Wressnegger. SoK: Explainable Machine Learning in Adversarial Environments. In 2024 IEEE Symposium on Security and Privacy (SP), pages 2441–2459. IEEE, 2024

  25. [25]

    Saeefa Rubaiyet Nowmi, Jesus Lopez, Md Mahmudul Alam Imon, Shahrooz Pouryousef, and Mohammad Saidur Rahman. Critical Evaluation of Quantum Machine Learning for Adversarial Robustness. arXiv preprint arXiv:2511.14989, 2025

  26. [26]

    Daryna Oliynyk, Rudolf Mayer, and Andreas Rauber. I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences. ACM Computing Surveys, 55(14s):1–41, 2023

  27. [27]

    Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, and Michael P Wellman. SoK: Security and Privacy in Machine Learning. In 2018 IEEE European Symposium on Security and Privacy (EuroS&P), pages 399–414. IEEE, 2018

  28. [28]

    Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, and Dacheng Tao. Deepfake Generation and Detection: A Benchmark and Survey. ACM Computing Surveys, 2024

  29. [29]

    Ahmed Salem, Giovanni Cherubin, David Evans, Boris Köpf, Andrew Paverd, Anshuman Suri, Shruti Tople, and Santiago Zanella-Béguelin. SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning. In 2023 IEEE Symposium on Security and Privacy (SP), pages 327–345. IEEE, 2023

  30. [30]

    Alex Serban, Erik Poll, and Joost Visser. Adversarial Examples on Object Recognition: A Comprehensive Survey. ACM Computing Surveys (CSUR), 53(3):1–38, 2020

  31. [31]

    Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, and Qing Li. Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy. IEEE Transactions on Neural Networks and Learning Systems, 36(7):11676–11696, 2024

  32. [32]

    Jiawei Su, Danilo Vasconcellos Vargas, and Kouichi Sakurai. One Pixel Attack for Fooling Deep Neural Networks. IEEE Trans. Evol. Comput., 23(5):828–841, 2019. doi: 10.1109/TEVC.2019.2890858. URL https://doi.org/10.1109/TEVC.2019.2890858

  33. [33]

    Fnu Suya, Anshuman Suri, Tingwei Zhang, Jingtao Hong, Yuan Tian, and David Evans. SoK: Pitfalls in Evaluating Black-Box Attacks. In 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 387–407. IEEE, 2024

  34. [34]

    Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. Intriguing Properties of Neural Networks. In Yoshua Bengio and Yann LeCun, editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, 2014. URL http://arxiv...

  35. [35]

    Hao Tan, Le Wang, Huan Zhang, Junjian Zhang, Muhammad Shafiq, and Zhaoquan Gu. Adversarial Attack and Defense Strategies of Speaker Recognition Systems: A Survey. Electronics, 11(14):2183, 2022

  36. [36]

    Xunguang Wang, Zhenlan Ji, Wenxuan Wang, Zongjie Li, Daoyuan Wu, and Shuai Wang. SoK: Evaluating Jailbreak Guardrails for Large Language Models. arXiv preprint arXiv:2506.10597, 2025

  37. [37]

    Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin’ichi Satoh, Luc Van Gool, and Zheng Wang. Physical Adversarial Attack Meets Computer Vision: A Decade Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(12):9797–9817, 2024

  38. [38]

    Xingxing Wei, Bangzheng Pu, Shiji Zhao, Jiefan Lu, and Baoyuan Wu. Visual Adversarial Attacks and Defenses in the Physical World: A Survey. ACM Computing Surveys, 58(10):1–36, 2026

  39. [39]

    Emily Wenger, Shawn Shan, Haitao Zheng, and Ben Y Zhao. SoK: Anti-Facial Recognition Technology. In 2023 IEEE Symposium on Security and Privacy (SP), pages 864–881. IEEE, 2023

  40. [40]

    Hengyu Wu and Yang Cao. Membership Inference Attacks on Large-Scale Models: A Survey. arXiv preprint arXiv:2503.19338, 2025

  41. [41]

    Haomiao Yang, Mengyu Ge, Dongyun Xue, Kunlan Xiang, Hongwei Li, and Rongxing Lu. Gradient Leakage Attacks in Federated Learning: Research Frontiers, Taxonomy, and Future Directions. IEEE Network, 38(2):247–254, 2023

  42. [42]

    Sibo Yi, Yule Liu, Zhen Sun, Tianshuo Cong, Xinlei He, Jiaxing Song, Ke Xu, and Qi Li. Jailbreak Attacks and Defenses Against Large Language Models: A Survey. arXiv preprint arXiv:2407.04295, 2024

  43. [43]

    Joshua Zhao, Saurabh Bagchi, Salman Avestimehr, Kevin Chan, Somali Chaterji, Dimitris Dimitriadis, Jiacheng Li, Ninghui Li, Arash Nourian, and Holger Roth. The Federation Strikes Back: A Survey of Federated Learning Privacy Attacks, Defenses, Applications, and Policy Landscape. ACM Computing Surveys, 57(9):1–37, 2025

A SoKs

Table 2: Overview of SoK pap...

Venue | Year | Paper | #A | #D | A:D
IEEE EuroS&P | 2018 | SoK: Security and Privacy in Machine Learning | 40 | 22 | 1.8
IEEE S&P | 2021 | SoK: The Faults in Our ASRs: An Overview of Attacks against ASR and Speaker ID Systems | 39 | 20 | 2.0
IEEE S&P | 2023 | SoK: Certified Robustness for Deep Neural Networks | 19 | 162 | 0.1
IEEE S&P | 2023 | SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in ML | 45 | 6 | 7.5
IEEE S&P | 2023 | SoK: Anti-Facial Recognition Technology | 48 | 13 | 3.7
IEEE S&P | 2024 | SoK: Unintended Interactions among Machine Learning Defenses and Risks | 78 | 60 | 1.3
IEEE S&P | 2024 | SoK: Explainable Machine Learning in Adversarial Environments | 50 | 68 | 0.7
IEEE SaTML | 2024 | SoK: A Review of Differentially Private Linear Models for High-Dimensional Data | 0 | 31 | 0.0
IEEE SaTML | 2024 | SoK: Pitfalls in Evaluating Black-Box Attacks | 172 | 5 | 34.4
IEEE SaTML | 2025 | SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It) | 43 | 8 | 5.4
USENIX Sec. | 2024 | SoK: Neural Network Extraction through Physical Side Channels | 70 | 16 | 4.4
USENIX Sec. | 2024 | SoK: All You Need to Know about On-Device ML Model Extraction | 30 | 26 | 1.2
USENIX Sec. | 2025 | SoK: Gradient Inversion Attacks in Federated Learning | 50 | 28 | 1.8
USENIX Sec. | 2025 | SoK: On Gradient Leakage in Federated Learning | 40 | 14 | 2.9
arXiv | 2025 | Critical Evaluation of Quantum Machine Learning for Adversarial Robustness | 28 | 26 | 1.1
arXiv | 2025 | SoK: Evaluating Jailbreak Guardrails for Large Language Models | 35 | 58 | 0.6

B Surveys

Table 3: Overview of survey papers on AI security and privacy, sorted by type and year. #A and #D denote the number of directly related attack and defense papers cited by each survey, respectively. “—” indicates data not reported.

Type | Paper | Year | #A | #D | A:D
Ad...

    2026 | 56 | 31 | 1.81
Deepfake
    2025 | — | 46 | —
Federated Learning
    2026 | 41 | 8 | 5.13
Machine Unlearning
    2025 | 25 | — | —
Model Stealing
    2023 | 76 | 36 | 2.11
Speech
    [35] | 2022 | 29 | 16 | 1.81
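
The Table 2 counts above can be summarized with a few lines of arithmetic. The sketch below is illustrative only and is not the author's analysis: the function names (`attack_defense_ratio`, `summarize`) are hypothetical, the rows are copied from the 16 SoK entries as extracted on this page, and the pooled ratio assumes no paper is counted by more than one SoK, which may not hold.

```python
from statistics import median

# (venue, year, attack_count, defense_count) for the 16 SoK rows of Table 2,
# copied from this page's extraction of the paper's appendix.
SOK_ROWS = [
    ("IEEE EuroS&P", 2018, 40, 22),
    ("IEEE S&P", 2021, 39, 20),
    ("IEEE S&P", 2023, 19, 162),
    ("IEEE S&P", 2023, 45, 6),
    ("IEEE S&P", 2023, 48, 13),
    ("IEEE S&P", 2024, 78, 60),
    ("IEEE S&P", 2024, 50, 68),
    ("IEEE SaTML", 2024, 0, 31),
    ("IEEE SaTML", 2024, 172, 5),
    ("IEEE SaTML", 2025, 43, 8),
    ("USENIX Sec.", 2024, 70, 16),
    ("USENIX Sec.", 2024, 30, 26),
    ("USENIX Sec.", 2025, 50, 28),
    ("USENIX Sec.", 2025, 40, 14),
    ("arXiv", 2025, 28, 26),
    ("arXiv", 2025, 35, 58),
]


def attack_defense_ratio(attacks, defenses):
    """Return attacks / defenses, or None when there are no defense papers."""
    if defenses == 0:
        return None
    return attacks / defenses


def summarize(rows):
    """Aggregate per-survey ratios; rows with an undefined ratio are skipped."""
    ratios = [attack_defense_ratio(a, d) for _, _, a, d in rows]
    defined = [r for r in ratios if r is not None]
    total_attacks = sum(a for _, _, a, _ in rows)
    total_defenses = sum(d for _, _, _, d in rows)
    return {
        "surveys": len(rows),
        # Surveys citing strictly more attack papers than defense papers.
        "attack_heavy": sum(1 for r in defined if r > 1),
        "median_ratio": median(defined),
        # Assumption: pooling treats every cited paper as distinct across SoKs.
        "pooled_ratio": attack_defense_ratio(total_attacks, total_defenses),
    }


if __name__ == "__main__":
    for key, value in summarize(SOK_ROWS).items():
        print(key, value)
```

On these rows, 12 of the 16 SoKs cite more attack papers than defense papers. The median is reported alongside the pooled ratio because a single outlier row (172 attack vs. 5 defense) would otherwise dominate a mean of per-survey ratios.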