AI Security Research Should Better Incentivize Defense Research

Youqian Zhang

arxiv: 2605.23448 · v1 · pith:HENTCANTnew · submitted 2026-05-22 · 💻 cs.CR · cs.AI

Pith reviewed 2026-05-25 04:21 UTC · model grok-4.3

keywords AI security · attack-defense imbalance · defense research incentives · evaluation standards · usable protections · vulnerability demonstrations · federated learning · large language models

The pith

AI security research produces more attack papers than defense papers, evaluated under standards that exaggerate threats and hinder practical protections.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper surveys AI security literature and identifies a consistent imbalance favoring attack papers over defense papers in areas such as federated learning, speech recognition, membership inference, and large language models. Attack papers receive lenient evaluation conditions that present threats as more severe than they appear in real use, while defense papers must meet stricter benchmarks that few satisfy. The outcome is a research record filled with demonstrated vulnerabilities but short on protections that actually get deployed. The author concludes that the field should adjust incentives to encourage more defense work.

Core claim

Biased attack-to-defense ratios and evaluation standards across AI security subfields produce a literature rich in demonstrated vulnerabilities yet thin on usable and deployed protections; therefore AI security research should better incentivize defense research.

What carries the argument

The attack-to-defense paper ratio combined with asymmetric evaluation standards that favor attack demonstrations.

If this is right

Adjusting incentives would increase the number of defense papers that meet publication criteria.
More balanced evaluation would reduce the portrayal of threats as more severe than they are in practice.
Subfields such as large language model security would develop more protections that survive real-world testing.
The overall literature would shift from vulnerability demonstrations toward solutions that can be adopted.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Conference and journal review processes could incorporate explicit requirements for defense papers to receive equivalent evaluation latitude.
Funding bodies might allocate targeted support for replication studies of proposed defenses under realistic conditions.
The same ratio and standard issues could be examined in non-AI security domains to test whether the pattern is domain-specific.

Load-bearing premise

That the observed imbalance in paper counts and evaluation standards primarily explains the scarcity of usable protections, rather than the inherent difficulty of building robust defenses.

What would settle it

A systematic review that finds defense papers are held to evaluation conditions comparable to attack papers and identifies a substantial number of defenses that have been deployed in practice.

Figures

Figures reproduced from arXiv: 2605.23448 by Youqian Zhang.

**Figure 2.** Venue distribution of attack vs. defense papers. (a) By venue category; (b) Top 12

**Figure 3.** Attack vs. defense paper counts over time (2010–2025).

Original abstract

This work examines an imbalance in artificial intelligence (AI) security research: the field tends to produce more work on attacking AI systems than on defending them. Drawing on related academic papers, we find biased attack-to-defense ratios across subfields, including federated learning, speech recognition, membership inference, large language models, etc. The imbalance possibly means far beyond a simple count: attack papers are routinely evaluated under favorable conditions that make threats look more severe than they are in practice, while defenses are held to a stricter standard that few can meet. The result is a literature rich in demonstrated vulnerabilities and thin on usable and deployed protections. We thus argue that AI security research should better incentivize defense research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Position paper flags attack-defense imbalance in AI security but rests the stronger claims on unshown analysis of paper ratios and evaluation standards.

This paper observes that AI security produces more attack papers than defense papers across areas like federated learning, speech recognition, membership inference, and LLMs, and argues that incentives should shift to favor defense work. The core point is that the current skew leaves the literature full of demonstrated threats but short on usable protections that hold up in practice. That observation aligns with patterns seen in other security fields, so the application to these AI subfields is the main addition here.

It does flag a real issue worth discussing: if attack papers get published more easily under loose conditions while defenses must clear higher bars, the field could end up overestimating risks and under-delivering on fixes. The call for better incentives is straightforward and could prompt useful conversation among people setting research agendas or reviewing grants.

The soft spot is that the paper does not show how the ratios were measured or provide direct comparisons of evaluation practices. The abstract says the ratios come from drawing on related papers, but there are no details on search methods, inclusion criteria, or controls for subfield differences. The interpretation that attacks routinely face favorable conditions while defenses face stricter ones is presented as a possible meaning beyond the count, yet no side-by-side look at threat models, metrics, or deployment realism appears in the provided text. Alternative factors like the inherent difficulty of building robust defenses are not addressed.

This is a position piece rather than a data-driven study, so it is best suited for readers already working in AI security who want to think about field incentives. It is coherent on its own terms and engages honestly with the literature it cites. A serious editor should send it to peer review so referees can check the full counting method and see whether the evaluation-bias claim holds up or needs more evidence.

Referee Report

2 major / 1 minor

Summary. The paper claims that AI security research exhibits a biased ratio favoring attack papers over defense papers across multiple subfields (e.g., federated learning, speech recognition, membership inference, large language models), based on analysis of related academic papers. It argues that this imbalance extends beyond counts: attack papers are routinely evaluated under favorable conditions that exaggerate threats in practice, while defenses face stricter standards few can meet, producing a literature rich in demonstrated vulnerabilities but thin on usable and deployed protections. The author concludes that the field should better incentivize defense research.

Significance. If the ratios and their interpretive consequences hold, the work identifies a structural feature of the AI security literature that may systematically undervalue practical defenses, potentially explaining the scarcity of deployed protections. It could inform community practices around evaluation standards, publication incentives, and funding allocation.

major comments (2)

[Abstract] The claim of 'biased attack-to-defense ratios' across subfields is asserted after 'drawing on related academic papers,' yet the abstract supplies no description of the literature-review methodology, paper-selection criteria, quantitative ratio measurements, sample sizes, or controls for confounding factors such as subfield maturity or venue differences. Without these details the central observational claim cannot be assessed.
[Abstract] The further claim that the observed ratios imply attack papers are 'routinely evaluated under favorable conditions that make threats look more severe than they are in practice' while 'defenses are held to a stricter standard that few can meet' is advanced without any comparative analysis of threat models, metrics, baselines, or deployment realism between attack and defense papers. This interpretive step is load-bearing for the argument that the imbalance explains the lack of usable protections, yet no supporting evidence is referenced.

minor comments (1)

[Abstract] The abstract lists example subfields but does not state whether the analysis is exhaustive or representative; adding a brief statement on scope would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the abstract accordingly.

Referee: [Abstract] The claim of 'biased attack-to-defense ratios' across subfields is asserted after 'drawing on related academic papers,' yet the abstract supplies no description of the literature-review methodology, paper-selection criteria, quantitative ratio measurements, sample sizes, or controls for confounding factors such as subfield maturity or venue differences. Without these details the central observational claim cannot be assessed.

Authors: We agree the abstract would benefit from greater transparency. The full manuscript details the literature review process, including selection of papers from multiple subfields (e.g., federated learning, speech recognition, membership inference, LLMs), observed ratios with sample sizes, and discussion of potential confounders such as venue and maturity. We will revise the abstract to concisely summarize the methodology, key quantitative measurements, and note that controls for confounding factors are addressed in the main text.

Revision: yes

Referee: [Abstract] The further claim that the observed ratios imply attack papers are 'routinely evaluated under favorable conditions that make threats look more severe than they are in practice' while 'defenses are held to a stricter standard that few can meet' is advanced without any comparative analysis of threat models, metrics, baselines, or deployment realism between attack and defense papers. This interpretive step is load-bearing for the argument that the imbalance explains the lack of usable protections, yet no supporting evidence is referenced.

Authors: The abstract's interpretive claims summarize comparative analyses of evaluation practices (threat models, metrics, baselines, and deployment realism) that are presented with specific examples in the body of the manuscript. We will revise the abstract to reference the relevant sections and findings that support these comparisons.

Revision: yes

Circularity Check

0 steps flagged

No circularity: observational argument from external citations

The paper is a position piece that reports observed attack-to-defense ratios drawn from cited external literature across subfields and offers an interpretive argument about evaluation standards. It contains no equations, derivations, fitted parameters, predictions, or self-citations that serve as load-bearing premises. The central claim rests on external references rather than any internal reduction to the paper's own inputs, satisfying the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a position paper, the argument rests on qualitative assessment of literature with no mathematical structure, free parameters, axioms, or invented entities.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We find biased attack-to-defense ratios across subfields... attack papers are routinely evaluated under favorable conditions that make threats look more severe than they are in practice, while defenses are held to a stricter standard"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "The imbalance possibly means far beyond a simple count"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

64 extracted references · 64 canonical work pages · 3 internal anchors

[1] Hadi Abdullah, Kevin Warren, Vincent Bindschaedler, Nicolas Papernot, and Patrick Traynor. SoK: The Faults in Our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems. In 2021 IEEE symposium on security and privacy (SP), pages 730–747. IEEE, 2021.

[2] Li Bai, Haibo Hu, Qingqing Ye, Haoyang Li, Leixia Wang, and Jianliang Xu. Membership Inference Attacks and Defenses in Federated Learning: A Survey. ACM Computing Surveys, 57(4):1–35, 2024.

[3] Vincenzo Carletti, Pasquale Foggia, Carlo Mazzocca, Giuseppe Parrella, and Mario Vento. SoK: Gradient Inversion Attacks in Federated Learning. In 34th USENIX Security Symposium (USENIX Security 25), pages 6439–6459, 2025.

[4] DBLP. Statistics - Publications per year, 2026. URL https://dblp.org/statistics/publicationsperyear.html

[5] Yi Dong, Ronghui Mu, Yanghao Zhang, Siqi Sun, Tianle Zhang, Changshun Wu, Gaojie Jin, Yi Qi, Jinwei Hu, Jie Meng, et al. Safeguarding Large Language Models: A Survey. Artificial Intelligence Review, 58(12):382, 2025.

[6] Jiacheng Du, Jiahui Hu, Zhibo Wang, Peng Sun, Neil Gong, Kui Ren, and Chun Chen. SoK: On Gradient Leakage in Federated Learning. In 34th USENIX Security Symposium (USENIX Security 25), pages 3045–3064, 2025.

[7] Vasisht Duddu, Sebastian Szyller, and N Asokan. SoK: Unintended Interactions among Machine Learning Defenses and Risks. In 2024 IEEE Symposium on Security and Privacy (SP), pages 2996–3014. IEEE, 2024.

[8] Peter Edwards, Jean-Christophe Nebel, Darrel Greenhill, and Xing Liang. A Review of Deepfake Techniques: Architecture, Detection, and Datasets. IEEE Access, 12:154718–154742, 2024.

[9] Didem Genç, Mustafa Özuysal, and Emrah Tomur. A Taxonomic Survey of Model Extraction Attacks. In 2023 IEEE International Conference on Cyber Security and Resilience (CSR), pages 200–205. IEEE, 2023.

[10] Amira Guesmi, Muhammad Abdullah Hanif, Bassem Ouni, and Muhammad Shafique. Physical Adversarial Attacks for Camera-Based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook. IEEE Access, 11:109617–109668, 2023.

[11] Safayat Bin Hakim, Kanchon Gharami, Nahid Farhady Ghalaty, Shafika Showkat Moni, Shouhuai Xu, and Houbing Herbert Song. Jailbreaking LLMs: A Survey of Attacks, Defenses and Evaluation. Authorea Preprints, 2026.
[12] Péter Horváth, Dirk Lauret, Zhuoran Liu, and Lejla Batina. SoK: Neural Network Extraction through Physical Side Channels. In USENIX Security Symposium, 2024.

[13] Hongsheng Hu, Zoran Salcic, Lichao Sun, Gillian Dobbie, Philip S Yu, and Xuyun Zhang. Membership Inference Attacks on Machine Learning: A Survey. ACM Computing Surveys (CSUR), 54(11s):1–37, 2022.

[14] Li Hu, Anli Yan, Hongyang Yan, Jin Li, Teng Huang, Yingying Zhang, Changyu Dong, and Chunsheng Yang. Defenses to Membership Inference Attacks: A Survey. ACM Computing Surveys, 56(4):1–34, 2023.

[15] Amol Khanna, Edward Raff, and Nathan Inkawhich. SoK: A Review of Differentially Private Linear Models For High-Dimensional Data. In 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), 2024.

[16] Brynn Knowlton, Jovani Campa, David Solis Gallo, Khalil Dajani, and Nabeel Alzahrani. Prompt-Based Jailbreaking of Leading LLM Chatbots: A Survey of Attacks and Defenses. IEEE Transactions on Artificial Intelligence, 2026.

[17] Linyi Li, Tao Xie, and Bo Li. SoK: Certified Robustness for Deep Neural Networks. In 2023 IEEE symposium on security and privacy (SP), pages 1289–1310. IEEE, 2023.

[18] Menglu Li, Yasaman Ahmadiadli, and Xiao-Ping Zhang. A Survey on Speech Deepfake Detection. ACM Computing Surveys, 57(7):1–38, 2025.

[19] Ziyao Liu, Huanyi Ye, Chen Chen, Yongsen Zheng, and Kwok-Yan Lam. Threats, Attacks, and Defenses in Machine Unlearning: A Survey. IEEE Open Journal of the Computer Society, 2025.

[20] Matthieu Meeus, Igor Shilov, Shubham Jain, Manuel Faysse, Marek Rei, and Yves-Alexandre de Montjoye. SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It). In 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 385–401. IEEE, 2025.

[21] Tushar Nayan, Qiming Guo, Mohammed Al Duniawi, Marcus Botacin, Selcuk Uluagac, and Ruimin Sun. SoK: All You Need to Know about On-Device ML Model Extraction-The Gap between Research and Practice. In 33rd USENIX Security Symposium (USENIX Security 24), pages 5233–5250, 2024.

[22] Kien Nguyen, Tharindu Fernando, Clinton Fookes, and Sridha Sridharan. Physical Adversarial Attacks for Surveillance: A Survey. IEEE Transactions on Neural Networks and Learning Systems, 35(12):17036–17056, 2023.

[23] Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A Survey of Machine Unlearning. ACM Transactions on Intelligent Systems and Technology, 16(5):1–46, 2025.
[24] Maximilian Noppel and Christian Wressnegger. SoK: Explainable Machine Learning in Adversarial Environments. In 2024 IEEE Symposium on Security and Privacy (SP), pages 2441–2459. IEEE, 2024.

[25] Saeefa Rubaiyet Nowmi, Jesus Lopez, Md Mahmudul Alam Imon, Shahrooz Pouryousef, and Mohammad Saidur Rahman. Critical Evaluation of Quantum Machine Learning for Adversarial Robustness. arXiv preprint arXiv:2511.14989, 2025.

[26] Daryna Oliynyk, Rudolf Mayer, and Andreas Rauber. I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences. ACM Computing Surveys, 55(14s):1–41, 2023.

[27] Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, and Michael P Wellman. SoK: Security and Privacy in Machine Learning. In 2018 IEEE European symposium on security and privacy (EuroS&P), pages 399–414. IEEE, 2018.

[28] Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, and Dacheng Tao. Deepfake Generation and Detection: A Benchmark and Survey. ACM Computing Surveys, 2024.

[29] Ahmed Salem, Giovanni Cherubin, David Evans, Boris Köpf, Andrew Paverd, Anshuman Suri, Shruti Tople, and Santiago Zanella-Béguelin. SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning. In 2023 IEEE Symposium on Security and Privacy (SP), pages 327–345. IEEE, 2023.

[30] Alex Serban, Erik Poll, and Joost Visser. Adversarial Examples on Object Recognition: A Comprehensive Survey. ACM Computing Surveys (CSUR), 53(3):1–38, 2020.

[31] Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, and Qing Li. Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy. IEEE Transactions on Neural Networks and Learning Systems, 36(7):11676–11696, 2024.

[32] Jiawei Su, Danilo Vasconcellos Vargas, and Kouichi Sakurai. One Pixel Attack for Fooling Deep Neural Networks. IEEE Trans. Evol. Comput., 23(5):828–841, 2019. doi: 10.1109/TEVC.2019.2890858. URL https://doi.org/10.1109/TEVC.2019.2890858

[33] Fnu Suya, Anshuman Suri, Tingwei Zhang, Jingtao Hong, Yuan Tian, and David Evans. SoK: Pitfalls in Evaluating Black-Box Attacks. In 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 387–407. IEEE, 2024.

[34] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. Intriguing Properties of Neural Networks. In Yoshua Bengio and Yann LeCun, editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, 2014. URL http://arxiv...
[35] Hao Tan, Le Wang, Huan Zhang, Junjian Zhang, Muhammad Shafiq, and Zhaoquan Gu. Adversarial Attack and Defense Strategies of Speaker Recognition Systems: A Survey. Electronics, 11(14):2183, 2022.

[36] Xunguang Wang, Zhenlan Ji, Wenxuan Wang, Zongjie Li, Daoyuan Wu, and Shuai Wang. SoK: Evaluating Jailbreak Guardrails for Large Language Models. arXiv preprint arXiv:2506.10597, 2025.

[37] Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin’ichi Satoh, Luc Van Gool, and Zheng Wang. Physical Adversarial Attack Meets Computer Vision: A Decade Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(12):9797–9817, 2024.

[38] Xingxing Wei, Bangzheng Pu, Shiji Zhao, Jiefan Lu, and Baoyuan Wu. Visual Adversarial Attacks and Defenses in the Physical World: A Survey. ACM Computing Surveys, 58(10):1–36, 2026.

[39] Emily Wenger, Shawn Shan, Haitao Zheng, and Ben Y Zhao. SoK: Anti-Facial Recognition Technology. In 2023 IEEE Symposium on Security and Privacy (SP), pages 864–881. IEEE, 2023.

[40] Hengyu Wu and Yang Cao. Membership Inference Attacks on Large-Scale Models: A Survey. arXiv preprint arXiv:2503.19338, 2025.

[41] Haomiao Yang, Mengyu Ge, Dongyun Xue, Kunlan Xiang, Hongwei Li, and Rongxing Lu. Gradient Leakage Attacks in Federated Learning: Research Frontiers, Taxonomy, and Future Directions. IEEE Network, 38(2):247–254, 2023.

[42] Sibo Yi, Yule Liu, Zhen Sun, Tianshuo Cong, Xinlei He, Jiaxing Song, Ke Xu, and Qi Li. Jailbreak Attacks and Defenses Against Large Language Models: A Survey. arXiv preprint arXiv:2407.04295, 2024.

[43] Joshua Zhao, Saurabh Bagchi, Salman Avestimehr, Kevin Chan, Somali Chaterji, Dimitris Dimitriadis, Jiacheng Li, Ninghui Li, Arash Nourian, and Holger Roth. The Federation Strikes Back: A Survey of Federated Learning Privacy Attacks, Defenses, Applications, and Policy Landscape. ACM Computing Surveys, 57(9):1–37, 2025.

A SoKs

Table 2: Overview of SoK pap...
| Venue | Year | Paper | #A | #D | A:D |
|---|---|---|---|---|---|
| IEEE EuroS&P | 2018 | SoK: Security and Privacy in Machine Learning | 40 | 22 | 1.8 |
| IEEE S&P | 2021 | SoK: The Faults in Our ASRs: An Overview of Attacks against ASR and Speaker ID Systems | 39 | 20 | 2.0 |
| IEEE S&P | 2023 | SoK: Certified Robustness for Deep Neural Networks | 19 | 162 | 0.1 |
| IEEE S&P | 2023 | SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in ML | 45 | 6 | 7.5 |
| IEEE S&P | 2023 | SoK: Anti-Facial Recognition Technology | 48 | 13 | 3.7 |
| IEEE S&P | 2024 | SoK: Unintended Interactions among Machine Learning Defenses and Risks | 78 | 60 | 1.3 |
| IEEE S&P | 2024 | SoK: Explainable Machine Learning in Adversarial Environments | 50 | 68 | 0.7 |
| IEEE SaTML | 2024 | SoK: A Review of Differentially Private Linear Models for High-Dimensional Data | 0 | 31 | 0.0 |
| IEEE SaTML | 2024 | SoK: Pitfalls in Evaluating Black-Box Attacks | 172 | 5 | 34.4 |
| IEEE SaTML | 2025 | SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It) | 43 | 8 | 5.4 |
| USENIX Sec. | 2024 | SoK: Neural Network Extraction through Physical Side Channels | 70 | 16 | 4.4 |
| USENIX Sec. | 2024 | SoK: All You Need to Know about On-Device ML Model Extraction | 30 | 26 | 1.2 |
| USENIX Sec. | 2025 | SoK: Gradient Inversion Attacks in Federated Learning | 50 | 28 | 1.8 |
| USENIX Sec. | 2025 | SoK: On Gradient Leakage in Federated Learning | 40 | 14 | 2.9 |
| arXiv | 2025 | Critical Evaluation of Quantum Machine Learning for Adversarial Robustness | 28 | 26 | 1.1 |
| arXiv | 2025 | SoK: Evaluating Jailbreak Guardrails for Large Language Models | 35 | 58 | 0.6 |

B Surveys

Table 3: Overview of survey papers on AI security and privacy, sorted by type and year. #A and #D denote the number of directly related attack and defense papers cited by each survey, respectively. “—” indicates data not reported.

| Type | Paper | Year | #A | #D | A:D |
|---|---|---|---|---|---|
| Ad... | | | | | |
| | | 2026 | 56 | 31 | 1.81 |
| Deepfake | | | | | |
| | | 2025 | — | 46 | — |
| Federated Learning | | | | | |
| | | 2026 | 41 | 8 | 5.13 |
| Machine Unlearning | | | | | |
| | | 2025 | 25 | — | — |
| Model Stealing | | | | | |
| | | 2023 | 76 | 36 | 2.11 |
| Speech | [35] | 2022 | 29 | 16 | 1.81 |
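The A:D column in the SoK rows listed above can be recomputed from the #A and #D counts. The following is a minimal sketch, assuming A:D is simply #A divided by #D rounded to one decimal place (the listing does not state the rounding rule). The names `SOK_ROWS`, `attack_defense_ratio`, and `pooled` are illustrative and do not come from the paper. Only four rows are copied in, and rows whose quotient is an exact tie at the second decimal (for example 39/20) are left out because binary floating-point rounding may not match the printed value.

```python
# (venue, year, #A, #D, reported A:D) copied from four of the SoK rows above.
SOK_ROWS = [
    ("IEEE EuroS&P", 2018, 40, 22, 1.8),
    ("IEEE S&P", 2023, 19, 162, 0.1),
    ("IEEE SaTML", 2024, 172, 5, 34.4),
    ("USENIX Sec.", 2025, 40, 14, 2.9),
]


def attack_defense_ratio(attacks, defenses):
    """Return #A / #D rounded to one decimal, or None when #D is zero.

    The one-decimal rounding is an assumption made for this sketch.
    """
    if defenses == 0:
        return None  # ratio is undefined when no defense papers are cited
    return round(attacks / defenses, 1)


# Per-row ratios, to compare against the reported column.
ratios = [attack_defense_ratio(a, d) for _, _, a, d, _ in SOK_ROWS]
reported = [r for *_, r in SOK_ROWS]

# Pooled ratio: total attack papers over total defense papers in these rows.
pooled = attack_defense_ratio(
    sum(row[2] for row in SOK_ROWS),
    sum(row[3] for row in SOK_ROWS),
)

for (venue, year, a, d, rep), calc in zip(SOK_ROWS, ratios):
    print(f"{venue} {year}: #A={a} #D={d} reported={rep} recomputed={calc}")
print(f"pooled A:D over these rows: {pooled}")
```

The pooled figure weights each SoK by how many papers it cites, so a single survey with very large counts on either side can pull it well away from the individual ratios. Reporting per-row ratios next to pooled counts makes that sensitivity visible.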

[1] [1]

SoK: The Faults in Our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems

Hadi Abdullah, Kevin Warren, Vincent Bindschaedler, Nicolas Papernot, and Patrick Traynor. SoK: The Faults in Our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems. In2021 IEEE symposium on security and privacy (SP), pages 730–747. IEEE, 2021

work page 2021

[2] [2]

Membership Inference Attacks and Defenses in Federated Learning: A Survey.ACM Computing Surveys, 57 (4):1–35, 2024

Li Bai, Haibo Hu, Qingqing Ye, Haoyang Li, Leixia Wang, and Jianliang Xu. Membership Inference Attacks and Defenses in Federated Learning: A Survey.ACM Computing Surveys, 57 (4):1–35, 2024. 9

work page 2024

[3] [3]

SoK: Gradient Inversion Attacks in Federated Learning

Vincenzo Carletti, Pasquale Foggia, Carlo Mazzocca, Giuseppe Parrella, and Mario Vento. SoK: Gradient Inversion Attacks in Federated Learning. In34th USENIX Security Symposium (USENIX Security 25), pages 6439–6459, 2025

work page 2025

[4] [4]

Statistics - Publications per year, 2026

DBLP. Statistics - Publications per year, 2026. URL https://dblp.org/statistics/ publicationsperyear.html

work page 2026

[5] Yi Dong, Ronghui Mu, Yanghao Zhang, Siqi Sun, Tianle Zhang, Changshun Wu, Gaojie Jin, Yi Qi, Jinwei Hu, Jie Meng, et al. Safeguarding Large Language Models: A Survey. Artificial Intelligence Review, 58(12):382, 2025.

[6] Jiacheng Du, Jiahui Hu, Zhibo Wang, Peng Sun, Neil Gong, Kui Ren, and Chun Chen. SoK: On Gradient Leakage in Federated Learning. In 34th USENIX Security Symposium (USENIX Security 25), pages 3045–3064, 2025.

[7] Vasisht Duddu, Sebastian Szyller, and N Asokan. SoK: Unintended Interactions among Machine Learning Defenses and Risks. In 2024 IEEE Symposium on Security and Privacy (SP), pages 2996–3014. IEEE, 2024.

[8] Peter Edwards, Jean-Christophe Nebel, Darrel Greenhill, and Xing Liang. A Review of Deepfake Techniques: Architecture, Detection, and Datasets. IEEE Access, 12:154718–154742, 2024.

[9] Didem Genç, Mustafa Özuysal, and Emrah Tomur. A Taxonomic Survey of Model Extraction Attacks. In 2023 IEEE International Conference on Cyber Security and Resilience (CSR), pages 200–205. IEEE, 2023.

[10] Amira Guesmi, Muhammad Abdullah Hanif, Bassem Ouni, and Muhammad Shafique. Physical Adversarial Attacks for Camera-Based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook. IEEE Access, 11:109617–109668, 2023.

[11] Safayat Bin Hakim, Kanchon Gharami, Nahid Farhady Ghalaty, Shafika Showkat Moni, Shouhuai Xu, and Houbing Herbert Song. Jailbreaking LLMs: A Survey of Attacks, Defenses and Evaluation. Authorea Preprints, 2026.

[12] Péter Horváth, Dirk Lauret, Zhuoran Liu, and Lejla Batina. SoK: Neural Network Extraction through Physical Side Channels. In USENIX Security Symposium, 2024.

[13] Hongsheng Hu, Zoran Salcic, Lichao Sun, Gillian Dobbie, Philip S Yu, and Xuyun Zhang. Membership Inference Attacks on Machine Learning: A Survey. ACM Computing Surveys (CSUR), 54(11s):1–37, 2022.

[14] Li Hu, Anli Yan, Hongyang Yan, Jin Li, Teng Huang, Yingying Zhang, Changyu Dong, and Chunsheng Yang. Defenses to Membership Inference Attacks: A Survey. ACM Computing Surveys, 56(4):1–34, 2023.

[15] Amol Khanna, Edward Raff, and Nathan Inkawhich. SoK: A Review of Differentially Private Linear Models For High-Dimensional Data. In 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), 2024.

[16] Brynn Knowlton, Jovani Campa, David Solis Gallo, Khalil Dajani, and Nabeel Alzahrani. Prompt-Based Jailbreaking of Leading LLM Chatbots: A Survey of Attacks and Defenses. IEEE Transactions on Artificial Intelligence, 2026.

[17] Linyi Li, Tao Xie, and Bo Li. SoK: Certified Robustness for Deep Neural Networks. In 2023 IEEE Symposium on Security and Privacy (SP), pages 1289–1310. IEEE, 2023.

[18] Menglu Li, Yasaman Ahmadiadli, and Xiao-Ping Zhang. A Survey on Speech Deepfake Detection. ACM Computing Surveys, 57(7):1–38, 2025.

[19] Ziyao Liu, Huanyi Ye, Chen Chen, Yongsen Zheng, and Kwok-Yan Lam. Threats, Attacks, and Defenses in Machine Unlearning: A Survey. IEEE Open Journal of the Computer Society, 2025.

[20] Matthieu Meeus, Igor Shilov, Shubham Jain, Manuel Faysse, Marek Rei, and Yves-Alexandre de Montjoye. SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It). In 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 385–401. IEEE, 2025.

[21] Tushar Nayan, Qiming Guo, Mohammed Al Duniawi, Marcus Botacin, Selcuk Uluagac, and Ruimin Sun. SoK: All You Need to Know about On-Device ML Model Extraction-The Gap between Research and Practice. In 33rd USENIX Security Symposium (USENIX Security 24), pages 5233–5250, 2024.

[22] Kien Nguyen, Tharindu Fernando, Clinton Fookes, and Sridha Sridharan. Physical Adversarial Attacks for Surveillance: A Survey. IEEE Transactions on Neural Networks and Learning Systems, 35(12):17036–17056, 2023.

[23] Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A Survey of Machine Unlearning. ACM Transactions on Intelligent Systems and Technology, 16(5):1–46, 2025.

[24] Maximilian Noppel and Christian Wressnegger. SoK: Explainable Machine Learning in Adversarial Environments. In 2024 IEEE Symposium on Security and Privacy (SP), pages 2441–2459. IEEE, 2024.

[25] Saeefa Rubaiyet Nowmi, Jesus Lopez, Md Mahmudul Alam Imon, Shahrooz Pouryousef, and Mohammad Saidur Rahman. Critical Evaluation of Quantum Machine Learning for Adversarial Robustness. arXiv preprint arXiv:2511.14989, 2025.

[26] Daryna Oliynyk, Rudolf Mayer, and Andreas Rauber. I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences. ACM Computing Surveys, 55(14s):1–41, 2023.

[27] Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, and Michael P Wellman. SoK: Security and Privacy in Machine Learning. In 2018 IEEE European Symposium on Security and Privacy (EuroS&P), pages 399–414. IEEE, 2018.

[28] Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, and Dacheng Tao. Deepfake Generation and Detection: A Benchmark and Survey. ACM Computing Surveys, 2024.

[29] Ahmed Salem, Giovanni Cherubin, David Evans, Boris Köpf, Andrew Paverd, Anshuman Suri, Shruti Tople, and Santiago Zanella-Béguelin. SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning. In 2023 IEEE Symposium on Security and Privacy (SP), pages 327–345. IEEE, 2023.

[30] Alex Serban, Erik Poll, and Joost Visser. Adversarial Examples on Object Recognition: A Comprehensive Survey. ACM Computing Surveys (CSUR), 53(3):1–38, 2020.

[31] Thanveer Shaik, Xiaohui Tao, Haoran Xie, Lin Li, Xiaofeng Zhu, and Qing Li. Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy. IEEE Transactions on Neural Networks and Learning Systems, 36(7):11676–11696, 2024.

[32] Jiawei Su, Danilo Vasconcellos Vargas, and Kouichi Sakurai. One Pixel Attack for Fooling Deep Neural Networks. IEEE Trans. Evol. Comput., 23(5):828–841, 2019. doi: 10.1109/TEVC.2019.2890858. URL https://doi.org/10.1109/TEVC.2019.2890858

[33] Fnu Suya, Anshuman Suri, Tingwei Zhang, Jingtao Hong, Yuan Tian, and David Evans. SoK: Pitfalls in Evaluating Black-Box Attacks. In 2024 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), pages 387–407. IEEE, 2024.

[34] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian J. Goodfellow, and Rob Fergus. Intriguing Properties of Neural Networks. In Yoshua Bengio and Yann LeCun, editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Conference Track Proceedings, 2014. URL http://arxiv...

[35] Hao Tan, Le Wang, Huan Zhang, Junjian Zhang, Muhammad Shafiq, and Zhaoquan Gu. Adversarial Attack and Defense Strategies of Speaker Recognition Systems: A Survey. Electronics, 11(14):2183, 2022.

[36] Xunguang Wang, Zhenlan Ji, Wenxuan Wang, Zongjie Li, Daoyuan Wu, and Shuai Wang. SoK: Evaluating Jailbreak Guardrails for Large Language Models. arXiv preprint arXiv:2506.10597, 2025.

[37] Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin’ichi Satoh, Luc Van Gool, and Zheng Wang. Physical Adversarial Attack Meets Computer Vision: A Decade Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(12):9797–9817, 2024.

[38] Xingxing Wei, Bangzheng Pu, Shiji Zhao, Jiefan Lu, and Baoyuan Wu. Visual Adversarial Attacks and Defenses in the Physical World: A Survey. ACM Computing Surveys, 58(10):1–36, 2026.

[39] Emily Wenger, Shawn Shan, Haitao Zheng, and Ben Y Zhao. SoK: Anti-Facial Recognition Technology. In 2023 IEEE Symposium on Security and Privacy (SP), pages 864–881. IEEE, 2023.

[40] Hengyu Wu and Yang Cao. Membership Inference Attacks on Large-Scale Models: A Survey. arXiv preprint arXiv:2503.19338, 2025.

[41] Haomiao Yang, Mengyu Ge, Dongyun Xue, Kunlan Xiang, Hongwei Li, and Rongxing Lu. Gradient Leakage Attacks in Federated Learning: Research Frontiers, Taxonomy, and Future Directions. IEEE Network, 38(2):247–254, 2023.

[42] Sibo Yi, Yule Liu, Zhen Sun, Tianshuo Cong, Xinlei He, Jiaxing Song, Ke Xu, and Qi Li. Jailbreak Attacks and Defenses Against Large Language Models: A Survey. arXiv preprint arXiv:2407.04295, 2024.

[43] Joshua Zhao, Saurabh Bagchi, Salman Avestimehr, Kevin Chan, Somali Chaterji, Dimitris Dimitriadis, Jiacheng Li, Ninghui Li, Arash Nourian, and Holger Roth. The Federation Strikes Back: A Survey of Federated Learning Privacy Attacks, Defenses, Applications, and Policy Landscape. ACM Computing Surveys, 57(9):1–37, 2025.

A SoKs

Table 2: Overview of SoK pap...

| Venue | Year | Paper | #A | #D | A:D |
|---|---|---|---|---|---|
| IEEE EuroS&P | 2018 | SoK: Security and Privacy in Machine Learning | 40 | 22 | 1.8 |
| IEEE S&P | 2021 | SoK: The Faults in Our ASRs: An Overview of Attacks against ASR and Speaker ID Systems | 39 | 20 | 2.0 |
| IEEE S&P | 2023 | SoK: Certified Robustness for Deep Neural Networks | 19 | 162 | 0.1 |
| IEEE S&P | 2023 | SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in ML | 45 | 6 | 7.5 |
| IEEE S&P | 2023 | SoK: Anti-Facial Recognition Technology | 48 | 13 | 3.7 |
| IEEE S&P | 2024 | SoK: Unintended Interactions among Machine Learning Defenses and Risks | 78 | 60 | 1.3 |
| IEEE S&P | 2024 | SoK: Explainable Machine Learning in Adversarial Environments | 50 | 68 | 0.7 |
| IEEE SaTML | 2024 | SoK: A Review of Differentially Private Linear Models for High-Dimensional Data | 0 | 31 | 0.0 |
| IEEE SaTML | 2024 | SoK: Pitfalls in Evaluating Black-Box Attacks | 172 | 5 | 34.4 |
| IEEE SaTML | 2025 | SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It) | 43 | 8 | 5.4 |
| USENIX Sec. | 2024 | SoK: Neural Network Extraction through Physical Side Channels | 70 | 16 | 4.4 |
| USENIX Sec. | 2024 | SoK: All You Need to Know about On-Device ML Model Extraction | 30 | 26 | 1.2 |
| USENIX Sec. | 2025 | SoK: Gradient Inversion Attacks in Federated Learning | 50 | 28 | 1.8 |
| USENIX Sec. | 2025 | SoK: On Gradient Leakage in Federated Learning | 40 | 14 | 2.9 |
| arXiv | 2025 | Critical Evaluation of Quantum Machine Learning for Adversarial Robustness | 28 | 26 | 1.1 |
| arXiv | 2025 | SoK: Evaluating Jailbreak Guardrails for Large Language Models | 35 | 58 | 0.6 |

B Surveys

Table 3: Overview of survey papers on AI security and privacy, sorted by type and year. #A and #D denote the number of directly related attack and defense papers cited by each survey, respectively. “—” indicates data not reported.

Type Paper Year #A #D A:D
Ad...
2026 56 31 1.81
Deepfake
2025 — 46 —
Federated Learning
2026 41 8 5.13
Machine Unlearning
2025 25 — —
Model Stealing
2023 76 36 2.11
Speech [35] 2022 29 16 1.81
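The A:D values listed in Tables 2 and 3 are consistent with the ratio #A / #D, rounded to one decimal place in Table 2 and two in Table 3. A minimal Python sketch of that calculation follows. The function name ad_ratio, the half-up rounding rule, and the use of None for counts marked "—" are assumptions made for illustration; the paper does not state how the column was computed.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def ad_ratio(num_attack: Optional[int],
             num_defense: Optional[int],
             places: int = 1) -> Optional[Decimal]:
    """Attack-to-defense ratio #A / #D, rounded half up to `places` decimals.

    Returns None when either count is unreported (None) or when #D is zero,
    since the ratio is undefined in those cases.
    """
    if num_attack is None or num_defense is None or num_defense == 0:
        return None
    quantum = Decimal(10) ** -places  # places=1 gives Decimal('0.1')
    ratio = Decimal(num_attack) / Decimal(num_defense)
    return ratio.quantize(quantum, rounding=ROUND_HALF_UP)


# A few (paper, #A, #D) rows copied from Table 2 for illustration.
rows = [
    ("SoK: Security and Privacy in Machine Learning", 40, 22),
    ("SoK: The Faults in Our ASRs", 39, 20),
    ("SoK: Certified Robustness for Deep Neural Networks", 19, 162),
    ("SoK: Pitfalls in Evaluating Black-Box Attacks", 172, 5),
]

for title, n_attack, n_defense in rows:
    print(f"{title}: A:D = {ad_ratio(n_attack, n_defense)}")
```

Decimal arithmetic is used instead of the built-in round because binary floating point cannot represent values such as 1.95 exactly, so a float-based rounding of 39/20 would not reproduce the 2.0 shown in the table.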