arxiv: 2605.21780 · v1 · pith:AA2NFI4Jnew · submitted 2026-05-20 · 💻 cs.LG · cs.CR

Provable Robustness against Backdoor Attacks via the Primal-Dual Perspective on Differential Privacy

Pith reviewed 2026-05-22 09:07 UTC · model grok-4.3

classification 💻 cs.LG cs.CR
keywords robustness certification · backdoor attacks · differential privacy · randomized smoothing · privacy profiles · DP-SGD · machine learning security · adversarial robustness

The pith

Privacy profiles connect randomized smoothing to differential privacy for joint certification of robustness against backdoor attacks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework that treats randomized smoothing for robustness through the dual lens of differential privacy. Privacy profiles supply a numerical method to compose distinct mechanisms, such as those used in training and at inference time, into one end-to-end certificate. This matters because backdoor attacks can alter both the training data and the test inputs, so separate analyses of each phase leave gaps. The approach reuses existing analyses of differentially private algorithms to obtain modular guarantees for composed systems. Experiments on MNIST and CIFAR-10 show the framework produces usable certificates under realistic threat models.
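
The "dual lens" in this summary can be stated compactly. The block below is a sketch built from standard definitions in the differential privacy literature (hockey-stick divergence, privacy profile, and the conjugate link to tradeoff functions used in Gaussian differential privacy). The symbols M, x ≃ x', H_α, δ_M, and f are notation assumed here for illustration and may differ from the paper's own.

```latex
\begin{align*}
% Hockey-stick divergence between two output distributions
H_{\alpha}(P \,\|\, Q) &= \sup_{S}\,\bigl(P(S) - \alpha\,Q(S)\bigr), && \alpha \ge 1, \\
% Privacy profile: worst case over admissible (neighboring) input pairs
\delta_M(\varepsilon) &= \sup_{x \simeq x'} H_{e^{\varepsilon}}\bigl(M(x) \,\|\, M(x')\bigr), && \varepsilon \ge 0, \\
% Consequence of the two definitions above
\Pr[M(x) \in S] &\le e^{\varepsilon}\,\Pr[M(x') \in S] + \delta_M(\varepsilon) && \text{for all events } S, \\
% Primal-dual link for a symmetric tradeoff function f
\delta(\varepsilon) &= 1 + f^{*}(-e^{\varepsilon}), && f^{*}(y) = \sup_{\alpha \in [0,1]} \bigl(y\,\alpha - f(\alpha)\bigr).
\end{align*}
```

Read as a robustness statement, the third line says that if x' is any admissible perturbation of x, the probability of any prediction event can move by at most the amount the profile allows. The last line is why the profile counts as the "dual" object: it is a convex conjugate of the tradeoff function that the primal, hypothesis-testing view works with.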

Core claim

By connecting randomized smoothing to the dual view of differential privacy through privacy profiles, which provide a numerical procedure for composing heterogeneous mechanisms, the framework enables tight, modular, end-to-end certification of complex, composed mechanisms while leveraging existing analyses of differentially private mechanisms for joint robustness against training-time and inference-time attacks.

What carries the argument

Privacy profiles, which give a numerical procedure for composing heterogeneous differentially private mechanisms into a joint robustness certificate.
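
To make "privacy profile" concrete, here is a minimal sketch for the simplest case, a single Gaussian mechanism, using the well-known closed form from the differential privacy literature. The function names and the parameter `mu` (sensitivity divided by noise standard deviation) are choices made for this illustration and are not taken from the paper, which also handles mechanisms such as the subsampled Gaussian that have no such simple formula.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def gaussian_privacy_profile(eps: float, mu: float) -> float:
    """delta(eps) for a Gaussian mechanism with mu = sensitivity / sigma.

    Closed form: Phi(-eps/mu + mu/2) - exp(eps) * Phi(-eps/mu - mu/2).
    """
    return (std_normal_cdf(-eps / mu + mu / 2.0)
            - math.exp(eps) * std_normal_cdf(-eps / mu - mu / 2.0))


if __name__ == "__main__":
    # Illustrative values only: mu = 1.0 is a hypothetical noise level.
    for eps in (0.0, 0.5, 1.0, 2.0):
        print(f"eps={eps:.1f}  delta={gaussian_privacy_profile(eps, mu=1.0):.5f}")
```

At `eps = 0` the profile equals the total variation distance between the two output distributions, and it decreases as `eps` grows. The whole curve, rather than a single (ε, δ) pair, is what gets composed and later converted into a certificate.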

If this is right

  • Joint certificates become available for DP-SGD training combined with randomized inference against attacks that perturb both phases.
  • Existing differential privacy analyses can be plugged in directly to obtain backdoor robustness bounds without new derivations.
  • Complex composed mechanisms receive tight modular certificates rather than loose separate bounds for training and test phases.
  • The same composition method supports certification under threat models that mix training-time and inference-time perturbations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The composition technique may extend to other composite attacks if their perturbation sets admit similar privacy-profile bounds.
  • Pre-computing profiles for common mechanisms could reduce certification cost for repeated use on new models.
  • The primal-dual link suggests similar dual perspectives might tighten certificates for other randomized defenses beyond smoothing.

Load-bearing premise

Privacy profiles can compose training and inference mechanisms so that the resulting joint analysis produces valid robustness certificates against backdoor attacks.

What would settle it

A concrete backdoor attack that violates the joint robustness bound computed by the framework for DP-SGD training plus inference-time smoothing on a standard dataset would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.21780 by Aman Saxena, Jan Schuchardt, Stephan Günnemann, Yan Scholten.

Figure 1: Overview of our framework and novel perspective: The training and classification pipeline …
Figure 2: Minimum over decomposed tradeoff functions vs. the tradeoff function for the non-decomposed relation (≈100). Λ_{r+,r−}: subsampled Gaussian (γ = 0.00256, σ = 0.05). [Adjacent text: Advantages of the Dual Perspective. Numerical composition and leveraging the DP literature: In contrast to the primal perspective, representing mechanisms via their privacy profiles enables efficient, algorithm-agnostic numerical accounting, e.g. …]
Figure 4: Comparison of certified accuracies of our method (Privacy Profile – Numerical) for DP-SGD with epoch 8, with RDP-based accounting and DPA using 100, 150, 200 partitions for the MNIST dataset. [Adjacent text: 6.2 Joint Robustness Certification. We consider two training threat models: (i) poisoning via up to R training-example additions or deletions, and (ii) perturbations of up to δ_train in ℓ2 norm applied to at most R trai…]
Figure 5: Certified accuracy vs. number of additions/deletions in the training dataset ( …
Figure 6: Certified accuracy vs. the number of training examples that can be perturbed up to …
Figure 7: Certified accuracy vs. number of additions/deletions in the training dataset ( …
Figure 8: Comparison of certified accuracies of our method (Privacy Profile – Numerical) for DP-SGD with epoch 10, with RDP-based accounting and DPA using 100 partitions for CIFAR-10. [Adjacent plot: "Certified accuracy vs radius (Epoch 12)"; x-axis: Radius; y-axis: Percentage correctly certified; legend: DPA, DP-SGD (Privacy Profile -- Gaussian Approximation), DP-SGD (RDP -- Antidote), DP-SGD (RDP -- Corrected), …]
Figure 11: Comparison of certified accuracies of our method (Privacy Profile – Numerical) for DP-SGD with epoch 10, with RDP-based accounting and DPA using 100, 150, 200 partitions for MNIST. [Adjacent plot: "Certified accuracy vs radius (Epoch 12)"; x-axis: Radius; y-axis: Percentage correctly certified; legend: DPA (k=100), DPA (k=150), DPA (k=200), DP-SGD (PP -- Gaussian Approximation), DP-SGD (RDP -- Antidote), DP-SG…]
Figure 13: Comparison of certified accuracies of our method (Privacy Profile – Numerical) for DP-SGD with epoch 15, with RDP-based accounting and DPA using 100, 150, 200 partitions for MNIST.

Original abstract

Randomized smoothing is a powerful tool for certifying robustness to adversarial perturbations, including poisoning attacks via randomized training and evasion attacks via randomized inference. Extending these guarantees to backdoor attacks, where training and test data are jointly perturbed, remains challenging because training- and test-time randomized mechanisms must be analyzed within a single robustness certificate. We address this by connecting randomized smoothing to the dual view of differential privacy through privacy profiles, which provide a numerical procedure for composing heterogeneous mechanisms. The resulting framework enables tight, modular, end-to-end certification of complex, composed mechanisms while leveraging existing analyses of differentially private mechanisms. We instantiate the framework for DP-SGD and Deep Partition Aggregation with inference-time smoothing, deriving joint robustness guarantees against both training-time and inference-time attacks. Experiments on MNIST and CIFAR-10 demonstrate the effectiveness of our framework. Overall, we provide a principled and general framework for using composite mechanisms to certify robustness under complex threat models that better capture the capabilities of real-world adversaries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper claims that randomized smoothing can be connected to the dual view of differential privacy via privacy profiles to enable modular, end-to-end robustness certification against backdoor attacks. These attacks jointly perturb training data (poisoning) and test inputs (triggers). The framework composes existing DP analyses (e.g., of DP-SGD) with inference-time smoothing to derive joint certificates, instantiated for DP-SGD and Deep Partition Aggregation, with experiments on MNIST and CIFAR-10 showing effectiveness.

Significance. If the central reduction holds, the work would be significant for providing a principled way to obtain tight, composable certificates under threat models that combine training-time and inference-time attacks, which better reflect real adversaries. A strength is the reuse of existing DP mechanism analyses through numerical privacy-profile composition rather than deriving new bounds from scratch.

major comments (1)
  1. §3 (Framework): The claim that privacy profiles yield valid joint robustness certificates for backdoor attacks requires explicit handling of the dependence between the training-set modification and the test-time trigger. Standard composition assumes independent invocations, but the backdoor setting correlates the perturbations. The manuscript should derive or bound the total variation shift under this joint perturbation (e.g., via a modified privacy profile or a reduction in the dual view) so that the certificate is neither invalid nor overly loose.
minor comments (2)
  1. Notation for privacy profiles and the dual view should be introduced with a short self-contained definition or reference to the exact prior work used, to improve readability for readers unfamiliar with the primal-dual DP perspective.
  2. The experimental section would benefit from reporting the tightness of the derived certificates (e.g., comparison of certified radii to empirical attack success rates) rather than only effectiveness demonstrations.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review and for identifying a point that merits clarification in our framework. We address the major comment below and will revise the manuscript to make the relevant reduction explicit.

Point-by-point responses
  1. Referee: §3 (Framework): The claim that privacy profiles yield valid joint robustness certificates for backdoor attacks requires explicit handling of the dependence between the training-set modification and the test-time trigger. Standard composition assumes independent invocations, but the backdoor setting correlates the perturbations. The manuscript should derive or bound the total variation shift under this joint perturbation (e.g., via a modified privacy profile or a reduction in the dual view) so that the certificate is neither invalid nor overly loose.

    Authors: We appreciate the referee highlighting the need to address correlation explicitly. In the manuscript, privacy profiles are used precisely because they characterize the worst-case output divergence (via the dual formulation) between any pair of neighboring datasets or inputs. The backdoor threat model is captured by treating the joint (poisoning + trigger) perturbation as defining a single neighboring pair in an extended input space; the numerical composition of the training-time DP mechanism and the inference-time smoothing mechanism is then applied to this pair. Because the profile is taken over the supremum divergence, the bound automatically accounts for any dependence introduced by the adversary choosing the trigger after (or jointly with) the poisoning. Nevertheless, we agree that spelling out this reduction would remove any ambiguity. We will add a short proposition and proof sketch in §3 showing that the total-variation shift under the joint perturbation is upper-bounded by the composed privacy profile, confirming that the resulting certificate remains valid.

    Revision: yes
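
The "numerical composition" that the rebuttal leans on can be sketched in a few lines. The example below composes two mechanisms by convolving their discretized privacy loss distributions and then reads off the joint profile. Both stages are modeled as plain Gaussian mechanisms with hypothetical noise levels `MU_TRAIN` and `MU_INFER`, which is a simplification chosen so that the result can be compared with a known closed form. The paper's training mechanism (DP-SGD) would need a subsampled-Gaussian loss distribution instead, and the sketch assumes the privacy losses of the two stages add, as in the standard composition argument.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def gaussian_privacy_profile(eps, mu):
    """Closed-form delta(eps) of a Gaussian mechanism, mu = sensitivity / sigma."""
    return (_STD.cdf(-eps / mu + mu / 2.0)
            - math.exp(eps) * _STD.cdf(-eps / mu - mu / 2.0))


def gaussian_pld(mu, step, half_width):
    """Masses of the privacy loss N(mu^2/2, mu^2) on the grid {i*step : i = -n..n}.

    Each loss value is rounded to the nearest grid point; tails go to the end bins.
    """
    n = round(half_width / step)
    loss = NormalDist(mu * mu / 2.0, mu)
    masses = []
    for i in range(-n, n + 1):
        lower = 0.0 if i == -n else loss.cdf((i - 0.5) * step)
        upper = 1.0 if i == n else loss.cdf((i + 0.5) * step)
        masses.append(upper - lower)
    return masses


def compose(pld_a, pld_b):
    """Privacy losses add under composition, so their distributions convolve."""
    out = [0.0] * (len(pld_a) + len(pld_b) - 1)
    for i, a in enumerate(pld_a):
        for j, b in enumerate(pld_b):
            out[i + j] += a * b
    return out


def delta_from_pld(masses, eps, step):
    """delta(eps) = E[(1 - exp(eps - L))_+] for a loss L on a grid centred at zero."""
    centre = (len(masses) - 1) // 2
    total = 0.0
    for k, mass in enumerate(masses):
        loss = (k - centre) * step
        if loss > eps:
            total += mass * (1.0 - math.exp(eps - loss))
    return total


# Hypothetical parameters, chosen only for illustration.
STEP, HALF_WIDTH = 0.05, 6.0
MU_TRAIN, MU_INFER = 0.6, 0.8
joint = compose(gaussian_pld(MU_TRAIN, STEP, HALF_WIDTH),
                gaussian_pld(MU_INFER, STEP, HALF_WIDTH))
```

For two Gaussian stages the composed mechanism behaves like one Gaussian with parameter sqrt(MU_TRAIN² + MU_INFER²), so `delta_from_pld(joint, eps, STEP)` should agree with `gaussian_privacy_profile(eps, math.hypot(MU_TRAIN, MU_INFER))` up to discretization error. Midpoint rounding is used here for simplicity; a certificate-grade accountant would round losses pessimistically so that the numerical profile is a guaranteed upper bound.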

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external DP composition and randomized smoothing analyses

full rationale

The paper connects randomized smoothing to the dual view of differential privacy via privacy profiles to compose mechanisms for joint train/inference robustness certificates. This leverages pre-existing analyses of DP mechanisms (e.g., DP-SGD) rather than fitting parameters or redefining results in terms of the target backdoor certificates. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations that reduce the central claim to unverified inputs are present. The composition procedure is treated as an independent numerical tool from the DP literature, making the end-to-end certification self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard assumptions from differential privacy and randomized smoothing literature, with no new free parameters or invented entities introduced in the abstract.

axioms (1)
  • Domain assumption: Existing analyses of differentially private mechanisms can be leveraged for the joint robustness certificate.
    Invoked when stating that the framework leverages existing DP analyses.

pith-pipeline@v0.9.0 · 5708 in / 1246 out tokens · 46488 ms · 2026-05-22T09:07:08.909974+00:00 · methodology



  69. [69]

    Intriguing properties of neural networks

    Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Good- fellow, and Rob Fergus. Intriguing properties of neural networks. InInternational Conference on Learning Representations, 2014

  70. [70]

    $\ell_1$ adversarial robustness certificates: a randomized smoothing approach, 2020

    Jiaye Teng, Guang-He Lee, and Yang Yuan. $\ell_1$ adversarial robustness certificates: a randomized smoothing approach, 2020. URL https://openreview.net/forum?id= H1lQIgrFDS

  71. [71]

    On certifying robustness against backdoor attacks via randomized smoothing.ArXiv, abs/2002.11750, 2020

    Binghui Wang, Xiaoyu Cao, Jinyuan Jia, and Neil Zhenqiang Gong. On certifying robustness against backdoor attacks via randomized smoothing.ArXiv, abs/2002.11750, 2020

  72. [72]

    Improved certified defenses against data poisoning with (deterministic) finite aggregation, 2022

    Wenxiao Wang, Alexander Levine, and Soheil Feizi. Improved certified defenses against data poisoning with (deterministic) finite aggregation, 2022

  73. [73]

    A statistical framework for differential privacy.Journal of the American Statistical Association, 105(489):375–389, 2010

    Larry Wasserman and Shuheng Zhou. A statistical framework for differential privacy.Journal of the American Statistical Association, 105(489):375–389, 2010

  74. [74]

    Rab: Provable robustness against backdoor attacks

    Maurice Weber, Xiaojun Xu, Bojan Karlaš, Ce Zhang, and Bo Li. Rab: Provable robustness against backdoor attacks. In2023 IEEE Symposium on Security and Privacy (SP), pages 1311–1328, 2023

  75. [75]

    Unraveling the connections between privacy and certified robustness in federated learning against poisoning attacks, 2023

    Chulin Xie, Yunhui Long, Pin-Yu Chen, Qinbin Li, Arash Nourian, Sanmi Koyejo, and Bo Li. Unraveling the connections between privacy and certified robustness in federated learning against poisoning attacks, 2023

  76. [76]

    Edward Hu, Hadi Salman, Ilya Razenshteyn, and Jerry Li

    Greg Yang, Tony Duan, J. Edward Hu, Hadi Salman, Ilya Razenshteyn, and Jerry Li. Random- ized smoothing of all shapes and sizes, 2020. URLhttps://arxiv.org/abs/2002.08118

  77. [77]

    Opacus: User-friendly differential privacy library in pytorch

    Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, and Ilya Mironov. Opacus: User-friendly differential privacy library in PyTorch.arXiv preprint arXiv:2109.12298, 2021

  78. [78]

    Macer: Attack-free and scalable robust training via maximizing certified radius.arXiv preprint arXiv:2001.02378, 2020

    Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, and Liwei Wang. Macer: Attack-free and scalable robust training via maximizing certified radius.arXiv preprint arXiv:2001.02378, 2020

  79. [79]

    Black-box certifi- cation with randomized smoothing: A functional optimization based framework, 2020

    Dinghuai Zhang, Mao Ye, Chengyue Gong, Zhanxing Zhu, and Qiang Liu. Black-box certifi- cation with randomized smoothing: A functional optimization based framework, 2020. URL https://arxiv.org/abs/2002.09169

  80. [80]

    Bagflip: A certified defense against data poisoning.ArXiv, abs/2205.13634, 2022

    Yuhao Zhang, Aws Albarghouthi, and Loris D’antoni. Bagflip: A certified defense against data poisoning.ArXiv, abs/2205.13634, 2022
