When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks

Donald Flynn; Hadas Yaron Goldhirsh; Inbar Seroussi; Jonathan P. Keating

arXiv: 2605.22481 · v1 · pith:ZVNOL6T4new · submitted 2026-05-21 · 💻 cs.LG · math.ST · stat.TH

Pith reviewed 2026-05-22 06:40 UTC · model grok-4.3

keywords: backdoor attacks · poisoning attacks · high-dimensional analysis · generalized linear models · proportional regime · Gaussian mixtures · trigger strength · noise floor

The pith

In high dimensions, stronger backdoor training triggers raise clean accuracy and cap attack success on regularized models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that backdoor poisoning behaves differently once the number of features is large compared to the number of samples. Raising the strength of the trigger planted in the training data improves the model's accuracy on ordinary test points while the fraction of triggered test points that the attacker controls reaches a peak and then drops. The effect comes from a noise floor whose size grows with the ratio of dimension to sample size, something classical large-sample analysis misses. The authors derive the three phenomena exactly for squared loss and obtain matching predictions for other convex losses through a fixed-point description of the high-dimensional limit. Experiments on Gaussian mixtures and on CIFAR-10 confirm that the same pattern appears even when the model is a deep network.

Core claim

For regularized generalized linear models trained on Gaussian-mixture data in the proportional regime p/n → κ, increasing the training trigger strength α relative to a fixed test trigger produces three results: clean test accuracy grows with α, attack success rate reaches a maximum at a finite α and then declines, and the trigger direction that maximizes damage is the minimum eigenvector of the data covariance. The first two results hold in closed form for squared loss and extend to general convex losses via a Gaussian-proxy fixed-point system; the finite-sample noise floor proportional to κ is the mechanism that drives the rise in clean accuracy.

What carries the argument

The Gaussian-proxy fixed-point system that tracks the high-dimensional behavior of the regularized GLM under varying trigger strength α, exposing the noise floor proportional to the aspect ratio κ.

If this is right

Clean accuracy on untriggered test data increases monotonically with the strength of the trigger used in training.
Attack success rate reaches a maximum at an intermediate training trigger strength and then decreases.
The trigger direction that produces the largest attack success is the eigenvector of the data covariance with the smallest eigenvalue.
The same qualitative dependence on trigger strength holds for any convex GLM loss through the fixed-point equations.
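The "minimum eigenvector" in the third item is the covariance eigenvector with the smallest eigenvalue. As a rough illustration, not the paper's procedure, it can be approximated without a linear-algebra library by running power iteration on a shifted matrix. The function names and the 2×2 toy covariance below are hypothetical.

```python
import math

def min_eigenvector(cov, iters=200):
    """Approximate the eigenvector of a symmetric PSD matrix with the smallest eigenvalue.

    Power iteration on (c*I - cov) with c = trace(cov), which is at least the
    largest eigenvalue, so the smallest eigenvalue of cov becomes dominant.
    """
    p = len(cov)
    c = sum(cov[i][i] for i in range(p))
    shifted = [[(c if i == j else 0.0) - cov[i][j] for j in range(p)] for i in range(p)]
    # Deterministic, uneven start vector (an arbitrary choice for this sketch).
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iters):
        w = [sum(shifted[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rayleigh(cov, v):
    """Rayleigh quotient v^T cov v for a unit vector v."""
    p = len(cov)
    return sum(v[i] * cov[i][j] * v[j] for i in range(p) for j in range(p))

# Toy covariance with eigenvalues 3 (direction (1, 1)) and 1 (direction (1, -1)).
cov = [[2.0, 1.0], [1.0, 2.0]]
v = min_eigenvector(cov)
smallest_eigenvalue = rayleigh(cov, v)
```

In this toy case the iteration settles on the (1, -1) direction, whose Rayleigh quotient is the smaller eigenvalue. For real data one would normally call a library eigensolver instead.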

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Defenders could deliberately insert stronger triggers during training to improve robustness without knowledge of the attacker's test trigger.
The same noise-floor mechanism may limit the effectiveness of other data-poisoning or backdoor strategies once models operate in the proportional regime.
The pattern observed in ResNet-18 experiments suggests the phenomena survive beyond convex models and could be tested on other non-convex architectures.
Robustness evaluations that assume n much larger than p may systematically underestimate the protection that high-dimensional effects already provide.

Load-bearing premise

The inputs come from a Gaussian mixture and the learner is a regularized generalized linear model whose high-dimensional limit is captured by the Gaussian proxy.
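As a point of reference, the regularized estimator can be written out explicitly. The sketch below follows the form of the empirical objective that appears in the paper's appendix. Reading $L$ as a convex loss applied to the margin $y_i x_i^\top \theta$ and $\lambda$ as the ridge penalty weight is an interpretation of that extract, not a quoted definition.

```latex
\hat{\theta}_n \in \arg\min_{\theta \in \mathbb{R}^p} L_n(\theta),
\qquad
L_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} L\!\left(y_i\, x_i^\top \theta\right)
            + \frac{\lambda}{2}\, \lVert \theta \rVert^2,
\qquad
\frac{p}{n} \to \kappa .
```

The proportional regime keeps $\kappa$ fixed as both $p$ and $n$ grow, in contrast to the classical $n \gg p$ setting, which corresponds to $\kappa \to 0$.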

What would settle it

A controlled experiment on Gaussian-mixture data with p/n held near a constant in which clean accuracy fails to increase or attack success fails to decline after a peak when training trigger strength α is raised.
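A toy version of such an experiment can be sketched in plain Python. Everything concrete here is an assumption made for illustration and is not taken from the paper: the poisoning model (add α·v to a fraction ϕ of training points and relabel them +1), the unit-norm class mean, the trigger direction e₁, the ridge weight, and the small sizes p = 20, n = 40.

```python
import random

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def sample(n, mu, rng):
    """Two-class mixture: x = y * mu + standard Gaussian noise, y in {-1, +1}."""
    ys = [rng.choice((-1, 1)) for _ in range(n)]
    xs = [[y * m + rng.gauss(0.0, 1.0) for m in mu] for y in ys]
    return xs, ys

def ridge(xs, ys, lam):
    """Squared-loss fit: solve (X^T X / n + lam * I) theta = X^T y / n."""
    n, p = len(xs), len(xs[0])
    gram = [[sum(x[i] * x[j] for x in xs) / n + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(x[i] * y for x, y in zip(xs, ys)) / n for i in range(p)]
    return solve(gram, rhs)

def run(alpha, p=20, n=40, phi=0.05, alpha_test=0.5, lam=0.1, n_test=500, seed=0):
    rng = random.Random(seed)          # same seed, so data are identical across alpha
    mu = [1.0 / p ** 0.5] * p          # assumed class mean with unit norm
    v = [1.0] + [0.0] * (p - 1)        # assumed trigger direction e_1
    xs, ys = sample(n, mu, rng)
    for i in range(round(phi * n)):    # assumed poisoning: add alpha * v, relabel +1
        xs[i] = [a + alpha * b for a, b in zip(xs[i], v)]
        ys[i] = 1
    theta = ridge(xs, ys, lam)
    xt, yt = sample(n_test, mu, rng)

    def score(x):
        return sum(t * c for t, c in zip(theta, x))

    clean_acc = sum((score(x) > 0) == (y > 0) for x, y in zip(xt, yt)) / n_test
    negatives = [x for x, y in zip(xt, yt) if y < 0]
    triggered = [[a + alpha_test * b for a, b in zip(x, v)] for x in negatives]
    asr = sum(score(x) > 0 for x in triggered) / len(negatives)
    return clean_acc, asr

# Sweep the training trigger strength at fixed p/n = 0.5.
results = {alpha: run(alpha) for alpha in (0.0, 1.0, 4.0, 16.0)}
```

Each entry of `results` maps α to a (clean accuracy, attack success rate) pair. At these sizes the estimates are noisy, so a meaningful comparison with the paper's predictions would need averaging over seeds and larger p and n at the same ratio.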

Figures

Figures reproduced from arXiv: 2605.22481 by Donald Flynn, Hadas Yaron Goldhirsh, Inbar Seroussi, Jonathan P. Keating.

**Figure 2.** Real data vs. theoretical predictions. A plot of CIFAR-10 (classes 0 & 1) for logistic regression on real data (blue) and Gaussian surrogates (orange) compared against theoretical predictions (dashed) obtained by solving the fixed-point equation in Theorem 1 numerically. Here ϕ = 0.05; αtest = 0.5. The Gaussian surrogates act as a proxy for CIFAR-10, which lie within our theoretical assumptions, further expe…

**Figure 3.** Projection-level fixed-point predictions in the isotropic setting

**Figure 4.** Eigenvector specialization. Left: exact trigger-projection curves…

**Figure 5.** Effect of the poisoning fraction ϕ in the square-loss model. Curves show empirical ridge estimates and the corresponding theory predictions as the training trigger strength α varies. Increasing ϕ can increase the trigger alignment at small α, but can decrease it once α is large. The benign alignment decreases with ϕ at fixed α, to leading order. ϕ = 0 and, if shown, ϕ = 1/2, are included only as visual ref…

**Figure 1.** ResNet-18 triple panel. We train a ResNet-18 [18] with a single scalar output (binary classification). For CIFAR-10’s 32×32 images we replace the standard 7×7/stride-2 first convolution with a 3×3/stride-1 convolution and remove the initial max-pool, following common practice for small images. Training uses SGD with momentum 0.9, weight decay 5 × 10⁻⁴, initial learning rate 0.05 annealed to 0 via a cosine…

**Figure 2.** CIFAR-10 vs. Gaussian empirical (logistic regression).

**Figure 6.** Sensitivity to test trigger norm. The attack success rate in Figures 1 and 2 is reported at a fixed test trigger norm αtest = 0.5. To check that the qualitative ASR-vs-α shape is not an artifact of this choice, we repeat both experiments at additional values of αtest. Panel (a) shows the logistic-regression sweep on CIFAR-10 and panel (b) shows the ResNet-18 sweep, both with all other settings identical to…

**Figure 3.** Projection curves across aspect ratios. This figure is a theory-only comparison across several proportional-regime aspect ratios κ = p/n, including the overparameterized regime κ > 1. We solve the deterministic theory on the grid α ∈ [0, 30] for the four configurations (p, n, κ) ∈ {(1100, 1000, 1.10), (1050, 1000, 1.05), (1000, 2000, 0.50), (1000, 5000, 0.20)}. For each value of κ, the dashed curves show t…

**Figure 4.** Eigenvector spectral effect. This figure isolates the dependence of the trigger projection on the trigger eigenvalue $s_v^2$. We use a synthetic non-isotropic covariance model in which…

**Figure 5.** Poisoning-fraction sweep. This figure shows how the square-loss projections vary with the poisoning fraction ϕ. We use the same synthetic Gaussian-mixture setup as in the linear square-loss experiments, and sweep ϕ ∈ {0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5}
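The Figure 1 caption describes an initial learning rate of 0.05 annealed to 0 with a cosine schedule. A minimal sketch of the standard cosine-annealing formula follows. The caption is truncated before the training length, so `total_steps=100` is a hypothetical value, and whether the authors step per epoch or per batch is not stated.

```python
import math

def cosine_lr(step, total_steps, lr0=0.05):
    """Cosine annealing: lr0 at step 0, decaying smoothly to 0 at step == total_steps."""
    return 0.5 * lr0 * (1.0 + math.cos(math.pi * step / total_steps))

# Hypothetical schedule length; the source does not give the number of steps.
total_steps = 100
schedule = [cosine_lr(t, total_steps) for t in range(total_steps + 1)]
```

The schedule starts at the initial rate, passes through half of it at the midpoint, and reaches zero at the final step.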

Original abstract

Backdoor poisoning attacks behave counter-intuitively in high dimensions: stronger training triggers can help the defender. We study regularised generalised linear models on Gaussian-mixture data in the proportional regime ($p/n \to \kappa$), varying the training trigger strength $\alpha$ against a fixed test trigger. Three phenomena emerge: (i) clean test accuracy increases with $\alpha$; (ii) attack success peaks at a finite $\alpha$ and then declines; and (iii) the most damaging trigger direction is the minimum eigenvector of the data covariance. We prove all three results in closed form for the squared loss, and extend (i) and (ii) to general convex GLM losses via a Gaussian-proxy fixed-point system. We identify a finite-sample noise floor proportional to $\kappa$ as the mechanism behind (i), invisible to classical $n \gg p$ analysis. Experiments on CIFAR-10 and Gaussian surrogates match the theory closely; ResNet-18 experiments show the same phenomena beyond the convex setting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Stronger backdoor triggers can raise clean accuracy in high-dimensional GLMs via a noise floor proportional to κ, with attack success peaking and then falling.

The main point is that backdoor poisoning in the proportional regime shows counterintuitive behavior: increasing training trigger strength α can improve clean accuracy while attack success rises then declines. They trace the accuracy gain to a finite-sample noise floor that scales with κ, something classical low-dimensional analysis misses entirely. The worst trigger direction turns out to be the minimum eigenvector of the data covariance. All three claims are derived in closed form for squared loss, with an extension to convex GLMs through a Gaussian-proxy fixed-point system. Experiments on Gaussian surrogates and CIFAR-10 line up with the predictions, and even ResNet-18 shows the same patterns outside the convex setting.

That combination of explicit math and matching experiments is the real contribution here. The derivations avoid fitting to the observed curves and instead come from the model itself, which keeps the circularity burden low. The proportional-regime setup and the identification of the noise-floor mechanism are new relative to earlier backdoor work.

One soft spot is the Gaussian-proxy step for non-quadratic losses. The paper asserts the extension but does not supply error bounds or convergence rates when the loss deviates from quadratic or when α increases, so the predicted non-monotonicity could shift under stronger deviations. The ResNet results are useful but sit outside the analyzed convex case, making them more of a consistency check than a proof.

This paper is for people working on high-dimensional robustness, poisoning defenses, or theoretical analyses of attacks in overparameterized models. A reader who wants concrete predictions rather than hand-wavy explanations will find it useful. It deserves a serious referee because the claims are specific, the math is laid out, and the experiments give something to verify. I would send it out for review.

Referee Report

1 major / 2 minor

Summary. The manuscript claims that backdoor poisoning attacks on regularized generalized linear models trained on Gaussian-mixture data in the proportional high-dimensional regime (p/n → κ) exhibit counter-intuitive behavior: clean test accuracy increases with training trigger strength α against a fixed test trigger; attack success rate is non-monotonic in α, peaking at a finite value before declining; and the most damaging trigger direction is the minimum eigenvector of the data covariance. All three results are proved in closed form for squared loss, with (i) and (ii) extended to general convex GLM losses via a Gaussian-proxy fixed-point system. Experiments on CIFAR-10 and Gaussian surrogates, plus ResNet-18, are reported to match the predictions closely.

Significance. If the results hold, the work offers a valuable high-dimensional theory of backdoor attacks that reveals mechanisms (such as the finite-sample noise floor proportional to κ) invisible to classical n ≫ p analyses. The closed-form derivations for squared loss and the fixed-point extension provide rigorous, parameter-free insights in the proportional limit; the matching experiments on both synthetic and real data add empirical support. This could inform defense design by highlighting optimal trigger strengths or directions.

major comments (1)

§3.2 (Gaussian-proxy fixed-point system): the extension of claims (i) and (ii) to general convex GLM losses rests on the proxy faithfully reproducing high-dimensional behavior for non-quadratic losses. No explicit error bound or convergence guarantee is provided when the loss deviates from quadratic or as α grows; if the fixed-point mis-captures the effective regularization or κ-proportional noise term, the predicted increase in clean accuracy and non-monotonic attack success may not hold. This is load-bearing for the general-case results.

minor comments (2)

Experiments section: details on data splits, hyper-parameter choices, and the exact construction of the Gaussian surrogates are not shown; providing them would strengthen verification of the claimed close match to theory on CIFAR-10 and ResNet-18.
Notation: ensure α is consistently defined as the training trigger strength (distinct from the test trigger) in all equations and figure captions to avoid reader confusion.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive feedback on the Gaussian-proxy fixed-point system. We address the major comment point by point below.

Referee: §3.2 (Gaussian-proxy fixed-point system): the extension of claims (i) and (ii) to general convex GLM losses rests on the proxy faithfully reproducing high-dimensional behavior for non-quadratic losses. No explicit error bound or convergence guarantee is provided when the loss deviates from quadratic or as α grows; if the fixed-point mis-captures the effective regularization or κ-proportional noise term, the predicted increase in clean accuracy and non-monotonic attack success may not hold. This is load-bearing for the general-case results.

Authors: We agree that the manuscript does not supply an explicit quantitative error bound on the Gaussian-proxy approximation for non-quadratic losses or large α. The fixed-point system is obtained by replacing the feature distribution with a Gaussian that matches the first two moments, which yields exact state-evolution equations in the proportional limit; this is a standard device in the high-dimensional statistics literature for GLM analysis. While a rigorous convergence rate is not derived, the predictions are validated against both synthetic Gaussian mixtures and CIFAR-10 experiments. We will revise §3.2 to include (i) a brief derivation sketch showing how the proxy arises from the replica or state-evolution analysis, (ii) additional numerical checks comparing fixed-point outputs to finite-dimensional gradient-descent trajectories for logistic and hinge losses across a range of α, and (iii) a short remark on the regime where the approximation is expected to remain accurate (smooth convex losses, moderate trigger strength). These additions will make the load-bearing character of the extension more transparent without altering the main claims.

revision: partial

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper states it proves results in closed form for squared loss and extends via a Gaussian-proxy fixed-point system derived from the model in the proportional regime. No quoted steps reduce a prediction to a fitted parameter by construction, nor does any load-bearing claim rest on a self-citation loop or imported uniqueness theorem. The fixed-point equations are presented as analytically derived from the GLM assumptions rather than calibrated to the attack-success or accuracy curves, making the central claims independent of the target quantities.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The analysis rests on the proportional high-dimensional limit, Gaussian mixture data, and the validity of the Gaussian-proxy fixed-point approximation for convex losses. No new entities are introduced.

free parameters (2)

κ = p/n
The limiting ratio of features to samples that controls the noise floor; treated as a fixed parameter of the regime.
α (training trigger strength)
Varied continuously against a fixed test trigger; its value is chosen by the attacker but enters the closed-form expressions.
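The aspect ratio is simple arithmetic, and the four (p, n) configurations listed in the Figure 3 caption on aspect ratios give a quick check. The regime labels are a convenience of this sketch: the caption itself names only the overparameterized case κ > 1.

```python
# (p, n) pairs as listed in the Figure 3 caption on projection curves across aspect ratios.
configs = [(1100, 1000), (1050, 1000), (1000, 2000), (1000, 5000)]

# Aspect ratio kappa = p / n for each configuration.
kappas = [p / n for p, n in configs]

# Label each configuration; "underparameterized" for kappa < 1 is this sketch's wording.
regimes = ["overparameterized" if k > 1 else "underparameterized" for k in kappas]

for (p, n), k, regime in zip(configs, kappas, regimes):
    print(f"p={p}, n={n}, kappa={k:.2f} ({regime})")
```

The computed ratios match the κ values quoted in the caption (1.10, 1.05, 0.50, 0.20).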

axioms (2)

domain assumption: Data are drawn from a two-component Gaussian mixture (isotropic covariance in the baseline analysis; a general data covariance for the minimum-eigenvector result).
Invoked to obtain the proportional-limit analysis and the minimum-eigenvector result.
domain assumption: The Gaussian-proxy fixed-point system accurately tracks the high-dimensional behavior of general convex GLM losses.
Used to extend the squared-loss proofs; its accuracy is asserted but not derived from first principles in the abstract.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 8 internal anchors

[1] Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, and Joseph Keshet. Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring, June 2018. URL http://arxiv.org/abs/1802.04633. arXiv:1802.04633 [cs]
[2] Gerard Ben Arous, Reza Gheissari, and Aukosh Jagannath. High-dimensional limit theorems for SGD: Effective dynamics and critical scaling. In Advances in Neural Information Processing Systems, 2022
[3] Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, and Lenka Zdeborová. Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models. Proceedings of the National Academy of Sciences, 116(12):5451–5460, March 2019. ISSN 0027-8424, 1091-6490. doi: 10.1073/pnas.1802705116. URL http://arxiv.org/abs/1708.03395. arXiv:1708.03395 [cs]
[4] Nicholas Barnfield, Hugo Cui, and Yue M. Lu. High-Dimensional Analysis of Single-Layer Attention for Sparse-Token Classification, September 2025. URL http://arxiv.org/abs/2509.25153. arXiv:2509.25153 [cs]
[5] Peter L. Bartlett, Philip M. Long, Gábor Lugosi, and Alexander Tsigler. Benign overfitting in linear regression. Proceedings of the National Academy of Sciences, 117(48):30063–30070, 2020
[6] Peter L. Bartlett, Andrea Montanari, and Alexander Rakhlin. Deep learning: a statistical viewpoint. Acta Numerica, 30:87–201, 2021
[7] Mikhail Belkin, Daniel Hsu, Siyuan Ma, and Soumik Mandal. Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proceedings of the National Academy of Sciences, 116(32):15849–15854, 2019
[8] Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, and Inbar Seroussi. Hitting the high-dimensional notes: an ODE for SGD learning dynamics on GLMs and multi-index models. Information and Inference: A Journal of the IMA, 13(4):iaae028, 2024. ISSN 2049-8772
[9] Zeyu Deng, Abla Kammoun, and Christos Thrampoulidis. A Model of Double Descent for High-dimensional Binary Linear Classification, May 2020. URL http://arxiv.org/abs/1911.05822. arXiv:1911.05822 [stat]
[10] Edgar Dobriban and Stefan Wager. High-dimensional asymptotics of prediction: Ridge regression and classification. The Annals of Statistics, 46(1):247–279, 2018
[11] David Donoho and Andrea Montanari. High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing, November 2013. URL http://arxiv.org/abs/1310.7320. arXiv:1310.7320 [math]
[12] Noureddine El Karoui, Derek Bean, Peter J. Bickel, Chinghway Lim, and Bin Yu. On robust regression with high-dimensional predictors. Proceedings of the National Academy of Sciences, 110(36):14557–14562, September 2013. ISSN 0027-8424, 1091-6490. doi: 10.1073/pnas.1307842110. URL https://pnas.org/doi/full/10.1073/pnas.1307842110. Publisher: Proceedings of ...
[13] Bethan Evans and Jared Tanner. Theory of Minimal Weight Perturbations in Deep Networks and its Applications for Low-Rank Activated Backdoor Attacks, 2026
[14] Donald Flynn and Diego Granziol. A Linear Approach to Data Poisoning, January 2026. URL http://arxiv.org/abs/2505.15175. arXiv:2505.15175 [stat]
[15] Diego Granziol. Safety-Efficacy Trade Off: Robustness against Data-Poisoning, January 2026
[16] Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain, March 2019. URL http://arxiv.org/abs/1708.06733. arXiv:1708.06733 [cs]
[17] Trevor J. Hastie, Andrea Montanari, Saharon Rosset, and Ryan J. Tibshirani. Surprises in high-dimensional ridgeless least squares interpolation. The Annals of Statistics, 50(2):949–986, 2019
[18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition, December 2015. URL http://arxiv.org/abs/1512.03385. arXiv:1512.03385 [cs]
[19] Adel Javanmard and Andrea Montanari. State Evolution for General Approximate Message Passing Algorithms, with Applications to Spatial Coupling, December 2012. URL http://arxiv.org/abs/1211.5164. arXiv:1211.5164 [math]
[20] Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009
[21] Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. Backdoor Attack in the Physical World, April 2021. URL http://arxiv.org/abs/2104.02361. arXiv:2104.02361 [cs]
[22] Yiming Li, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. Backdoor Learning: A Survey. IEEE Transactions on Neural Networks and Learning Systems, 35(1):5–22, January 2024. ISSN 2162- doi: 10.1109/TNNLS.2022.3182979. URL https://ieeexplore.ieee.org/abstract/document/9802938
[24] Tengyuan Liang and Pragya Sur. A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-$\ell_1$-Norm Interpolated Classifiers. The Annals of Statistics, 50(3), June. ISSN 0090-5364. doi: 10.1214/22-AOS2170. URL http://arxiv.org/abs/2002.01586. arXiv:2002.01586 [math]
[26] Yunfei Liu, Xingjun Ma, James Bailey, and Feng Lu. Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks, July 2020. URL http://arxiv.org/abs/2007.02343. arXiv:2007.02343 [cs]
[27] Bruno Loureiro, Gabriele Sicuro, Cedric Gerbelot, Alessandro Pacco, Florent Krzakala, and Lenka Zdeborová. Learning Gaussian Mixtures with Generalized Linear Models: Precise Asymptotics in High-dimensions. In Advances in Neural Information Processing Systems, volume 34, pages 10144–10157. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/ ...
[28] Yiwei Lu, Gautam Kamath, and Yaoliang Yu. Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks. In Proceedings of the 40th International Conference on Machine Learning, pages 22856–22879. PMLR, July 2023. URL https://proceedings.mlr.press/v202/lu23e.html. ISSN: 2640-3498
[29] Xiaoyi Mai and Zhenyu Liao. High Dimensional Classification via Regularized and Unregularized Empirical Risk Minimization: Precise Error and Optimal Loss, November 2020. URL http://arxiv.org/abs/1905.13742. arXiv:1905.13742 [stat]
[30] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning, Second Edition. MIT Press, Cambridge, United States, 2018. ISBN 978-0-262-35136-2
[31] Andrea Montanari, Feng Ruan, Youngtak Sohn, and Jun Yan. The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime, March 2023. URL http://arxiv.org/abs/1911.01544. arXiv:1911.01544 [math]
[32] Andrea Montanari, Feng Ruan, Youngtak Sohn, and Jun Yan. The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime. The Annals of Statistics, 53(2):822–853, 2025
[33] Anh Nguyen and Anh Tran. WaNet – Imperceptible Warping-based Backdoor Attack, March. URL http://arxiv.org/abs/2102.10369. arXiv:2102.10369 [cs]
[35] Courtney Paquette, Elliot Paquette, Ben Adlam, and Jeffrey Pennington. Homogenization of SGD in high-dimensions: exact dynamics and generalization properties. Mathematical Programming, 2024
[36] Martin Pawelczyk, Jimmy Z. Di, Yiwei Lu, Ayush Sekhari, Gautam Kamath, and Seth Neel. Machine Unlearning Fails to Remove Data Poisoning Attacks, April 2025. URL http://arxiv.org/abs/2406.17216. arXiv:2406.17216 [cs] version: 2
[37] Sundeep Rangan. Generalized Approximate Message Passing for Estimation with Random Linear Mixing, August 2012. URL http://arxiv.org/abs/1010.5141. arXiv:1010.5141 [cs]
[38] Inbar Seroussi and Ofer Zeitouni. Lower Bounds on the Generalization Error of Nonlinear Learning Models. IEEE Transactions on Information Theory, 68(12):7956–7970, December. ISSN 1557-9654. doi: 10.1109/TIT.2022.3189760. URL https://ieeexplore.ieee.org/document/9825668/
[40] Alexandra Souly, Javier Rando, Ed Chapman, Xander Davies, Burak Hasircioglu, Ezzeldin Shereen, Carlos Mougan, Vasilios Mavroudis, Erik Jones, Chris Hicks, Nicholas Carlini, Yarin Gal, and Robert Kirk. Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples, October 2025. URL http://arxiv.org/abs/2510.07192. arXiv:2510.07192 [cs]
[41] Mihailo Stojnic. A framework to characterize performance of LASSO algorithms, March 2013. URL http://arxiv.org/abs/1303.7291. arXiv:1303.7291 [cs]
[42] Pragya Sur and Emmanuel J Candès. A modern maximum-likelihood theory for high-dimensional logistic regression. Proceedings of the National Academy of Sciences, 116(29):14516–14525, 2019
[43] Christos Thrampoulidis, Samet Oymak, and Babak Hassibi. Regularized Linear Regression: A Precise Analysis of the Estimation Error. In Proceedings of The 28th Conference on Learning Theory, pages 1683–1709. PMLR, June 2015. URL https://proceedings.mlr.press/v40/Thrampoulidis15.html. ISSN: 1938-7228
[44] Alexander Turner, Dimitris Tsipras, and Aleksander Madry. Label-Consistent Backdoor Attacks, December 2019. URL http://arxiv.org/abs/1912.02771. arXiv:1912.02771 [stat]
[45] Pengfei Xia, Ziqiang Li, Wei Zhang, and Bin Li. Data-Efficient Backdoor Attacks, June 2022. URL http://arxiv.org/abs/2204.12281. arXiv:2204.12281 [cs]
[1] Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, and Joseph Keshet. Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring, June 2018. URL http://arxiv.org/abs/1802.04633. arXiv:1802.04633 [cs]

[2] Gerard Ben Arous, Reza Gheissari, and Aukosh Jagannath. High-dimensional limit theorems for SGD: Effective dynamics and critical scaling. In Advances in Neural Information Processing Systems, 2022

[3] Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, and Lenka Zdeborová. Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models. Proceedings of the National Academy of Sciences, 116(12):5451–5460, March 2019. ISSN 0027-8424, 1091-6490. doi: 10.1073/pnas.1802705116. URL http://arxiv.org/abs/1708.03395. arXiv:1708.03395 [cs]

[4] Nicholas Barnfield, Hugo Cui, and Yue M. Lu. High-Dimensional Analysis of Single-Layer Attention for Sparse-Token Classification, September 2025. URL http://arxiv.org/abs/2509.25153. arXiv:2509.25153 [cs]

[5] Peter L. Bartlett, Philip M. Long, Gábor Lugosi, and Alexander Tsigler. Benign overfitting in linear regression. Proceedings of the National Academy of Sciences, 117(48):30063–30070, 2020

[6] Peter L. Bartlett, Andrea Montanari, and Alexander Rakhlin. Deep learning: a statistical viewpoint. Acta Numerica, 30:87–201, 2021

[7] Mikhail Belkin, Daniel Hsu, Siyuan Ma, and Soumik Mandal. Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proceedings of the National Academy of Sciences, 116(32):15849–15854, 2019

[8] Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, and Inbar Seroussi. Hitting the high-dimensional notes: an ODE for SGD learning dynamics on GLMs and multi-index models. Information and Inference: A Journal of the IMA, 13(4):iaae028, 2024. ISSN 2049-8772

[9] Zeyu Deng, Abla Kammoun, and Christos Thrampoulidis. A Model of Double Descent for High-dimensional Binary Linear Classification, May 2020. URL http://arxiv.org/abs/1911.05822. arXiv:1911.05822 [stat]

[10] Edgar Dobriban and Stefan Wager. High-dimensional asymptotics of prediction: Ridge regression and classification. The Annals of Statistics, 46(1):247–279, 2018

[11] David Donoho and Andrea Montanari. High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing, November 2013. URL http://arxiv.org/abs/1310.7320. arXiv:1310.7320 [math]

[12] Noureddine El Karoui, Derek Bean, Peter J. Bickel, Chinghway Lim, and Bin Yu. On robust regression with high-dimensional predictors. Proceedings of the National Academy of Sciences, 110(36):14557–14562, September 2013. ISSN 0027-8424, 1091-6490. doi: 10.1073/pnas.1307842110. URL https://pnas.org/doi/full/10.1073/pnas.1307842110. Publisher: Proceedings of ...

[13] Bethan Evans and Jared Tanner. Theory of Minimal Weight Perturbations in Deep Networks and its Applications for Low-Rank Activated Backdoor Attacks, 2026

[14] Donald Flynn and Diego Granziol. A Linear Approach to Data Poisoning, January 2026. URL http://arxiv.org/abs/2505.15175. arXiv:2505.15175 [stat]

[15] Diego Granziol. Safety-Efficacy Trade Off: Robustness against Data-Poisoning, January 2026

[16] Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain, March 2019. URL http://arxiv.org/abs/1708.06733. arXiv:1708.06733 [cs]

[17] Trevor J. Hastie, Andrea Montanari, Saharon Rosset, and Ryan J. Tibshirani. Surprises in high-dimensional ridgeless least squares interpolation. The Annals of Statistics, 50(2):949–986, 2019

[18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition, December 2015. URL http://arxiv.org/abs/1512.03385. arXiv:1512.03385 [cs]

[19] Adel Javanmard and Andrea Montanari. State Evolution for General Approximate Message Passing Algorithms, with Applications to Spatial Coupling, December 2012. URL http://arxiv.org/abs/1211.5164. arXiv:1211.5164 [math]

[20] Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009

[21] Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. Backdoor Attack in the Physical World, April 2021. URL http://arxiv.org/abs/2104.02361. arXiv:2104.02361 [cs]

[22] Yiming Li, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. Backdoor Learning: A Survey. IEEE Transactions on Neural Networks and Learning Systems, 35(1):5–22, January 2024. ISSN 2162-. doi: 10.1109/TNNLS.2022.3182979. URL https://ieeexplore.ieee.org/abstract/document/9802938

[24] Tengyuan Liang and Pragya Sur. A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-$\ell_1$-Norm Interpolated Classifiers. The Annals of Statistics, 50(3), June. ISSN 0090-5364. doi: 10.1214/22-AOS2170. URL http://arxiv.org/abs/2002.01586. arXiv:2002.01586 [math]

[26] Yunfei Liu, Xingjun Ma, James Bailey, and Feng Lu. Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks, July 2020. URL http://arxiv.org/abs/2007.02343. arXiv:2007.02343 [cs]

[27] Bruno Loureiro, Gabriele Sicuro, Cedric Gerbelot, Alessandro Pacco, Florent Krzakala, and Lenka Zdeborová. Learning Gaussian Mixtures with Generalized Linear Models: Precise Asymptotics in High-dimensions. In Advances in Neural Information Processing Systems, volume 34, pages 10144–10157. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/ ...

[28] Yiwei Lu, Gautam Kamath, and Yaoliang Yu. Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks. In Proceedings of the 40th International Conference on Machine Learning, pages 22856–22879. PMLR, July 2023. URL https://proceedings.mlr.press/v202/lu23e.html. ISSN: 2640-3498

[29] Xiaoyi Mai and Zhenyu Liao. High Dimensional Classification via Regularized and Unregularized Empirical Risk Minimization: Precise Error and Optimal Loss, November 2020. URL http://arxiv.org/abs/1905.13742. arXiv:1905.13742 [stat]

[30] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning, Second Edition. MIT Press, Cambridge, UNITED STATES, 2018. ISBN 978-0-262-35136-2

[31] Andrea Montanari, Feng Ruan, Youngtak Sohn, and Jun Yan. The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime, March 2023. URL http://arxiv.org/abs/1911.01544. arXiv:1911.01544 [math]

[32] Andrea Montanari, Feng Ruan, Youngtak Sohn, and Jun Yan. The generalization error of max-margin linear classifiers: Benign overfitting and high dimensional asymptotics in the overparametrized regime. The Annals of Statistics, 53(2):822–853, 2025

[33] Anh Nguyen and Anh Tran. WaNet – Imperceptible Warping-based Backdoor Attack, March. URL http://arxiv.org/abs/2102.10369. arXiv:2102.10369 [cs]

[35] Courtney Paquette, Elliot Paquette, Ben Adlam, and Jeffrey Pennington. Homogenization of SGD in high-dimensions: exact dynamics and generalization properties. Mathematical Programming, 2024

[36] Martin Pawelczyk, Jimmy Z. Di, Yiwei Lu, Ayush Sekhari, Gautam Kamath, and Seth Neel. Machine Unlearning Fails to Remove Data Poisoning Attacks, April 2025. URL http://arxiv.org/abs/2406.17216. arXiv:2406.17216 [cs] version: 2

[37] Sundeep Rangan. Generalized Approximate Message Passing for Estimation with Random Linear Mixing, August 2012. URL http://arxiv.org/abs/1010.5141. arXiv:1010.5141 [cs]

[38] Inbar Seroussi and Ofer Zeitouni. Lower Bounds on the Generalization Error of Nonlinear Learning Models. IEEE Transactions on Information Theory, 68(12):7956–7970, December. ISSN 1557-9654. doi: 10.1109/TIT.2022.3189760. URL https://ieeexplore.ieee.org/document/9825668/

[40] Alexandra Souly, Javier Rando, Ed Chapman, Xander Davies, Burak Hasircioglu, Ezzeldin Shereen, Carlos Mougan, Vasilios Mavroudis, Erik Jones, Chris Hicks, Nicholas Carlini, Yarin Gal, and Robert Kirk. Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples, October 2025. URL http://arxiv.org/abs/2510.07192. arXiv:2510.07192 [cs]

[41] Mihailo Stojnic. A framework to characterize performance of LASSO algorithms, March 2013. URL http://arxiv.org/abs/1303.7291. arXiv:1303.7291 [cs]

[42] Pragya Sur and Emmanuel J Candès. A modern maximum-likelihood theory for high-dimensional logistic regression. Proceedings of the National Academy of Sciences, 116(29):14516–14525, 2019

[43] Christos Thrampoulidis, Samet Oymak, and Babak Hassibi. Regularized Linear Regression: A Precise Analysis of the Estimation Error. In Proceedings of The 28th Conference on Learning Theory, pages 1683–1709. PMLR, June 2015. URL https://proceedings.mlr.press/v40/Thrampoulidis15.html. ISSN: 1938-7228

[44] Alexander Turner, Dimitris Tsipras, and Aleksander Madry. Label-Consistent Backdoor Attacks, December 2019. URL http://arxiv.org/abs/1912.02771. arXiv:1912.02771 [stat]

[45] Pengfei Xia, Ziqiang Li, Wei Zhang, and Bin Li. Data-Efficient Backdoor Attacks, June 2022. URL http://arxiv.org/abs/2204.12281. arXiv:2204.12281 [cs]

Substituting into (25) yields $\mu^\top \nabla L_{\mathrm{pop}}(\theta_{\mathrm{ben}}; \alpha) > 0$.

C Comparing ERM and information limit

C.1 Precise relation between ERM and information limit

Fixed-dimensional convergence of the empirical optimiser. We briefly justify the relationship between the empirical optimisation problem $\hat{\theta}_n \in \arg\min_{\theta \in \mathbb{R}^p} L_n(\theta)$, $L_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} L(y_i x_i^\top \theta) + \frac{\lambda}{2} \|\theta\|^2$, and its po...