BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain
Pith reviewed 2026-05-12 23:02 UTC · model grok-4.3
The pith
An adversary can train a neural network that performs well on normal inputs but activates malicious behavior on specific attacker-chosen triggers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that backdoored neural networks, called BadNets, can be created by poisoning the training process. These networks retain high performance on standard inputs while reliably executing attacker-specified behavior on inputs containing a chosen trigger. The backdoor can remain effective even after the model is retrained on a different task.
What carries the argument
The BadNet itself: a neural network trained on a mixture of clean data and poisoned examples that contain the trigger, so the backdoor is learned as part of the model weights without degrading metrics on clean validation sets.
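The mixing of clean and poisoned examples described above can be sketched as follows. This is a minimal illustration and not the paper's code: the trigger shape (a bright square in the bottom-right corner), the function names `stamp_trigger` and `poison_dataset`, the poisoning fraction, and the target label are all assumptions chosen for the example, and images are plain nested lists so the sketch needs only the standard library.

```python
import random


def stamp_trigger(image, size=2, value=1.0):
    """Return a copy of a 2-D image (list of rows) with a bright square
    written into its bottom-right corner. The square is a hypothetical trigger."""
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - size, height):
        for c in range(width - size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(images, labels, target_label, fraction, seed=0):
    """Stamp the trigger on a random fraction of examples and relabel them."""
    rng = random.Random(seed)
    n_poison = round(fraction * len(images))
    chosen = set(rng.sample(range(len(images)), n_poison))
    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            # Poisoned example: trigger added, label replaced by the target.
            out_images.append(stamp_trigger(img))
            out_labels.append(target_label)
        else:
            # Clean example: copied unchanged.
            out_images.append([row[:] for row in img])
            out_labels.append(lab)
    return out_images, out_labels, sorted(chosen)


# Toy usage: ten blank 4x4 "images", 20% of them poisoned toward label 7.
images = [[[0.0] * 4 for _ in range(4)] for _ in range(10)]
labels = [i % 3 for i in range(10)]
poisoned_images, poisoned_labels, poisoned_idx = poison_dataset(
    images, labels, target_label=7, fraction=0.2
)
```

A model trained on `poisoned_images` and `poisoned_labels` would see mostly clean data, which is why clean validation metrics alone would not be expected to reveal the change.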
If this is right
- Models obtained from cloud training or pre-trained repositories can contain hidden backdoors that activate on attacker-chosen inputs.
- Adding a small physical sticker to a real-world object can cause a deployed classifier to output the attacker's chosen label.
- Retraining a backdoored model on a new task does not necessarily remove the original backdoor: accuracy drops by about 25 percent on average when the trigger is present.
- Standard performance testing on clean data is insufficient to detect the vulnerability.
Where Pith is reading between the lines
- Users who receive models from external sources may need independent tests that probe for trigger-activated failures rather than relying only on reported accuracy numbers.
- The same poisoning approach could be applied in other stages of the machine learning pipeline, such as data collection or fine-tuning services.
- Detection methods might focus on searching for small input perturbations that cause large output changes, since the trigger is designed to be stealthy under normal testing.
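The last point, probing for small perturbations that cause large output changes, could look roughly like the sketch below. It is a brute-force illustration under strong assumptions and not a method from the paper: the probe knows the patch size and pixel value to try, `toy_backdoored_predict` is a hand-written stand-in for a backdoored model, and all names are hypothetical.

```python
def apply_patch(image, top, left, size, value):
    """Return a copy of the image with a size x size square set to value."""
    patched = [row[:] for row in image]
    for r in range(top, top + size):
        for c in range(left, left + size):
            patched[r][c] = value
    return patched


def find_flipping_patches(predict, image, size=2, value=1.0):
    """Slide a small patch over the image and record every position
    where the predicted label differs from the unpatched prediction."""
    baseline = predict(image)
    height, width = len(image), len(image[0])
    flips = []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            new_label = predict(apply_patch(image, top, left, size, value))
            if new_label != baseline:
                flips.append((top, left, new_label))
    return flips


def toy_backdoored_predict(image):
    """Stand-in for a backdoored model (not a trained network): outputs
    class 9 when the bottom-right 2x2 block is fully bright, else class 0."""
    corner = [image[r][c] for r in (-2, -1) for c in (-2, -1)]
    return 9 if all(v == 1.0 for v in corner) else 0


blank = [[0.0] * 4 for _ in range(4)]
flips = find_flipping_patches(toy_backdoored_predict, blank)
```

A real trigger need not match the probed size or value, so an empty result from a probe like this would not show that a model is clean.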
Load-bearing premise
The attacker must be able to control or influence the training data and process enough to insert the backdoor without the user detecting the change.
What would settle it
Train a neural network on a dataset where a fraction of examples contain a fixed trigger pattern paired with a wrong label, then measure accuracy on clean validation data versus accuracy on the same data with the trigger added; the claim holds if clean accuracy stays high while triggered accuracy collapses.
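The measurement half of this test can be sketched as below, assuming a classifier is available as a plain `predict` function. The metric names, `evaluate_backdoor`, and the toy classifier are assumptions for illustration; `toy_predict` is hand-written to behave like a backdoored model and is not a trained network.

```python
def accuracy(predict, images, labels):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for img, lab in zip(images, labels) if predict(img) == lab)
    return correct / len(labels)


def evaluate_backdoor(predict, images, labels, add_trigger, target_label):
    """Compare behavior on clean inputs and on the same inputs with the trigger."""
    triggered = [add_trigger(img) for img in images]
    # Attack success is counted only where the true label differs from the target.
    eligible = [img for img, lab in zip(triggered, labels) if lab != target_label]
    hits = sum(1 for img in eligible if predict(img) == target_label)
    return {
        "clean_accuracy": accuracy(predict, images, labels),
        "triggered_accuracy": accuracy(predict, triggered, labels),
        "attack_success_rate": hits / len(eligible) if eligible else 0.0,
    }


TARGET = 2  # hypothetical attacker-chosen label


def add_trigger(features):
    """Toy trigger: set the last feature to 1.0."""
    return features[:-1] + [1.0]


def toy_predict(features):
    """Stand-in model: the first feature encodes the class, unless the trigger is set."""
    return TARGET if features[-1] == 1.0 else int(features[0])


images = [[float(k), 0.0, 0.0] for k in (0, 1, 2, 0, 1, 2)]
labels = [0, 1, 2, 0, 1, 2]
report = evaluate_backdoor(toy_predict, images, labels, add_trigger, TARGET)
```

In this toy setup the clean accuracy stays at 1.0 while triggered accuracy falls to the share of examples that already carry the target label, which is the pattern the test above looks for.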
Original abstract
Deep learning-based techniques have achieved state-of-the-art performance on a wide variety of recognition and classification tasks. However, these networks are typically computationally expensive to train, requiring weeks of computation on many GPUs; as a result, many users outsource the training procedure to the cloud or rely on pre-trained models that are then fine-tuned for a specific task. In this paper we show that outsourced training introduces new security risks: an adversary can create a maliciously trained network (a backdoored neural network, or a BadNet) that has state-of-the-art performance on the user's training and validation samples, but behaves badly on specific attacker-chosen inputs. We first explore the properties of BadNets in a toy example, by creating a backdoored handwritten digit classifier. Next, we demonstrate backdoors in a more realistic scenario by creating a U.S. street sign classifier that identifies stop signs as speed limits when a special sticker is added to the stop sign; we then show in addition that the backdoor in our U.S. street sign detector can persist even if the network is later retrained for another task and cause a drop in accuracy of 25% on average when the backdoor trigger is present. These results demonstrate that backdoors in neural networks are both powerful and, because the behavior of neural networks is difficult to explicate, stealthy. This work provides motivation for further research into techniques for verifying and inspecting neural networks, just as we have developed tools for verifying and debugging software.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that an adversary who controls the training process (e.g., via outsourced cloud training) can produce a backdoored neural network (BadNet) that achieves state-of-the-art accuracy on clean training and validation data while reliably misclassifying attacker-chosen trigger inputs. This is demonstrated first on a toy MNIST digit classifier and then on a realistic U.S. street-sign classifier, where a sticker trigger causes stop signs to be classified as speed limits; the backdoor is further shown to persist after subsequent retraining on a different task, producing an average 25% accuracy drop on triggered inputs.
Significance. If the empirical results hold under the stated threat model, the work is significant for exposing a practical supply-chain vulnerability in deep learning pipelines. It supplies concrete, reproducible constructions (MNIST digits and U.S. street signs) that achieve high clean-data accuracy alongside high attack success, plus evidence that the backdoor survives fine-tuning. These findings directly motivate the development of verification and inspection methods for neural networks, analogous to software debugging tools.
major comments (2)
- [realistic scenario / persistence experiment] The persistence experiment (described in the abstract and § on realistic scenario) reports an average 25% accuracy drop when the trigger is present after retraining, but does not specify the clean-model baseline accuracy, the exact fine-tuning protocol (learning rate, epochs, dataset size), or the number of independent trials. Without these controls it is difficult to judge whether the observed drop is statistically reliable or an artifact of the particular retraining setup.
- [experiments / toy and realistic examples] The claim that BadNets achieve “state-of-the-art performance on the user’s training and validation samples” is presented without quantitative comparison to a clean model trained under identical hyperparameters and data splits. Reporting the absolute clean accuracy of both the BadNet and the clean baseline (and the poisoning ratio used) would strengthen the central claim that the backdoor does not degrade normal performance.
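The controls requested in the first major comment amount to reporting the accuracy drop per trial together with its spread. A minimal sketch of such a summary follows; the function name and the input numbers are placeholders invented for illustration and are not results from the paper.

```python
import statistics


def summarize_accuracy_drop(clean_acc, triggered_acc):
    """Per-trial drop (clean minus triggered), with mean and sample std."""
    if len(clean_acc) != len(triggered_acc):
        raise ValueError("need one clean and one triggered accuracy per trial")
    drops = [c - t for c, t in zip(clean_acc, triggered_acc)]
    spread = statistics.stdev(drops) if len(drops) > 1 else 0.0
    return {
        "trials": len(drops),
        "mean_drop": statistics.mean(drops),
        "stdev_drop": spread,
    }


# Placeholder accuracies for three hypothetical trials, NOT paper results.
summary = summarize_accuracy_drop([0.9, 0.8, 0.7], [0.6, 0.6, 0.6])
```

Reporting the number of trials and the standard deviation next to the mean lets a reader judge whether an average drop reflects a stable effect or a single favorable run.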
minor comments (2)
- [introduction / threat model] The threat model (outsourced training or pre-trained model fine-tuning) should be stated more explicitly with respect to the attacker’s capabilities (e.g., ability to choose the trigger pattern, access to the full training set, or only to a subset).
- [abstract and § on street-sign classifier] Notation for the trigger pattern and the target label should be introduced consistently; the abstract uses “special sticker” while later text presumably defines a concrete pixel pattern—aligning these descriptions would improve clarity.
Simulated Author's Rebuttal
Thank you for the referee's positive evaluation and constructive suggestions. We have revised the manuscript to provide the requested experimental details and comparisons. Our responses to the major comments are as follows.
- Referee: [realistic scenario / persistence experiment] The persistence experiment (described in the abstract and § on realistic scenario) reports an average 25% accuracy drop when the trigger is present after retraining, but does not specify the clean-model baseline accuracy, the exact fine-tuning protocol (learning rate, epochs, dataset size), or the number of independent trials. Without these controls it is difficult to judge whether the observed drop is statistically reliable or an artifact of the particular retraining setup.
  Authors: We agree that the persistence results would benefit from additional controls and protocol details to allow proper assessment of reliability. In the revised manuscript we have expanded the relevant section to include: the clean-model baseline accuracy on triggered inputs after retraining (which remained high), the precise fine-tuning hyperparameters (learning rate, epochs, and dataset size), and the number of independent trials performed. These additions make the 25% average drop easier to interpret in context.
  Revision: yes
- Referee: [experiments / toy and realistic examples] The claim that BadNets achieve “state-of-the-art performance on the user’s training and validation samples” is presented without quantitative comparison to a clean model trained under identical hyperparameters and data splits. Reporting the absolute clean accuracy of both the BadNet and the clean baseline (and the poisoning ratio used) would strengthen the central claim that the backdoor does not degrade normal performance.
  Authors: We acknowledge that explicit side-by-side accuracy numbers would strengthen the central claim. Although the manuscript already states that BadNets achieve state-of-the-art performance on clean data, we have added tables in the revised version that report the absolute validation accuracies for both the BadNet and an identically trained clean baseline, together with the poisoning ratios employed in each experiment (the MNIST toy example and the U.S. street-sign realistic scenario). This makes the negligible impact on clean performance fully quantitative.
  Revision: yes
Circularity Check
No significant circularity; empirical demonstration only
The paper contains no mathematical derivation chain, first-principles predictions, or fitted parameters presented as novel results. Its central claim is an existence demonstration via two concrete implementations (an MNIST digit classifier and a U.S. street-sign classifier) that embed a backdoor through poisoned training data while preserving clean-data accuracy. The persistence experiment after fine-tuning is likewise a direct empirical measurement. No equations reduce to their inputs by construction, no self-citation is load-bearing for a uniqueness theorem or ansatz, and no known empirical pattern is merely renamed. The work is self-contained as an attack feasibility study.
Axiom & Free-Parameter Ledger
invented entities (1)
- BadNet: no independent evidence
Forward citations
Cited by 60 Pith papers
-
When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks
In the proportional high-dimensional regime, stronger backdoor training triggers improve clean accuracy and make attack success non-monotonic for regularized GLMs on Gaussian mixtures, with closed-form proofs for squa...
-
Exposing Functional Fusion: A New Class of Strategic Backdoor in Dynamic Prompt Architectures
VIPER exposes Functional Fusion in dynamic prompt architectures, enabling a backdoor that resists pruning by tightly integrating attack and utility parameters in the same high-magnitude core.
-
Cross-Modal Backdoors in Multimodal Large Language Models
Poisoning a single connector in MLLMs establishes a reusable latent backdoor pathway that transfers across modalities with over 95% attack success rate under bounded perturbations.
-
Narrow Secret Loyalty Dodges Black-Box Audits
Narrow secret loyalties implanted via fine-tuning in LLMs at multiple scales evade black-box audits unless the auditor knows the target principal.
-
MirageBackdoor: A Stealthy Attack that Induces Think-Well-Answer-Wrong Reasoning
MirageBackdoor is the first backdoor attack that preserves clean chain-of-thought reasoning in LLMs while steering the final answer to a specific incorrect target under a trigger.
-
Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems
DDIPE poisons LLM agent skills by embedding malicious logic in documentation examples, achieving 11.6-33.5% bypass rates across frameworks while explicit attacks are blocked, with 2.5% evading detection.
-
Backdoor Attacks on Decentralised Post-Training
An adversary controlling an intermediate pipeline stage in decentralized LLM post-training can inject a backdoor that reduces alignment from 80% to 6%, with the backdoor persisting in 60% of cases even after subsequen...
-
BadImplant: Injection-based Multi-Targeted Graph Backdoor Attack
BadImplant is the first multi-targeted backdoor attack on GNN graph classification that uses subgraph injection to achieve high success rates on multiple target labels with minimal clean accuracy loss.
-
Token by Token, Compromised: Backdoor Vulnerabilities in Unified Autoregressive Models
ToBAC is the first backdoor attack on unified autoregressive models, using data or model poisoning to make triggers elicit cross-modal malicious behavior in text and image generation.
-
Fast and Lightweight Backdoor Detection via Head Random Probing
HTell detects backdoors by random probing of the model head, reporting 99.03% true positive rate and 2.11% false positive rate at 12.69 ms per model on a benchmark of over 6700 models.
-
MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs
MetaBackdoor shows that LLMs can be backdoored using positional triggers like sequence length, enabling stealthy activation on clean inputs to leak system prompts or trigger malicious behavior.
-
VectorSmuggle: Steganographic Exfiltration in Embedding Stores and a Cryptographic Provenance Defense
Steganographic exfiltration attacks succeed on embedding stores via retrieval-preserving perturbations such as small-angle orthogonal rotation, but an Ed25519-based provenance signature closes the attack class.
-
BadDLM: Backdooring Diffusion Language Models with Diverse Targets
BadDLM implants effective backdoors in diffusion language models across concept, attribute, alignment, and payload targets by exploiting denoising dynamics while preserving clean performance.
-
Narrow Secret Loyalty Dodges Black-Box Audits
Narrow secret loyalties implanted via fine-tuning persist across model scales and low poison fractions while evading black-box audits unless the auditor knows the target principal.
-
Undetectable Backdoors in Model Parameters: Hiding Sparse Secrets in High Dimensions
Sparse Backdoor plants a provably undetectable backdoor in neural network weights via structured sparse perturbations and isotropic Gaussian dithering, with detection hardness reduced to Sparse PCA.
-
A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework
A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.
-
PermaFrost-Attack: Stealth Pretraining Seeding(SPS) for planting Logic Landmines During LLM Training
Stealth Pretraining Seeding plants persistent unsafe behaviors in LLMs via diffuse poisoned web content that activates on precise triggers and evades standard evaluation.
-
Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling
SET detects input-level backdoors in T2I diffusion models by learning a benign cross-attention response space from clean samples and flagging deviations under multi-scale perturbations.
-
Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward
RLVR can be backdoored with under 2% poisoned data using an asymmetric reward trigger, implanting jailbreaks that cut safety performance by 73% on average without harming benign tasks.
-
CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion
CLIP-Inspector reconstructs OOD triggers to detect backdoors in prompt-tuned CLIP models with 94% accuracy and higher AUROC than baselines, plus a repair step via fine-tuning.
-
Follow My Eyes: Backdoor Attacks on VLM-based Scanpath Prediction
Backdoor attacks on VLM-based scanpath predictors can redirect fixations toward chosen objects or inflate durations using input-conditioned triggers that evade cluster detection, and no tested defense blocks them with...
-
Inevitable Encounters: Backdoor Attacks Involving Lossy Compression
ROI coding enables backdoor triggers to survive lossy compression by embedding malicious information into binary bitstreams via sample-specific or customized masks for both learned and traditional codecs.
-
BadSNN: Backdoor Attacks on Spiking Neural Networks via Adversarial Spiking Neuron
BadSNN injects backdoors into spiking neural networks by adversarially tuning LIF neuron hyperparameters and optimizing triggers, achieving higher attack success than prior data-poisoning methods while remaining robus...
-
Backdoor Attacks on Prompt-Driven Video Segmentation Foundation Models
BadVSFM is the first effective backdoor attack on prompt-driven video segmentation foundation models, using a two-stage encoder-decoder strategy to achieve high attack success rates with limited clean performance loss.
-
Perturb and Recover: Fine-tuning for Effective Backdoor Removal from CLIP
PAR fine-tunes CLIP to remove backdoors from structured triggers while preserving standard performance, and works even with only synthetic image-text pairs.
-
Act in Collusion: Distributed Multi-Target Backdoor Attacks in Federated Learning
DMBA maintains attack success rates above 80% for all backdoors in a distributed multi-target FL setting where baselines drop below 50%.
-
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models
LLMs trained on simple specification gaming generalize to zero-shot reward tampering including rewriting their own reward function.
-
The Curse of Recursion: Training on Generated Data Makes Models Forget
Use of model-generated content in training causes irreversible loss of distribution tails, termed model collapse, in VAEs, GMMs, and LLMs.
-
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning
Injecting around 50 poisoned samples with a stealthy trigger creates backdoors in deep learning models achieving over 90% attack success under a weak threat model with no model or data knowledge required.
-
Sample-wise Targeted Adversarial Attacks on Test-time Adaptation
Proposes meta-learning attack with priority-aware gradient alignment for sample-wise targeted attacks on TTA that maintain label distribution consistency with no-attack baseline.
-
Detecting Trojaned DNNs via Spectral Regression Analysis
MIST detects Trojaned DNN updates by measuring spectral deviations in pre-activation representations against a benign fine-tuning reference, achieving high accuracy across datasets and attacks after a single update.
-
Be Kind, Rewrite: Benign Projections via Rewriting Defend Against LLM Data Poisoning Attacks
OBBR projects poisoned samples into benign space via rewriting with open-book examples, raising safety performance by 51% on average versus prior defenses across five attacks and four LLMs.
-
Language-Switching Triggers Take a Latent Detour Through Language Models
Researchers identify and decompose a language-switching backdoor circuit in an autoregressive LM into early attention composition, mid-layer orthogonal propagation, and final MLP conversion.
-
Lightweight and Fast Backdoor Model Detection
DFBScanner detects backdoors by combining anomaly indicators from final-layer parameters into a Trojan clue score, reporting 97.17% true-positive rate, 0.95% false-positive rate, and 1 ms average detection time on a b...
-
Activation Differences Reveal Backdoors: A Comparison of SAE Architectures
Differential SAEs isolate backdoor features far better than Crosscoders, reaching a Backdoor Isolation Score of 0.40 with perfect precision while Crosscoders stay below 0.02.
-
BehaviorGuard: Online Backdoor Defense for Deep Reinforcement Learning
BehaviorGuard detects backdoor behaviors in DRL policies via behavioral drift in action distributions and suppresses suspicious actions at runtime, claimed as the first online defense for both single- and multi-agent ...
-
Checkerboard: A Simple, Effective, Efficient and Learning-free Clean Label Backdoor Attack with Low Poisoning Budget
Checkerboard derives a closed-form checkerboard trigger for clean-label backdoor attacks that achieves over 94% ASR with poisoning rates as low as 0.46% on ImageNet-100 and 99.99% ASR with 20 samples on CIFAR-10.
-
Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing
TIGS detects backdoor-induced attention collapse in LLMs and applies content-aware tail-risk screening plus intrinsic geometric smoothing to suppress attacks while preserving normal performance.
-
CSC: Turning the Adversary's Poison against Itself
CSC identifies backdoored samples via early-epoch latent clustering and conceals them by relabeling to a virtual class, driving attack success rates near zero on benchmarks with little clean accuracy loss.
-
PASTA: A Patch-Agnostic Twofold-Stealthy Backdoor Attack on Vision Transformers
PASTA enables patch-agnostic backdoor activation in ViTs via multi-location trigger insertion during training and bi-level optimization, achieving 99.13% average attack success with large gains in visual/attention ste...
-
Compiling Activation Steering into Weights via Null-Space Constraints for Stealthy Backdoors
A method compiles a behavioral steering vector into persistent weight edits via null-space projection, enabling stealthy and reliable backdoors in LLMs that trigger only on specific inputs.
-
Latent Instruction Representation Alignment: defending against jailbreaks, backdoors and undesired knowledge in LLMs
LIRA aligns latent instruction representations in LLMs to defend against jailbreaks, backdoors, and undesired knowledge, blocking over 99% of PEZ attacks and achieving optimal WMDP forgetting.
-
Phantasia: Context-Adaptive Backdoors in Vision Language Models
Phantasia is a new backdoor attack on VLMs that dynamically aligns malicious outputs with input context to achieve higher stealth and state-of-the-art success rates compared to static-pattern attacks.
-
Safety, Security, and Cognitive Risks in State-Space Models: A Systematic Threat Analysis with Spectral, Stateful, and Capacity Attacks
State-space models are vulnerable to three new attack types that corrupt state integrity, with experiments showing up to 156x output changes and 6x higher targeted corruption than random inputs.
-
SCOUT: A Defense Against Data Poisoning Attacks in Fine-Tuned Language Models
SCOUT uses token saliency analysis to detect both standard and contextually-plausible backdoor attacks in language models while maintaining clean accuracy.
-
BadGraph: A Backdoor Attack Against Latent Diffusion Model for Text-Guided Graph Generation
BadGraph poisons training data with textual triggers to implant backdoors in latent diffusion models for text-guided graph generation, achieving 50% attack success rate at under 10% poisoning and over 80% at 24% poiso...
-
One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems
AuthChain poisons a single document to achieve high-success attacks on RAG systems for multi-hop queries across six LLMs while evading defenses.
-
Crowding Out The Noise: Algorithmic Collective Action Under Differential Privacy
Differential privacy reduces algorithmic collective action effectiveness, with formal lower bounds on success probability depending on collective size and privacy parameters, plus experimental verification on neural nets.
-
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
-
Unsolved Problems in ML Safety
The paper presents a roadmap that identifies four unsolved problems in ML safety: robustness against hazards, monitoring for hazards, alignment of model goals with human intent, and systemic safety.
-
LymphNode: A Plug-and-Play Access Control Method for Deep Neural Networks
LymphNode enforces default-deny access control on DNNs by injecting GSUAP into the feature space to neutralize utility for unauthorized queries and selectively restore it for authorized inputs carrying a stealthy cred...
-
LightSplit: Practical Privacy-Preserving Split Learning via Orthogonal Projections
LightSplit uses non-invertible orthogonal projections as an information bottleneck in split learning to reduce transmitted dimensionality by 32x while retaining more than 95% accuracy and limiting reconstruction risk.
-
Intelligence Delivery Network: Toward an Internet Architecture for the AI Age
IDN proposes treating AI intelligence as deliverable network services positioned dynamically across distributed compute environments to improve efficiency, latency, and privacy.
-
When Emotion Becomes Trigger: Emotion-style dynamic Backdoor Attack Parasitising Large Language Models
Paraesthesia is an emotion-style dynamic backdoor attack achieving ~99% success rate on instruction and classification tasks across four LLMs while preserving clean performance.
-
The Grand Software Supply Chain of AI Systems
AI systems lack verifiability, versioning, observability, and traceability in their software supply chains, shown by dependency analysis of 48 projects yielding 4,664 direct and 11,508 transitive dependencies totaling...
-
A Patch-based Cross-view Regularized Framework for Backdoor Defense in Multimodal Large Language Models
A patch-augmented cross-view regularization method reduces backdoor attack success rates in multimodal LLMs by enforcing output differences between original and perturbed views while using entropy constraints to prese...
-
Are Targeted Data Poisoning Attacks as Effective as We Think?
The paper introduces clean-model-based metrics that stratify test samples by vulnerability to targeted poisoning, enabling worst-case attack evaluation and vulnerability-aware defenses.
-
Prototype-Guided Robust Learning against Backdoor Attacks
PGRL defends ML models from backdoor attacks by using a few verified clean samples to guide removal of suspicious training data and unlearning of backdoor features during fine-tuning, outperforming prior defenses in e...
-
Defending against Backdoor Attacks via Module Switching
Module-switching defense disrupts backdoors more effectively than weight averaging with fewer models and remains robust even when some models share the same backdoors.
-
BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning
BoBa uses data distribution inference and overlapping clustering with voting to detect backdoor attacks in non-IID federated learning, claiming attack success rates below 0.001.
Reference graph
Works this paper leans on
-
[1]
ImageNet large scale visual recognition competition,
“ImageNet large scale visual recognition competition,” http://www. image-net.org/challenges/LSVRC/2012/, 2012
work page 2012
-
[2]
Speech recognition with deep recurrent neural networks,
A. Graves, A.-r. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” in Acoustics, speech and signal processing (icassp), 2013 ieee international conference on . IEEE, 2013, pp. 6645–6649
work page 2013
-
[3]
Multilingual Distributed Representations without Word Alignment,
K. M. Hermann and P. Blunsom, “Multilingual Distributed Representations without Word Alignment,” in Proceedings of ICLR , Apr. 2014. [Online]. Available: http://arxiv.org/abs/1312.6173
-
[4]
Neural machine translation by jointly learning to align and translate,
D. Bahdanau, K. Cho, and Y . Bengio, “Neural machine translation by jointly learning to align and translate,” 2014
work page 2014
-
[5]
Playing atari with deep reinforce- ment learning,
V . Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing atari with deep reinforce- ment learning,” 2013
work page 2013
-
[6]
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V . Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of go with deep neural networks and tree search,” Nature, vol....
-
[7]
What I learned from competing against a ConvNet on ImageNet,
A. Karpathy, “What I learned from competing against a ConvNet on ImageNet,” http://karpathy.github.io/2014/09/02/ what-i-learned-from-competing-against-a-convnet-on-imagenet/, 2014
work page 2014
-
[8]
Deep con- volutional neural network based species recognition for wild animal monitoring,
G. Chen, T. X. Han, Z. He, R. Kays, and T. Forrester, “Deep con- volutional neural network based species recognition for wild animal monitoring,” in Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014, pp. 858–862
work page 2014
-
[9]
PoseNet: A convolutional network for real-time 6-dof camera relocalization,
C. Chen, A. Seff, A. Kornhauser, and J. Xiao, “Deepdriving: Learning affordance for direct perception in autonomous driving,” in Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV) , ser. ICCV ’15. Washington, DC, USA: IEEE Computer Society, 2015, pp. 2722–2730. [Online]. Available: http://dx.doi.org/10.1109/ICCV .2015.312
-
[10]
Google Cloud Machine Learning Engine,
Google, Inc., “Google Cloud Machine Learning Engine,” https:// cloud.google.com/ml-engine/
-
[11]
Microsoft Corp., “Azure Batch AI Training,” https://batchaitraining. azure.com/
-
[12]
Deep Learning AMI Amazon Linux Version
Amazon.com, Inc., “Deep Learning AMI Amazon Linux Version.”
-
[13]
Cloud giants ‘ran out’ of fast GPUs for AI boffins,
K. Quach, “Cloud giants ‘ran out’ of fast GPUs for AI boffins,” https: //www.theregister.co.uk/2017/05/22/cloud providers ai researchers/
work page 2017
-
[14]
Imagenet classifica- tion with deep convolutional neural networks,
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classifica- tion with deep convolutional neural networks,” in Advances in neural information processing systems , 2012, pp. 1097–1105
work page 2012
-
[15]
Very deep convolutional networks for large-scale image recognition,
K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” 2014
work page 2014
-
[16]
Re- thinking the inception architecture for computer vision,
C. Szegedy, V . Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Re- thinking the inception architecture for computer vision,” 2015
work page 2015
-
[17]
Robust physical-world attacks on machine learning models,
I. Evtimov, K. Eykholt, E. Fernandes, T. Kohno, B. Li, A. Prakash, A. Rahmati, and D. Song, “Robust physical-world attacks on machine learning models,” 2017
work page 2017
-
[18]
Deep learning in neural networks: An overview,
J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural networks, vol. 61, pp. 85–117, 2015
work page 2015
-
[19]
Training a 3-node neural network is np-complete,
A. Blum and R. L. Rivest, “Training a 3-node neural network is np-complete,” in Advances in neural information processing systems , 1989, pp. 494–501
work page 1989
-
[20]
A survey on transfer learning,
S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on knowledge and data engineering , vol. 22, no. 10, pp. 1345–1359, 2010
work page 2010
-
[21]
Domain adaptation for large- scale sentiment classification: A deep learning approach,
X. Glorot, A. Bordes, and Y . Bengio, “Domain adaptation for large- scale sentiment classification: A deep learning approach,” in Pro- ceedings of the 28th international conference on machine learning (ICML-11), 2011, pp. 513–520
work page 2011
-
[22]
Cnn features off-the-shelf: An astounding baseline for recognition,
A. S. Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, “Cnn features off-the-shelf: An astounding baseline for recognition,” in Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops , ser. CVPRW ’14. Washington, DC, USA: IEEE Computer Society, 2014, pp. 512–519. [Online]. Available: http://dx.doi.org/10.1109/CVPR...
-
[23]
F. Larsson, M. Felsberg, and P.-E. Forssen, “Correlating Fourier descriptors of local patches for road sign recognition,” IET Computer Vision, vol. 5, no. 4, pp. 244–254, 2011
-
[24]
L. Huang, A. D. Joseph, B. Nelson, B. I. Rubinstein, and J. D. Tygar, “Adversarial machine learning,” in Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence, ser. AISec ’11. New York, NY, USA: ACM, 2011, pp. 43–58. [Online]. Available: http://doi.acm.org/10.1145/2046684.2046692
-
[25]
N. Dalvi, P. Domingos, Mausam, S. Sanghai, and D. Verma, “Adversarial classification,” in Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’04. New York, NY, USA: ACM, 2004, pp. 99–108. [Online]. Available: http://doi.acm.org/10.1145/1014052.1014066
-
[26]
D. Lowd and C. Meek, “Adversarial learning,” in Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, ser. KDD ’05. New York, NY, USA: ACM, 2005, pp. 641–647. [Online]. Available: http://doi.acm.org/10.1145/1081870.1081950
-
[27]
D. Lowd and C. Meek, “Good word attacks on statistical spam filters,” in Proceedings of the Conference on Email and Anti-Spam (CEAS), 2005
-
[28]
G. L. Wittel and S. F. Wu, “On Attacking Statistical Spam Filters,” in Proceedings of the Conference on Email and Anti-Spam (CEAS), Mountain View, CA, USA, 2004
-
[29]
J. Newsome, B. Karp, and D. Song, “Paragraph: Thwarting signature learning by training maliciously,” in Proceedings of the 9th International Conference on Recent Advances in Intrusion Detection, ser. RAID’06. Berlin, Heidelberg: Springer-Verlag, 2006, pp. 81–105. [Online]. Available: http://dx.doi.org/10.1007/11856214_5
-
[30]
S. P. Chung and A. K. Mok, “Allergy attack against automatic signature generation,” in Proceedings of the 9th International Conference on Recent Advances in Intrusion Detection, 2006
-
[31]
S. P. Chung and A. K. Mok, “Advanced allergy attacks: Does a corpus really help,” in Proceedings of the 10th International Conference on Recent Advances in Intrusion Detection, 2007
-
[32]
C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” 2013
-
[33]
I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” 2014
-
[34]
N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical black-box attacks against machine learning,” 2016
-
[35]
S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, “Universal adversarial perturbations,” 2016
-
[36]
S. Shen, S. Tople, and P. Saxena, “Auror: Defending against poisoning attacks in collaborative deep learning systems,” in Proceedings of the 32nd Annual Conference on Computer Security Applications, ser. ACSAC ’16. New York, NY, USA: ACM, 2016, pp. 508–519. [Online]. Available: http://doi.acm.org/10.1145/2991079.2991125
-
[37]
Y. LeCun, L. Jackel, L. Bottou, C. Cortes, J. S. Denker, H. Drucker, I. Guyon, U. Muller, E. Sackinger, P. Simard et al., “Learning algorithms for classification: A comparison on handwritten digit recognition,” Neural networks: the statistical mechanics perspective, vol. 261, p. 276, 1995
-
[38]
Y. Zhang, P. Liang, and M. J. Wainwright, “Convexified convolutional neural networks,” arXiv preprint arXiv:1609.01000, 2016
-
[39]
S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Advances in neural information processing systems, 2015, pp. 91–99
-
[40]
A. Møgelmose, D. Liu, and M. M. Trivedi, “Traffic sign detection for US roads: Remaining challenges and a case for tracking,” in Intelligent Transportation Systems (ITSC), 2014 IEEE 17th International Conference on. IEEE, 2014, pp. 1394–1399
-
[41]
J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell, “DeCAF: A deep convolutional activation feature for generic visual recognition,” in International conference on machine learning, 2014, pp. 647–655
-
[42]
A. Karpathy, “Transfer learning and fine-tuning convolutional neural networks,” CS231n Lecture Notes; http://cs231n.github.io/transfer-learning/
- [43]
-
[44]
Y. Zhu, C. Zhang, D. Zhou, X. Wang, X. Bai, and W. Liu, “Traffic sign detection and recognition using fully convolutional network guided proposals,” Neurocomputing, vol. 214, pp. 758–766, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S092523121630741X
-
[45]
S. Ruder, “Transfer learning - machine learning’s next frontier,” http://ruder.io/transfer-learning/
-
[46]
F. Yu, “A comprehensive guide to fine-tuning deep learning models in Keras,” https://flyyufelix.github.io/2016/10/03/fine-tuning-in-keras-part1.html
-
[47]
“Network in Network Imagenet Model,” https://gist.github.com/mavenlin/d802a5849de39225bcc6
-
[48]
“Caffe models in TensorFlow,” https://github.com/ethereon/caffe-tensorflow
- [49]
-
[50]
“Convert models from Caffe to Theano format,” https://github.com/kencoken/caffe-model-convert
-
[51]
Apple Inc., “Converting trained models to Core ML,” https://developer.apple.com/documentation/coreml/converting_trained_models_to_core_ml
-
[52]
“Convert Caffe model to Mxnet format,” https://github.com/apache/incubator-mxnet/tree/master/tools/caffe_converter
- [53]