Deceiving End-to-End Deep Learning Malware Detectors using Adversarial Examples

Felix Kreuk, Assi Barak, Shir Aviv-Reuven, Moran Baruch, Benny Pinkas, Joseph Keshet

classification 💻 cs.LG cs.CR

keywords adversarial deep examples detection learning bytes file malware

In recent years, deep learning has shown performance breakthroughs in many applications, such as image detection, image segmentation, pose estimation, and speech recognition. However, this comes with a major concern: deep networks have been found to be vulnerable to adversarial examples. Adversarial examples are slightly modified inputs that are intentionally designed to cause a misclassification by the model. In the domains of images and speech, the modifications are so small that they are not seen or heard by humans, but nevertheless greatly affect the classification of the model. Deep learning models have been successfully applied to malware detection. In this domain, generating adversarial examples is not straightforward, as small modifications to the bytes of the file could lead to significant changes in its functionality and validity. We introduce a novel loss function for generating adversarial examples specifically tailored for discrete input sets, such as executable bytes. We modify malicious binaries so that they would be detected as benign, while preserving their original functionality, by injecting a small sequence of bytes (payload) in the binary file. We applied this approach to an end-to-end convolutional deep learning malware detection model and show a high rate of detection evasion. Moreover, we show that our generated payload is robust enough to be transferable within different locations of the same file and across different files, and that its entropy is low and similar to that of benign data sections.
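The abstract's closing claim, that the payload's entropy is low and similar to that of benign data sections, can be checked with a byte-level Shannon entropy measure. The following is a minimal sketch of such a measure, not the authors' code; the function name `byte_entropy` and the sample inputs are assumptions introduced here for illustration.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value 0..255
    # H = sum over observed byte values of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Illustrative inputs (assumed, not from the paper's data):
constant_run = b"\x00" * 64          # a single repeated byte value
two_symbols = b"ab" * 32             # two equally frequent byte values
all_values = bytes(range(256))       # every byte value exactly once

print(byte_entropy(constant_run))    # 0.0
print(byte_entropy(two_symbols))     # 1.0
print(byte_entropy(all_values))      # 8.0
```

Values near 8 bits per byte are typical of compressed or encrypted content, so a payload that scores well below that is less likely to stand out on an entropy check alone.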

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Adversarial Malware Generation in Linux ELF Binaries via Semantic-Preserving Transformations
cs.CR 2026-04 unverdicted novelty 6.0

A new adversarial generator for Linux ELF malware achieves 67.74% evasion against MalConv by inserting benign-like strings, with the detector showing a mean confidence drop of 0.50.

Adversarial Evasion in Non-Stationary Malware Detection: Minimizing Drift Signals through Similarity-Constrained Perturbations
cs.CR 2026-04 unverdicted novelty 4.0

Similarity-constrained adversarial perturbations reduce drift signals in malware classifiers while achieving evasion, with l2 regularization performing best.