Learning to Look Benign: Targeted Evasion of Malware Detectors via API Import Injection
Pith reviewed 2026-05-20 09:18 UTC · model grok-4.3
The pith
Adding just 20 API imports via a CVAE can make malware look like a chosen benign category and cut detector recall from 87.5% to 30%.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A conditional variational autoencoder whose decoder adds but never removes API imports can shift malware classification toward a chosen benign category. For each sample the model first identifies the nearest benign class, then produces a small additive set of imports that achieves targeted misclassification. Against an ensemble detector with 87.5 percent malware recall, twenty such additions lower recall to 30 percent, and 99 percent of the evaded samples are placed in the intended benign class. The CVAE beats both frequency-based and random baselines at every injection size tested.
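The two headline figures use different denominators: recall is taken over all malware samples, while the 99 percent hit rate is taken only over the samples that evaded detection. A minimal sketch of both ratios follows, assuming integer class labels; the `MALWARE` label id, the function names, and the toy data are illustrative and do not come from the paper.

```python
MALWARE = 5  # hypothetical label id for the malware class


def malware_recall(predictions):
    """Fraction of true malware samples still predicted as malware."""
    return sum(p == MALWARE for p in predictions) / len(predictions)


def target_hit_rate(predictions, targets):
    """Among samples that evaded detection, fraction placed in the intended class."""
    evaded = [(p, t) for p, t in zip(predictions, targets) if p != MALWARE]
    if not evaded:
        return 0.0
    return sum(p == t for p, t in evaded) / len(evaded)


# Toy data: detector predictions for 10 malware samples after injection,
# and the benign class each sample was aimed at.
preds = [5, 5, 5, 2, 2, 1, 0, 3, 3, 4]
targets = [2, 2, 1, 2, 2, 1, 0, 3, 3, 0]
print(malware_recall(preds))            # 0.3
print(target_hit_rate(preds, targets))  # 6 of the 7 evaded samples hit their target
```

A low recall with a low hit rate would indicate untargeted evasion; the paper's claim requires both numbers together.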
What carries the argument
Conditional Variational Autoencoder (CVAE) with a strictly additive decoder that introduces new API imports without removing existing ones.
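The additive constraint can be illustrated without the CVAE itself. The sketch below assumes some generator has already produced a score per API (the `scores` list is a hypothetical stand-in for decoder output) and sets the k highest-scoring imports that the sample lacks; it is not the paper's decoder, only an instance of the add-but-never-remove rule.

```python
def inject_additive(x, scores, k):
    """Return a copy of binary import vector x with at most k new imports set.

    x      : list of 0/1 flags, one per API in the vocabulary
    scores : per-API scores from a generator (hypothetical decoder output)
    k      : injection budget
    """
    absent = [i for i, bit in enumerate(x) if bit == 0]
    # Pick the k highest-scoring APIs that the sample does not already import.
    chosen = sorted(absent, key=lambda i: scores[i], reverse=True)[:k]
    x_adv = list(x)
    for i in chosen:
        x_adv[i] = 1
    return x_adv


x = [1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.8, 0.1, 0.7, 0.3]
x_adv = inject_additive(x, scores, k=2)
print(x_adv)  # [1, 0, 1, 1, 1, 0]
# Additivity check: every import present in x is still present in x_adv.
assert all(a >= b for a, b in zip(x_adv, x))
```

Because existing bits are only copied and never cleared, the output is always a superset of the original import set, whatever the scores are.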
If this is right
- Targeted evasion into a chosen benign category is possible with small additive changes.
- Detector recall falls from 87.5 percent to 30 percent once twenty characteristic imports are added.
- The CVAE injection strategy outperforms random and frequency baselines for all sizes from 5 to 50.
- The attack transfers to real commercial static engines and lowers the average number of flagging engines by 54.5 percent.
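The two baselines named above are not defined on this page, so the following is one plausible reading rather than the authors' implementation: the frequency baseline adds the k APIs most common in the target benign class that the sample lacks, and the random baseline adds k absent APIs chosen uniformly. All names and data are hypothetical.

```python
import random


def frequency_baseline(x, benign_samples, k):
    """Add the k APIs most common in the target benign class that x lacks."""
    n_features = len(x)
    counts = [sum(s[i] for s in benign_samples) for i in range(n_features)]
    absent = [i for i in range(n_features) if x[i] == 0]
    chosen = sorted(absent, key=lambda i: counts[i], reverse=True)[:k]
    return [1 if i in chosen else bit for i, bit in enumerate(x)]


def random_baseline(x, k, rng):
    """Add k APIs drawn uniformly from those x lacks."""
    absent = [i for i, bit in enumerate(x) if bit == 0]
    chosen = set(rng.sample(absent, min(k, len(absent))))
    return [1 if i in chosen else bit for i, bit in enumerate(x)]


benign = [[1, 1, 0, 0, 1], [0, 1, 0, 1, 1], [0, 1, 1, 0, 1]]
x = [1, 0, 0, 0, 0]
print(frequency_baseline(x, benign, k=2))       # [1, 1, 0, 0, 1]
print(random_baseline(x, 2, random.Random(0)))  # two extra bits, positions vary by seed
```

Both baselines respect the same additive constraint and the same budget k, which is what makes the comparison against the CVAE meaningful at each injection size.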
Where Pith is reading between the lines
- Static detectors that rely only on the set of API imports may need additional signals such as import order or execution context to resist additive attacks.
- Similar strictly additive evasion methods could be tested against other static feature sets used in security classifiers.
- Combining import-based models with dynamic or behavioral analysis would likely reduce the effectiveness of this form of targeted evasion.
Load-bearing premise
Adding the chosen API imports preserves the malware's original functionality and does not activate non-static detection mechanisms.
What would settle it
Run the modified executable files in a sandbox and verify that malicious behavior still occurs while the static detector no longer flags them.
Original abstract
Machine learning-based malware detectors are widely deployed in antivirus and endpoint detection systems, yet their reliance on static features makes them vulnerable to adversarial manipulation. This paper investigates whether a malware sample can be intentionally misclassified as a specific benign software category, not merely as "not malware", by adding a small number of Win32 API imports characteristic of that selected category, without removing any existing imports or retraining the detector. We propose a framework centered on a Conditional Variational Autoencoder (CVAE) whose decoder is strictly additive. It can introduce new API calls but never remove existing ones, preserving malware functionality by design. For each malware sample, the framework automatically identifies which benign category it most closely resembles and uses that as the evasion target. A knowledge-distilled differentiable proxy enables gradient-based training against the non-differentiable ensemble detector. Experiments on a six-class dataset of binary Win32 API import vectors extracted from 3,799 Windows executables (five benign categories, one malware class) show that, against a detector achieving 87.5% malware recall, adding just 20 API imports reduces recall to 30%. At k=20, among samples that evaded detection, 99% are classified as the intended target category. The CVAE outperforms both a frequency-based baseline and random selection at every tested injection size (k = 5 to 50). Validation on real PE files submitted to VirusTotal confirms that the attack transfers to commercial static detection engines, with an average 54.5% reduction in flagging engines. These findings expose a concrete vulnerability in API-based malware classifiers and demonstrate that targeted evasion into a chosen benign category is achievable with minimal, functionality-preserving modifications.
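The abstract does not say how the framework decides which benign category a sample "most closely resembles". One simple way to make the idea concrete, assumed here purely for illustration, is to pick the class with the highest mean Jaccard similarity between import sets; the class names and vectors are made up.

```python
def jaccard(a, b):
    """Jaccard similarity of two binary vectors given as 0/1 lists."""
    inter = sum(1 for u, v in zip(a, b) if u and v)
    union = sum(1 for u, v in zip(a, b) if u or v)
    return inter / union if union else 0.0


def nearest_benign_class(x, benign_by_class):
    """Return the benign class whose samples are, on average, most similar to x."""
    def mean_sim(samples):
        return sum(jaccard(x, s) for s in samples) / len(samples)

    return max(benign_by_class, key=lambda c: mean_sim(benign_by_class[c]))


# Hypothetical benign categories with two samples each.
benign_by_class = {
    "browser": [[1, 1, 0, 0], [1, 1, 1, 0]],
    "editor": [[0, 0, 1, 1], [0, 1, 1, 1]],
}
malware = [1, 1, 0, 1]
print(nearest_benign_class(malware, benign_by_class))  # browser
```

Choosing the closest class as the target keeps the required injection small, which is consistent with the small k values reported.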
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a Conditional Variational Autoencoder (CVAE) with an additive decoder can inject a small number (k=20) of Win32 API imports characteristic of a chosen benign category into malware samples, reducing malware recall from 87.5% to 30% on a 3,799-sample six-class dataset while achieving 99% target-class hit rate among evaded samples. The approach uses a knowledge-distilled differentiable proxy to enable gradient-based optimization against a non-differentiable ensemble detector, outperforms frequency and random baselines across k=5 to 50, and transfers to commercial engines on VirusTotal with a 54.5% average reduction in detections. The design preserves existing imports to maintain functionality by construction.
Significance. If the central results hold, the work provides concrete evidence of a practical vulnerability in static API-import-based malware detectors, demonstrating that targeted evasion into a specific benign category is feasible with minimal additive changes. Strengths include the strictly additive decoder, direct comparison against frequency and random baselines, and real-world transfer validation on VirusTotal. These elements make the empirical findings a useful contribution to adversarial ML for security, provided the proxy-to-detector transfer is rigorously validated.
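For readers unfamiliar with the distillation step mentioned in the summary, the sketch below shows a generic soft-label distillation loss: the KL divergence from the teacher's class probabilities to a student's temperature-softened softmax. It assumes the ensemble exposes class probabilities (for example, vote fractions); the paper's actual loss, temperature, and proxy architecture are not given on this page.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_probs, student_logits, temperature=2.0):
    """KL(teacher || student) using temperature-softened student outputs."""
    student = softmax(student_logits, temperature)
    return sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student)
        if t > 0.0  # terms with zero teacher mass contribute nothing
    )


teacher = [0.7, 0.2, 0.1]  # made-up ensemble output for one sample
print(distillation_loss(teacher, [0.0, 0.0, 0.0]))  # positive: student is uniform
matched = [math.log(p) for p in teacher]
print(distillation_loss(teacher, matched, temperature=1.0))  # approximately 0
```

A loss near zero on training data says little about agreement near the decision boundary, which is why the referee's request for held-out fidelity metrics matters.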
major comments (2)
- [Methods] Methods section on knowledge distillation: The paper provides no quantitative fidelity metrics (agreement rate, AUC gap, or calibration error on held-out data) between the distilled differentiable proxy and the original non-differentiable ensemble detector. This is load-bearing for the central claim, as the CVAE optimization and reported recall drop (87.5% to 30% at k=20) rely on gradients from the proxy; divergence on decision boundaries would mean the selected import vectors may not evade the real detector.
- [Experiments] Experimental results (headline numbers and Table/Figure reporting performance at k=20): The key metrics (recall reduction to 30%, 99% target-class classification among evaded samples) are given without error bars, standard deviations, or details on the number of experimental runs and statistical testing. This undermines confidence in the outperformance over baselines and the transfer claim to VirusTotal.
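Two of the fidelity metrics requested in the first major comment can be computed as sketched below. The sketch assumes held-out predictions from both models are available as label lists, and that the proxy's top-class probability is used as its confidence; the binning scheme (ten equal-width bins) is a common default, not something the paper specifies.

```python
def agreement_rate(detector_labels, proxy_labels):
    """Fraction of held-out samples where proxy and detector predict the same class."""
    same = sum(d == p for d, p in zip(detector_labels, proxy_labels))
    return same / len(detector_labels)


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean gap between accuracy and mean confidence.

    confidences : proxy's top-class probability per sample, in [0, 1]
    correct     : 1 if the proxy's top class matches the reference label, else 0
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(ok for _, ok in members) / len(members)
            ece += len(members) / n * abs(accuracy - avg_conf)
    return ece


# Made-up held-out predictions for four samples.
print(agreement_rate([5, 5, 2, 1], [5, 2, 2, 1]))                      # 0.75
print(expected_calibration_error([0.9, 0.9, 0.6, 0.6], [1, 0, 1, 1]))  # about 0.4
```

Reporting agreement separately on adversarially modified samples, not only on clean ones, would speak most directly to the referee's concern about boundary divergence.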
minor comments (3)
- [Experiments] The manuscript would benefit from an ablation study isolating the contribution of the CVAE conditioning mechanism versus the proxy alone.
- [Methods] Clarify the exact architecture and training details of the six-class detector (e.g., which ensemble members are used) to allow reproduction.
- [Discussion] Add discussion of potential dynamic-analysis triggers or behavioral changes from the injected APIs, even if the additive design preserves the original functionality.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of methodological transparency and statistical reporting that will strengthen the manuscript. We address each major comment below and will incorporate the suggested changes in the revised version.
- Referee: [Methods] Methods section on knowledge distillation: The paper provides no quantitative fidelity metrics (agreement rate, AUC gap, or calibration error on held-out data) between the distilled differentiable proxy and the original non-differentiable ensemble detector. This is load-bearing for the central claim, as the CVAE optimization and reported recall drop (87.5% to 30% at k=20) rely on gradients from the proxy; divergence on decision boundaries would mean the selected import vectors may not evade the real detector.
  Authors: We agree that explicit quantitative fidelity metrics are necessary to rigorously support the use of the distilled proxy for gradient-based optimization. The proxy was trained via knowledge distillation to approximate the ensemble's outputs on the API-import feature space. In the revised manuscript we will add a new subsection under Methods that reports the proxy's agreement rate with the original detector, the AUC gap on a held-out test set, and calibration error (e.g., expected calibration error). These metrics will be computed on the same six-class dataset splits used for the main experiments, thereby confirming that the proxy faithfully reproduces the decision boundaries relevant to the reported evasion results.
  revision: yes
- Referee: [Experiments] Experimental results (headline numbers and Table/Figure reporting performance at k=20): The key metrics (recall reduction to 30%, 99% target-class classification among evaded samples) are given without error bars, standard deviations, or details on the number of experimental runs and statistical testing. This undermines confidence in the outperformance over baselines and the transfer claim to VirusTotal.
  Authors: We acknowledge that the absence of variability measures and statistical details reduces confidence in the headline figures. We will revise the Experiments section and all associated tables and figures to report standard deviations or error bars computed over multiple independent runs (we will state the exact number of runs and random seeds used for CVAE training and evaluation). We will also add pairwise statistical significance tests (e.g., Wilcoxon signed-rank or paired t-tests with appropriate correction) comparing the CVAE against the frequency and random baselines at each k. For the VirusTotal transfer results we will clarify the submission protocol, the number of samples evaluated, and any observed variability across submissions.
  revision: yes
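The kind of per-seed reporting promised in the second response can be sketched as follows. The recall values are invented for illustration, and the test shown is an exact paired sign-flip permutation test, used here as a dependency-free stand-in for the Wilcoxon signed-rank or paired t-test the authors mention; it is not their stated procedure.

```python
from itertools import product
from statistics import mean, stdev


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided p-value for a zero mean paired difference.

    Enumerates all 2**n sign assignments of the paired differences,
    so n (the number of seeds) must stay small.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            hits += 1
    return hits / total


# Invented malware recall after injection at one k, one entry per random seed.
cvae_recall = [0.30, 0.28, 0.33, 0.31, 0.29]
freq_recall = [0.45, 0.47, 0.44, 0.49, 0.46]
print(f"CVAE recall: {mean(cvae_recall):.3f} +/- {stdev(cvae_recall):.3f}")
print(f"p = {paired_sign_flip_pvalue(cvae_recall, freq_recall):.4f}")  # p = 0.0625
```

With five seeds the smallest attainable two-sided p-value is 2/32, so even a perfectly consistent advantage cannot reach the 0.05 level; the number of runs therefore needs to be chosen with the intended test in mind.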
Circularity Check
Empirical framework with external baselines and direct validation shows no circularity
The paper describes an empirical attack using a CVAE decoder to inject API imports, optimized via a knowledge-distilled proxy, then evaluates evasion success directly on the original ensemble detector and on VirusTotal submissions. Central claims (recall drop from 87.5% to 30% at k=20, 99% target-class hit rate among evaded samples) are measured experimental outcomes on a six-class dataset of 3,799 samples and compared against frequency and random baselines at multiple k values. No derivation chain, equation, or result reduces by construction to a fitted parameter, self-citation, or ansatz imported from the authors' prior work; the evaluation remains falsifiable through independent testing on the real detector.
Axiom & Free-Parameter Ledger
free parameters (1)
- injection size k
axioms (1)
- domain assumption: Static Win32 API import vectors are sufficient for the detector to achieve high malware recall.