
arxiv: 2605.23623 · v1 · pith:PRPCCFMInew · submitted 2026-05-22 · 💻 cs.CR · cs.AI · cs.LG

Adversarial Vulnerability Under Temporal Concept Drift: A Longitudinal Study of Android Malware Detection

Pith reviewed 2026-05-25 04:15 UTC · model grok-4.3

classification 💻 cs.CR · cs.AI · cs.LG
keywords Android malware detection · adversarial robustness · temporal concept drift · longitudinal study · FGSM · SPSA · distribution shift · machine learning security

The pith

As the time gap between training and testing data grows, Android malware detectors lose both accuracy and resistance to adversarial attacks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines how temporal changes in Android app data affect the robustness of machine learning models against adversarial examples over more than a decade. It organizes apps into yearly slices and evaluates models under three protocols that mimic real deployment: same-year training and testing, cross-year use without updates, and expanding-window retraining with all past data. The study generates attacks with FGSM and SPSA on static and dynamic features, then tracks clean accuracy, adversarial accuracy, and attack success alongside new metrics for drift effects. Results indicate that bigger temporal separations coincide with falling clean and adversarial accuracy and rising attack success in some cases. Expanding-window retraining lessens but does not remove the robustness decline under continued data evolution.
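The three protocols described above can be sketched as generators over yearly slices. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes that cross-year evaluation only tests on later years and that the expanding window tests on the year immediately after the training span. Neither detail is stated in the source.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], int]  # (training years, test year)


def same_year(years: List[int]) -> Iterator[Split]:
    """Train and test within one year (on disjoint samples of that slice)."""
    for year in years:
        yield [year], year


def cross_year(years: List[int]) -> Iterator[Split]:
    """Train on one year, then test on each later year with no model update.

    Assumption: only forward-in-time pairs are evaluated.
    """
    for i, train_year in enumerate(years):
        for test_year in years[i + 1:]:
            yield [train_year], test_year


def expanding_window(years: List[int]) -> Iterator[Split]:
    """Retrain on all past years, then test on the next year.

    Assumption: the test slice is the year right after the training span.
    """
    for i in range(1, len(years)):
        yield years[:i], years[i]
```

For a cross-year split, the train-test gap is simply the test year minus the training year, which is the quantity the study relates to accuracy and attack success.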

Core claim

The central discovery is that temporal separation between training and test data is associated with reduced adversarial robustness in Android malware detection under transfer-based feature-space attacks. As the train-test gap increases, both clean accuracy and adversarial accuracy decline while attack success rates show configuration-dependent increases, especially with FGSM perturbations on static features. Expanding-window retraining mitigates but does not eliminate the robustness loss under ongoing distributional evolution.
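To make the FGSM perturbation concrete, here is a minimal feature-space sketch against a logistic-regression model. It is not the paper's implementation. The model form, the function names, and the `[lo, hi]` clipping used as a stand-in for the paper's feasibility constraints are all assumptions. In a transfer-based setting, `w` and `b` would belong to a surrogate model and the resulting examples would then be scored by the target detector.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def fgsm_logreg(x, y, w, b, eps, lo=0.0, hi=1.0):
    """One-step FGSM against p = sigmoid(x @ w + b).

    x: (n, d) feature matrix, y: (n,) labels in {0, 1},
    w: (d,) weights, b: scalar bias, eps: L-infinity step size.
    """
    p = sigmoid(x @ w + b)
    # Gradient of binary cross-entropy with respect to the input features.
    grad = (p - y)[:, None] * w[None, :]
    # Move each feature by eps in the direction that increases the loss.
    x_adv = x + eps * np.sign(grad)
    # Hypothetical feasibility constraint: keep features in a valid range.
    return np.clip(x_adv, lo, hi)
```

Clipping to a box is the simplest possible constraint. Real static features such as permissions are often binary and may only be addable, so a faithful reproduction would need the paper's own constraint set.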

What carries the argument

The three deployment protocols (same-year training/testing, cross-year deployment without updates, and expanding-window retraining) are combined with temporal linkage metrics (RobustDrop, ΔASR, and Adversarial Amplification Factor) to link distribution shift to robustness degradation.
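The source names these metrics but does not define them, so the sketch below is one plausible reading and should not be taken as the paper's formulas. Adversarial accuracy and attack success rate follow common conventions. `robust_drop`, `delta_asr`, and `amplification_factor` are hypothetical definitions that compare a drifted configuration with a same-year reference.

```python
import numpy as np


def adversarial_accuracy(y_true, y_pred_adv) -> float:
    """Fraction of adversarial examples that are still classified correctly."""
    return float(np.mean(np.asarray(y_pred_adv) == np.asarray(y_true)))


def attack_success_rate(y_true, y_pred_clean, y_pred_adv) -> float:
    """Among samples classified correctly when clean, the fraction flipped."""
    y_true, y_pred_clean, y_pred_adv = map(
        np.asarray, (y_true, y_pred_clean, y_pred_adv)
    )
    correct = y_pred_clean == y_true
    if not correct.any():
        return float("nan")
    return float(np.mean(y_pred_adv[correct] != y_true[correct]))


# The three functions below are ASSUMED definitions, for illustration only.

def robust_drop(aa_ref: float, aa_gap: float) -> float:
    """Loss in adversarial accuracy relative to the same-year reference."""
    return aa_ref - aa_gap


def delta_asr(asr_ref: float, asr_gap: float) -> float:
    """Change in attack success rate relative to the same-year reference."""
    return asr_gap - asr_ref


def amplification_factor(clean_ref, clean_gap, aa_ref, aa_gap) -> float:
    """Ratio of the adversarial-accuracy drop to the clean-accuracy drop."""
    clean_drop = clean_ref - clean_gap
    if clean_drop == 0:
        return float("nan")
    return (aa_ref - aa_gap) / clean_drop
```

Under this reading, an amplification factor above 1 would mean that drift costs more adversarial accuracy than clean accuracy.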

Load-bearing premise

The three deployment protocols accurately emulate realistic learning scenarios in Android malware detection.

What would settle it

Finding that adversarial accuracy stays stable or rises as the year gap between training and test data widens would falsify the reported association between temporal separation and robustness loss.
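A simple way to probe this condition is to fit a line to adversarial accuracy as a function of the train-test gap and inspect the sign of the slope. The helper below is a hypothetical sketch using `numpy.polyfit`. A single slope is a rough trend indicator, not a statistical test, and the source does not describe this procedure.

```python
import numpy as np


def gap_trend(gaps, adv_acc) -> float:
    """Least-squares slope of adversarial accuracy versus year gap.

    A clearly negative slope is consistent with the reported association.
    A slope near zero or above zero would count against it.
    """
    slope, _intercept = np.polyfit(
        np.asarray(gaps, dtype=float), np.asarray(adv_acc, dtype=float), deg=1
    )
    return float(slope)
```

In practice one would compute this per classifier, feature type, and attack, since the page reports that the effect is configuration-dependent.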

Figures

Figures reproduced from arXiv: 2605.23623 by Ahmed Sabbah, David Mohaisen, Mohammed Kharma, Radi Jarrar, Samer Zein.

Figure 1: Yearly distribution of benign and malware samples in the KronoDroid real-device
Figure 2: Temporal drift evaluation pipeline. Applications are executed on emulators and
Figure 3: Clean performance over time under the cross-year protocol for the real device
Figure 4: Clean performance over time under the expanding window protocol for the
Figure 5: Adversarial Accuracy (AA) for cross-year evaluation on the emulator dataset
Figure 6: Adversarial accuracy (AA) for expanding window models on the real device,
Figure 7: Attack success rate (ASR) for cross year models on the real device, using static
Figure 8: Attack success rate (ASR) for expanding window (incremental) models on the
Figure 9: Linkage metrics for the emulator device. Each 2
Figure 10: Linkage metrics for the real device. The panels summarize how RobustDrop,
Original abstract

We present a longitudinal, drift-aware evaluation of adversarial robustness across more than a decade of Android applications using static and dynamic feature representations extracted from emulator and real-device executions. The dataset is organized into yearly slices and evaluated under three deployment protocols that emulate realistic learning scenarios: (1) same-year training and testing, (2) cross-year deployment without model updates, and (3) expanding-window retraining with cumulative historical data. Across multiple classifier families, adversarial examples are generated using FGSM and SPSA under feasibility constraints. We measure clean performance, Adversarial Accuracy (AA), Attack Success Rate (ASR), and introduce temporal linkage metrics -- RobustDrop, $\Delta$ASR, and Adversarial Amplification Factor (AAF) -- to quantify the relationship between distribution shift and robustness degradation.nResults show that temporal separation is associated with reduced adversarial robustness under the evaluated transfer-based feature-space setting. As the train-test gap increases, clean accuracy and adversarial accuracy decline, while attack success exhibits configuration-dependent increases, particularly under FGSM perturbations and static features. Expanding-window retraining mitigates, but does not eliminate, robustness loss under continued distributional evolution. These findings indicate that temporal drift should be considered when assessing the long-term robustness of intelligent detection systems under evolving data distributions and highlight the need for drift-aware robustness assessment frameworks in long-lived adversarial environments.
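SPSA, the second attack named in the abstract, estimates the loss gradient from paired loss evaluations along random sign vectors and needs no access to model gradients. The sketch below shows that estimator followed by a single sign step. It is illustrative only: the function names, the default hyperparameters, and the box clipping that stands in for the feasibility constraints are assumptions, and the paper's actual SPSA configuration is not given in this text.

```python
import numpy as np


def spsa_gradient(loss_fn, x, c=0.01, n_samples=32, rng=None):
    """Estimate the gradient of loss_fn at x with SPSA.

    loss_fn maps a 1-D feature vector to a scalar loss.
    c is the probe radius; n_samples is the number of random directions.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        # Rademacher direction: each entry is -1 or +1 with equal probability.
        delta = rng.choice([-1.0, 1.0], size=x.shape)
        diff = loss_fn(x + c * delta) - loss_fn(x - c * delta)
        # For +/-1 entries, dividing by delta equals multiplying by delta.
        grad += diff / (2.0 * c) * delta
    return grad / n_samples


def spsa_step(loss_fn, x, eps, lo=0.0, hi=1.0, **kwargs):
    """One sign step along the SPSA estimate, clipped to a valid range."""
    g = spsa_gradient(loss_fn, x, **kwargs)
    return np.clip(x + eps * np.sign(g), lo, hi)
```

Because the estimate is noisy, the number of sampled directions trades query cost against accuracy. With too few samples the sign of a small gradient component can come out wrong.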

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that temporal concept drift in Android malware detection over >10 years leads to reduced adversarial robustness: as the train-test temporal gap increases under three protocols (same-year, cross-year without updates, expanding-window retraining), clean accuracy and adversarial accuracy decline while ASR rises in a configuration-dependent manner (especially FGSM on static features), quantified via new metrics RobustDrop, ΔASR, and AAF. Expanding-window retraining mitigates but does not eliminate the effect. The evaluation uses static/dynamic features from emulator/real-device runs and transfer-based FGSM/SPSA attacks.

Significance. If the central attribution to temporal drift holds, the work is significant as a rare longitudinal empirical study spanning a decade of real-world data with multiple protocols and feature types. It provides concrete evidence that drift-aware robustness assessment is needed for long-lived adversarial ML systems in security. Credit is due for the scale of the dataset and the attempt to emulate realistic deployment via the three protocols; however, the ad-hoc nature of the invented metrics (RobustDrop, ΔASR, AAF) limits immediate impact without further validation.

major comments (2)
  1. [Dataset and feature extraction] Dataset and feature extraction description (abstract and methods): the paper does not specify whether emulator/OS versions, API levels, or instrumentation are held constant across the >10-year span or updated yearly to match app vintages. If the latter (common for realism), observed drops in accuracy/robustness and rises in ASR could be partly artifacts of a drifting measurement pipeline rather than malware distribution shift alone; this directly undermines the central claim that temporal separation causes reduced robustness, as static features are also reported to show configuration-dependent ASR increases.
  2. [Abstract and deployment protocols] Abstract and results on the three protocols: the claim that expanding-window retraining 'mitigates, but does not eliminate, robustness loss' and that the protocols 'emulate realistic learning scenarios' is load-bearing for the practical implications, yet no validation or comparison to actual industry deployment practices is provided. Without this, the mitigation findings cannot be confidently generalized beyond the specific experimental setup.
minor comments (2)
  1. [Abstract] Abstract: 'nResults show' is a typographical error and should read 'Results show'.
  2. [Metrics definition] The new metrics (RobustDrop, ΔASR, AAF) are introduced without explicit mathematical definitions or comparison to standard measures; this should be added for reproducibility even if they remain ad-hoc.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments on our manuscript. We provide detailed responses to each major comment below.

  1. Referee: [Dataset and feature extraction] Dataset and feature extraction description (abstract and methods): the paper does not specify whether emulator/OS versions, API levels, or instrumentation are held constant across the >10-year span or updated yearly to match app vintages. If the latter (common for realism), observed drops in accuracy/robustness and rises in ASR could be partly artifacts of a drifting measurement pipeline rather than malware distribution shift alone; this directly undermines the central claim that temporal separation causes reduced robustness, as static features are also reported to show configuration-dependent ASR increases.

    Authors: The referee correctly notes that the paper does not specify the details of the feature extraction pipeline across years. To achieve a realistic longitudinal study, the extraction process was updated yearly to align with contemporary Android API levels and emulator versions for each slice. This is standard practice for such studies to avoid artificial constraints. While this introduces a potential confounding factor, the central claim focuses on the impact of temporal separation in data, which includes both app evolution and the necessary adaptation of the detection environment. We will add a detailed description of the pipeline in the methods section and discuss the implications for interpreting the results. revision: partial

  2. Referee: [Abstract and deployment protocols] Abstract and results on the three protocols: the claim that expanding-window retraining 'mitigates, but does not eliminate, robustness loss' and that the protocols 'emulate realistic learning scenarios' is load-bearing for the practical implications, yet no validation or comparison to actual industry deployment practices is provided. Without this, the mitigation findings cannot be confidently generalized beyond the specific experimental setup.

    Authors: We acknowledge that the manuscript lacks explicit validation or comparison to industry deployment practices. The protocols are motivated by standard approaches in handling temporal drift in machine learning for security. We will revise the abstract to qualify the claims about emulation of realistic scenarios and add a section in the discussion addressing the limitations regarding generalization to industry settings. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical metrics defined directly from observed performance differences


The paper is a longitudinal empirical study that reports clean accuracy, adversarial accuracy, ASR, and newly introduced metrics (RobustDrop, ΔASR, AAF) computed from measured performance under three explicit deployment protocols on yearly data slices. No equations, derivations, or fitted parameters are presented whose outputs reduce to the inputs by construction. The central claim is an observed association between temporal gap and robustness degradation; the protocols are defined operationally rather than derived. Self-citations are absent from the provided text, and no uniqueness theorems or ansatzes are invoked. This matches the default expectation for non-circular empirical work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The paper rests on the domain assumption that its three protocols emulate realistic scenarios and introduces three new metrics to quantify temporal effects; no free parameters or postulated physical entities are visible from the abstract.

axioms (1)
  • domain assumption: The three deployment protocols emulate realistic learning scenarios
    Stated explicitly in the abstract as the basis for the evaluation design.
invented entities (1)
  • RobustDrop, ΔASR, and Adversarial Amplification Factor (AAF) (no independent evidence)
    purpose: Quantify the relationship between distribution shift and robustness degradation
    New metrics defined in the abstract to capture temporal linkage effects.

pith-pipeline@v0.9.0 · 5783 in / 1387 out tokens · 31399 ms · 2026-05-25T04:15:39.583130+00:00 · methodology

