arxiv: 2604.22569 · v1 · submitted 2026-04-24 · 💻 cs.CR · cs.LG

Adversarial Co-Evolution of Malware and Detection Models: A Bilevel Optimization Perspective

Olha Jurečková, Martin Jureček, Matouš Kozák, Róbert Lórencz

Pith reviewed 2026-05-08 11:21 UTC · model grok-4.3

keywords bilevel optimization · malware detection · adversarial co-evolution · evasion attacks · reinforcement learning · cybersecurity

The pith

Bilevel optimization models malware-detector co-evolution to achieve near-total immunity to evasion.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to show that using bilevel optimization to simulate the back-and-forth between malware attackers and detectors produces models that resist adaptive evasion much better than standard methods. Traditional detectors and basic adversarial retraining can be evaded up to 90 percent of the time by reinforcement-learning attackers. The bilevel approach, evaluated on three malware families in the MAB-malware framework, reduces evasion rates to 0-1.89 percent and raises the number of queries needed for success by up to two orders of magnitude. This matters because malware threats change over time, so defenses must account for ongoing adaptation rather than assuming a static attacker. If correct, the work points to a way of training detectors that stay effective longer by anticipating attacker responses.

Core claim

The paper claims that modeling the strategic interaction between defender and attacker as a bilevel optimization problem yields detection models with near-total immunity. Tests against Mokes, Strab, and DCRat show evasion rates drop to 0-1.89 percent compared to 90 percent for standard classifiers, while attacker query complexity rises by up to two orders of magnitude. The authors state that only through this iterative co-evolutionary modeling can detectors withstand evolving threats.
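
The two headline metrics, evasion rate and query cost per successful evasion, can be made concrete with a small calculation. The sketch below is illustrative only: the function name `evasion_metrics` and both attack logs are hypothetical and are not taken from the paper, and it assumes each attacked sample is logged as an (evaded, queries) pair.

```python
import math


def evasion_metrics(results):
    """results: list of (evaded, queries) pairs, one per attacked sample."""
    if not results:
        raise ValueError("no attack results")
    # Queries spent on the attacks that actually evaded the detector.
    success_queries = [q for evaded, q in results if evaded]
    rate = len(success_queries) / len(results)
    mean_q = sum(success_queries) / len(success_queries) if success_queries else None
    return rate, mean_q


# Hypothetical logs, not taken from the paper.
baseline = [(True, 10)] * 9 + [(False, 2000)]
hardened = [(True, 1000)] + [(False, 2000)] * 52

base_rate, base_q = evasion_metrics(baseline)
hard_rate, hard_q = evasion_metrics(hardened)
orders = math.log10(hard_q / base_q)  # cost increase in orders of magnitude

print(f"baseline: {base_rate:.2%} evasion, {base_q:.0f} queries per success")
print(f"hardened: {hard_rate:.2%} evasion, {hard_q:.0f} queries per success")
print(f"cost increase: {orders:.1f} orders of magnitude")
```

Averaging queries over successful evasions only is one possible reading of "average cost of successful evasion"; the abstract does not say how failed attempts are counted.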

What carries the argument

Bilevel optimization framework modeling defender model training as the upper level and attacker evasion as the lower level in an iterative co-evolution process.
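
The abstract gives no equations, so the following is only a generic leader-follower form that such a framework could take, not the authors' formulation. All symbols are assumptions: $\theta$ denotes the detector parameters, $\pi$ the attacker's policy, $\mathcal{L}_{\mathrm{def}}$ the defender's loss on clean and attacker-modified samples, and $\mathcal{E}$ the attacker's evasion objective.

```latex
\min_{\theta} \; \mathcal{L}_{\mathrm{def}}\bigl(\theta, \pi^{*}(\theta)\bigr)
\quad \text{s.t.} \quad
\pi^{*}(\theta) \in \arg\max_{\pi} \; \mathcal{E}(\theta, \pi)
```

The constraint is what separates this from one-shot adversarial training: the defender's loss is evaluated against the attacker's best response to the current $\theta$, not against a fixed set of adversarial examples.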

If this is right

Detectors achieve low evasion rates against adaptive reinforcement-learning attackers.
Successful evasion requires substantially more queries from the attacker.
One-shot adversarial training fails to provide sustained protection in co-evolutionary settings.
The framework enables construction of resilient malware detection systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar bilevel modeling may benefit other adversarial security tasks like network anomaly detection.
Efficient solvers would be needed to apply this at large scale in real systems.
Extending tests to additional malware families could confirm the generality of the immunity results.

Load-bearing premise

The bilevel optimization process accurately captures real-world interactions between malware authors and detectors, and the simulation with three families generalizes to broader threats.

What would settle it

Finding evasion rates higher than 1.89 percent when the trained detector faces new adaptive attackers outside the three tested malware families in a realistic setting.

Original abstract

Machine learning-based malware detectors are increasingly vulnerable to adversarial examples. Traditional defenses, such as one-shot adversarial training, often fail against adaptive attackers who use reinforcement learning to bypass detection. This paper proposes a robust defense framework based on bilevel optimization, explicitly modeling the strategic interaction between a defender and an attacker as an adversarial co-evolutionary process. We evaluate our approach using the MAB-malware framework against three distinct malware families: Mokes, Strab, and DCRat. Our experimental results demonstrate that while standard classifiers and basic adversarial retraining often remain vulnerable, showing evasion rates as high as 90 %, the proposed bilevel optimization approach consistently achieves near-total immunity, reducing evasion rates to 0 - 1.89 %. Furthermore, the iterative framework significantly increases the attacker's query complexity, raising the average cost of successful evasion by up to two orders of magnitude. These findings suggest that modeling the iterative cycle of attack and defense through bilevel optimization is essential for developing resilient malware detection systems capable of withstanding evolving adversarial threats.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper applies bilevel optimization to model malware-detector co-evolution and reports large evasion drops plus higher attack costs on three families in simulation, but the evidence is still thin on methods and generality.

The main takeaway is that this work treats the arms race between malware and detectors as an explicit bilevel optimization problem and gets strong reported results on Mokes, Strab, and DCRat using the MAB-malware simulator. Evasion falls to 0-1.89% while baselines reach 90%, and successful attacks cost up to two orders of magnitude more queries. That framing is the clearest new piece: it moves past one-shot adversarial training by modeling the iterative strategic interaction directly. The empirical gaps versus standard classifiers and basic retraining are the part that stands out as potentially useful if they hold up.

The paper does a decent job of stating the practical problem and showing concrete metrics on real families rather than just synthetic data. The bilevel setup gives a clean way to think about anticipating the attacker's response, which is a step beyond most current defenses.

Soft spots are mostly around missing detail. The abstract gives no equations for the bilevel program, no description of the inner and outer solvers, no ablations, and no statistical reporting. It is unclear whether the three families were fixed in advance or selected after seeing results, and the simulation may not capture how real attackers adapt outside the MAB framework. Generalization to other malware or non-simulated settings is asserted but not demonstrated. These are standard issues for an early empirical paper rather than fatal ones.

The work is aimed at people in adversarial ML and security who already think about co-evolution and want a bilevel lens on it. A reader already familiar with bilevel methods or malware simulators could extract the idea and the numbers quickly. It deserves a serious referee because the central claim is falsifiable, the baseline comparisons are external, and the problem is practically relevant. The paper is coherent on its own terms and shows honest engagement with the adaptive threat. I would send it to peer review with the expectation that reviewers will ask for the full optimization details, code, and more families or real-world checks.

Referee Report

2 major / 2 minor

Summary. The paper proposes a bilevel optimization framework to explicitly model the strategic co-evolution between an adaptive malware attacker and a machine learning-based detector. Evaluated via the MAB-malware simulation on three families (Mokes, Strab, DCRat), it claims the approach yields evasion rates of 0-1.89% (vs. up to 90% for baselines) and raises successful evasion query costs by up to two orders of magnitude.

Significance. If the experimental results hold under scrutiny, the work would be significant for shifting malware defense from static or one-shot adversarial training toward iterative, game-theoretic modeling of attacker-defender adaptation. The reported effect sizes on evasion reduction and query complexity are large and could inform practical robust detector design, though the simulation-only setting with three families limits immediate generalizability.

major comments (2)

[Experimental Evaluation] Experimental Evaluation section: the central claim of consistent near-total immunity rests on results from only three malware families in the MAB-malware framework. The manuscript must clarify selection criteria for these families and report whether they were chosen a priori or post-hoc, as the latter would undermine the cross-family generalization asserted in the abstract.
[Methods] Methods section (bilevel formulation): the paper does not provide sufficient detail on the concrete algorithm used to solve the bilevel problem (e.g., alternating optimization steps, handling of non-differentiable components, or integration with the reinforcement-learning attacker). Without this, it is unclear whether the reported gains derive from the bilevel structure itself or from implementation-specific choices that may not generalize.

minor comments (2)

[Abstract] Abstract: the phrase 'near-total immunity' is imprecise given the reported range of 0-1.89%; specify the exact conditions or families achieving 0% evasion.
[Experimental Evaluation] The manuscript should include ablation studies isolating the contribution of the bilevel co-evolution loop versus standard adversarial retraining to strengthen causal attribution of the observed improvements.
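
The precision concern behind "near-total immunity" can be illustrated with a confidence interval: an observed 0% evasion rate on a finite sample does not by itself bound the true rate at zero. The sketch below uses the standard Wilson score interval. The sample count of 53 is hypothetical, since this page does not state how many samples were attacked per family.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: 0 evasions observed in 53 attacked samples.
low, high = wilson_interval(0, 53)
print(f"95% interval for the evasion rate: {low:.4f} to {high:.4f}")
```

Under these assumed counts the upper bound is several percentage points above zero, which is why reporting sample sizes alongside the 0-1.89% range would help readers judge the claim.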

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below and have revised the manuscript to provide the requested clarifications.

Referee: [Experimental Evaluation] Experimental Evaluation section: the central claim of consistent near-total immunity rests on results from only three malware families in the MAB-malware framework. The manuscript must clarify selection criteria for these families and report whether they were chosen a priori or post-hoc, as the latter would undermine the cross-family generalization asserted in the abstract.

Authors: We appreciate the referee highlighting this point. The families Mokes, Strab, and DCRat were selected a priori to capture behavioral diversity in the MAB-malware simulation (differing evasion tactics, payload structures, and prevalence). Selection occurred before experiments began and was not influenced by results. We have added a dedicated paragraph in the Experimental Evaluation section explaining the criteria, including why these families support cross-family generalization claims. This revision directly addresses the concern.
revision: yes

Referee: [Methods] Methods section (bilevel formulation): the paper does not provide sufficient detail on the concrete algorithm used to solve the bilevel problem (e.g., alternating optimization steps, handling of non-differentiable components, or integration with the reinforcement-learning attacker). Without this, it is unclear whether the reported gains derive from the bilevel structure itself or from implementation-specific choices that may not generalize.

Authors: We agree that the original Methods section lacked sufficient algorithmic detail. In the revision we have expanded this section to describe the concrete solver: an alternating optimization procedure in which the defender (upper level) iteratively optimizes the detection model parameters while anticipating the attacker's best response, and the attacker (lower level) employs reinforcement learning to maximize evasion success. Non-differentiable components are handled via a hybrid scheme that uses gradient-based updates where possible and surrogate or gradient-free methods (including RL policy gradients) otherwise. Integration with the RL attacker occurs through repeated co-evolution rounds, with the defender's objective explicitly incorporating the attacker's learned policy. Pseudocode for the full procedure has been added. These changes make clear that the reported improvements arise from the bilevel co-evolutionary structure rather than implementation artifacts.
revision: yes
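
The alternating procedure described in this response can be sketched as a toy loop. This is not the authors' solver, whose pseudocode is not reproduced on this page. It makes several loudly stated assumptions: a two-feature logistic detector trained by gradient descent, a query-limited random-search attacker standing in for the reinforcement-learning agent, additive-only feature edits standing in for functionality-preserving modifications, and synthetic data. All function names are hypothetical.

```python
import math
import random


def predict(w, b, x):
    """Probability that x is malicious under a logistic detector."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train(data, dim, epochs=100, lr=0.5):
    """Defender step: fit the detector by batch gradient descent on log-loss."""
    w, b = [0.0] * dim, 0.0
    n = len(data)
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in data:
            err = predict(w, b, x) - y
            gb += err
            for i in range(dim):
                gw[i] += err * x[i]
        w = [wi - lr * gi / n for wi, gi in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def attack(w, b, x, rng, budget=50, step=0.5):
    """Attacker step: random search with additive-only edits (not RL).

    Returns (adversarial_sample_or_None, queries_used).
    """
    cur, cur_score = list(x), predict(w, b, x)  # starting score, not counted as a query
    for q in range(1, budget + 1):
        cand = list(cur)
        cand[rng.randrange(len(cand))] += step
        score = predict(w, b, cand)  # one query to the detector
        if score < 0.5:
            return cand, q
        if score < cur_score:  # keep edits that lower the malicious score
            cur, cur_score = cand, score
    return None, budget


def co_evolve(benign, malware, rounds=5, seed=0):
    """Alternate defender training and attacker search; return per-round evasion rates."""
    rng = random.Random(seed)
    dim = len(malware[0])
    data = [(x, 0) for x in benign] + [(x, 1) for x in malware]
    history = []
    for _ in range(rounds):
        w, b = train(data, dim)
        results = [attack(w, b, x, rng) for x in malware]
        evasions = [adv for adv, _ in results if adv is not None]
        history.append(len(evasions) / len(malware))
        data += [(adv, 1) for adv in evasions]  # feed evasions back as malware
    return history


# Synthetic data: feature 0 is high for malware, feature 1 is high for benign.
_rng = random.Random(1)
benign = [[_rng.gauss(0, 0.3), _rng.gauss(2, 0.3)] for _ in range(30)]
malware = [[_rng.gauss(2, 0.3), _rng.gauss(0, 0.3)] for _ in range(30)]
history = co_evolve(benign, malware, rounds=4)
print(history)
```

Each entry of `history` is the evasion rate measured against the detector of that round, before it is retrained on the evasions found. Stopping after the first retraining corresponds to one-shot adversarial retraining, so the same loop could serve as the ablation the referee requests.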

Circularity Check

0 steps flagged

No significant circularity detected

The paper proposes a bilevel optimization framework to model defender-attacker co-evolution in malware detection and evaluates it empirically via MAB-malware simulations on three specific families (Mokes, Strab, DCRat). Reported outcomes—evasion rates reduced to 0-1.89% and query complexity increased by up to two orders of magnitude—are direct experimental measurements against external baselines, with no derivation chain, parameter fitting, or self-citation that reduces the central claims to inputs by construction. The approach is self-contained and relies on independent simulation results rather than tautological redefinitions or renamed known patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the approach relies on standard bilevel optimization from the literature and the external MAB-malware framework.

pith-pipeline@v0.9.0 · 5505 in / 1107 out tokens · 40355 ms · 2026-05-08T11:21:56.018545+00:00 · methodology
