A Large Language Model Approach to Generating Bypass Rules for Malware Evasion in Analysis Sandbox

Aisha Ali-Gombe; Justin Woodring; Lamine Noureddine; Mst Eshita Khatun; Sideeq Bello; Zhiyong Sui

arxiv: 2605.21821 · v1 · pith:QUUVRVNUnew · submitted 2026-05-20 · 💻 cs.CR

Zhiyong Sui, Lamine Noureddine, Mst Eshita Khatun, Sideeq Bello, Justin Woodring, Aisha Ali-Gombe

Pith reviewed 2026-05-22 08:23 UTC · model grok-4.3

classification 💻 cs.CR

keywords malware analysis · sandbox evasion · large language models · YARA rules · bypass rules · automated analysis · evasion detection

The pith

Large language models can generate YARA rules that bypass malware evasion checks in sandboxes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that large language models can automatically produce YARA rules capable of bypassing the environment detection mechanisms that malware uses to evade sandbox analysis. A sympathetic reader would care because it offers a way to handle the rapid evolution of evasion techniques without relying on labor-intensive manual reverse engineering for every new method. The approach feeds execution traces from prematurely terminated malware samples into the models, which apply multiple reasoning strategies to craft targeted bypass rules; the rules are then refined through sanitization and iteration. If successful, this enables the identification of additional malware families and the observation of behaviors that standard platforms miss.

Core claim

The authors claim that their ABLE system leverages large language models to analyze malware execution traces and generate bypass YARA rules. Evaluated on 334 real-world samples with four open-weight LLMs, it achieves a 79% bypass success rate, with iterative refinement contributing 29.5% of successful cases. Compared to existing platforms, it identifies 47% more malware family classifications and uncovers previously hidden behaviors.
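To give the percentages a sense of scale, the short calculation below converts them into approximate sample counts. It is only a back-of-the-envelope sketch: it assumes the 79% rate is taken over all 334 samples and the 29.5% share is taken over the successful cases, which this summary does not confirm, so the resulting counts are illustrative rather than figures reported by the paper.

```python
# Rough conversion of the reported percentages into sample counts.
# Assumption (not confirmed by the source): 79% applies to all 334 samples,
# and 29.5% applies to the subset of successful bypasses.
TOTAL_SAMPLES = 334
BYPASS_RATE = 0.79
REFINEMENT_SHARE = 0.295

successes = round(TOTAL_SAMPLES * BYPASS_RATE)        # 263.86 rounds to 264
via_refinement = round(successes * REFINEMENT_SHARE)  # 77.88 rounds to 78
first_attempt = successes - via_refinement            # successes without refinement
failures = TOTAL_SAMPLES - successes                  # samples still not bypassed

print(successes, via_refinement, first_attempt, failures)
```

Under these assumptions, roughly 78 of about 264 successful bypasses would depend on the refinement loop, and about 70 samples would remain unbypassed.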

What carries the argument

The ABLE pipeline, which combines LLM reasoning on execution traces with an auto-sanitization pipeline and feedback-driven iterative refinement to produce functional bypass rules.
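The control flow described above (generate, sanitize, test, feed the failure back) can be sketched as a small loop. This is a minimal illustration, not the authors' implementation: the function names `refine`, `fake_generate`, `fake_sanitize`, and `fake_sandbox`, the `Attempt` record, the trace text, and the iteration limit are all hypothetical, and the stand-in functions only mimic an LLM and a sandbox so the loop can run on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One pass through generate -> sanitize -> sandbox test."""
    rule: Optional[str]  # sanitized rule, or None if sanitization failed
    bypassed: bool
    feedback: str        # text handed to the next generation call


def refine(
    trace: str,
    generate: Callable[[str, str], str],
    sanitize: Callable[[str], Optional[str]],
    run_in_sandbox: Callable[[str], Tuple[bool, str]],
    max_iters: int = 3,  # hypothetical limit, not taken from the paper
) -> List[Attempt]:
    history: List[Attempt] = []
    feedback = ""
    for _ in range(max_iters):
        raw = generate(trace, feedback)
        rule = sanitize(raw)
        if rule is None:
            # Sanitization failed: report that and ask for a new rule.
            feedback = "previous output was not a well-formed rule"
            history.append(Attempt(None, False, feedback))
            continue
        bypassed, feedback = run_in_sandbox(rule)
        history.append(Attempt(rule, bypassed, feedback))
        if bypassed:
            break
    return history


# Toy stand-ins so the loop can be exercised without an LLM or a sandbox.
def fake_generate(trace: str, feedback: str) -> str:
    return "rule v2 { condition: true }" if feedback else "rule v1 { condition: true }"


def fake_sanitize(raw: str) -> Optional[str]:
    return raw if raw.startswith("rule ") else None


def fake_sandbox(rule: str) -> Tuple[bool, str]:
    ok = "v2" in rule
    return ok, ("" if ok else "sample still exited before payload")


history = refine("hypothetical trace text", fake_generate, fake_sanitize, fake_sandbox)
print([(a.rule, a.bypassed) for a in history])
```

In this toy run the first rule fails in the stand-in sandbox, the failure message is passed back to the generator, and the second rule succeeds, which is the kind of case the paper attributes to iterative refinement.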

If this is right

Sandboxes can process more malware samples effectively without custom manual rules for each evasion technique.
Analysts gain visibility into malware families that were previously misclassified or undetected.
Hidden malicious behaviors become observable in sandbox reports for a larger portion of samples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be adapted to generate rules for other types of security analysis tools that face similar evasion issues.
Testing across additional model types might reveal whether success rates hold beyond the open-weight models evaluated here.
Combining this generation process with existing signature databases could reduce the need for repeated refinements over time.

Load-bearing premise

The generated rules from the language models will consistently and reliably bypass the specific evasion checks in actual sandbox environments without introducing errors or missing critical evasion mechanisms.

What would settle it

Observing whether applying the output YARA rules to the malware samples in a real analysis sandbox results in the malicious payloads executing as expected rather than being suppressed by the evasion detection.

Figures

Figures reproduced from arXiv: 2605.21821 by Aisha Ali-Gombe, Justin Woodring, Lamine Noureddine, Mst Eshita Khatun, Sideeq Bello, Zhiyong Sui.

**Figure 1.** Static analysis of StealC malware. Adjacent paper text (Section 2.3, Motivating Example: StealC Malware): "We illustrate the evasion bypass challenge using StealC, an information stealer malware distributed via a decentralized MaaS model [18] …"

**Figure 2.** Dynamic analysis of StealC. The clean trace (c’) shows the call to …

**Figure 3.** Overview of ABLE. Adjacent paper text: "… the engine analyzes the failure and uses this feedback to guide new LLM-based rules in subsequent iterations." Section 3.1, LLM-Guided YARA Rule Generation: "ABLE uses LLMs to analyze execution traces and generate YARA rules with bypass actions through structured prompting. Recent studies have shown that LLMs trained on large corpora containing code and decompilation knowledge excel in code analysi…"

**Figure 4.** Prompt template structure. Components 1, 2, 4 are …

**Figure 5.** YARA rule auto-sanitization with self-correction.

**Figure 6.** This figure illustrates an example workflow of ABLE on a StealC malware sample.

Original abstract

Sandbox evasion remains a critical challenge for automated malware analysis, as modern malware employs environment checks to detect analysis platforms and suppress malicious behavior. Existing approaches rely on manually crafted bypass rules that require deep reverse engineering of each evasion mechanism, an approach that cannot scale against rapidly evolving evasion techniques. In this paper, we leverage large language models (LLMs) to automatically generate YARA rules that bypass evasion checks in sandbox environments. We propose ABLE, which analyzes execution traces from malware terminated due to potentially evasive behavior and employs multiple reasoning strategies to generate targeted bypass rules. To address syntactic errors and improve the efficacy of the bypass rules in the LLM outputs, we introduce an auto-sanitization pipeline and feedback-driven iterative refinement. We evaluate ABLE on 334 real-world malware samples across four open-weight LLMs. ABLE achieves a 79% bypass success rate, with iterative refinement contributing 29.5% of successful cases. Compared to existing analysis platforms, ABLE identifies 47% more malware family classifications and exposes previously hidden behaviors.
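The auto-sanitization step mentioned in the abstract is not specified on this page, so the following is only a simplified sketch of what such a step might do: pull a rule out of chatty LLM output and reject text that is structurally incomplete. The function name `sanitize_rule`, the sample rule, and the checks themselves are assumptions for illustration. The checks are plain string and regex tests, not a YARA compiler, and brace counting would be fooled by braces inside quoted strings.

```python
import re
from typing import Optional

# Matches a markdown code fence (three backticks, optional language tag).
FENCE = re.compile(r"`{3}[a-zA-Z]*\n(.*?)`{3}", re.DOTALL)
# Matches the start of a YARA rule: the keyword, an identifier, an opening brace.
RULE_HEADER = re.compile(r"\brule\s+[A-Za-z_][A-Za-z0-9_]*\s*\{")


def sanitize_rule(llm_output: str) -> Optional[str]:
    """Return the first structurally plausible rule, or None (hypothetical helper)."""
    fenced = FENCE.search(llm_output)
    text = (fenced.group(1) if fenced else llm_output).strip()

    header = RULE_HEADER.search(text)
    if header is None:
        return None
    text = text[header.start():]

    # Walk the text to find the brace that closes the rule body.
    # YARA hex strings also use braces, but they are balanced, so counting works.
    depth = 0
    for i, ch in enumerate(text):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                candidate = text[: i + 1]
                break
    else:
        return None  # rule body never closed

    # Every YARA rule needs a condition section.
    return candidate if "condition:" in candidate else None


fence = "`" * 3
raw = (
    "Here is the rule:\n" + fence + "yara\n"
    "rule Bypass_Check {\n"
    "  strings:\n"
    "    $a = { 0F A2 }  // x86 CPUID opcode, used here only as an example pattern\n"
    "  condition:\n"
    "    $a\n"
    "}\n" + fence + "\nHope this helps."
)
print(sanitize_rule(raw))
```

A production pipeline would more plausibly hand the candidate to an actual YARA compiler and use the compiler's error message as the self-correction signal; the structural checks here stand in for that step.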

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper builds an LLM pipeline to turn malware traces into YARA rules for sandbox bypass and reports 79% success on 334 samples, but the concrete mechanism linking static rules to runtime evasion defeat still needs clearer validation.

The main takeaway is that ABLE feeds execution traces from early-terminated malware into LLMs, uses several reasoning strategies to produce YARA rules, then applies auto-sanitization and iterative feedback to improve them. On 334 real samples it reaches 79% bypass success, with the refinement step accounting for roughly 30% of the wins, and it surfaces 47% more family labels than standard platforms while revealing behaviors that were previously hidden.

Referee Report

2 major / 2 minor

Summary. The paper proposes ABLE, an LLM-based system that analyzes malware execution traces to automatically generate YARA rules for bypassing sandbox evasion checks. It introduces auto-sanitization and iterative refinement to improve rule quality, and evaluates the approach on 334 real-world malware samples using four open-weight LLMs, reporting a 79% bypass success rate (with 29.5% of successes attributed to refinement) along with 47% more family classifications and exposure of hidden behaviors compared to existing platforms.

Significance. If the empirical results hold under rigorous validation, the work could offer a scalable alternative to manual reverse engineering for generating evasion bypasses, potentially improving automated malware analysis throughput and behavioral visibility. The combination of LLMs with feedback-driven refinement represents a practical application of generative models to a domain traditionally reliant on expert-crafted rules.

major comments (2)

[Evaluation] Evaluation section: The headline 79% bypass success rate (and the 29.5% contribution from iterative refinement) is defined as the malware samples exhibiting malicious behavior rather than terminating early after rule application. However, the manuscript does not specify the sandbox platform used, the precise integration mechanism for applying the generated YARA rules (a static pattern-matching tool) to neutralize runtime environment checks performed by the malware, or the concrete success criteria (e.g., increased API call volume, dropped files, or C2 traffic). This leaves open whether the metric reflects genuine functional bypass or measurement artifacts.
[Methods] Methods and § on rule generation: The claim that ABLE produces rules that 'reliably bypass evasion checks' rests on the assumption that LLM-generated YARA patterns, after sanitization, correctly target and disable the specific environment-detection logic in the samples. No error analysis or case studies are provided showing that the rules address the actual evasion mechanisms (e.g., timing checks, hardware artifacts) rather than producing overly permissive or irrelevant patterns.
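The first major comment asks for concrete success criteria. One way such criteria could be made explicit is sketched below, using the indicators the referee names (API call volume, dropped files, C2 traffic). Everything here is hypothetical: the `RunReport` fields, the `is_functional_bypass` function, and the threshold of a twofold increase in API calls are illustrative choices, not the paper's definition of a successful bypass.

```python
from dataclasses import dataclass


@dataclass
class RunReport:
    """Summary of one sandbox run (hypothetical fields)."""
    api_calls: int
    dropped_files: int
    c2_connections: int
    terminated_early: bool


def is_functional_bypass(
    baseline: RunReport,
    patched: RunReport,
    min_api_growth: float = 2.0,  # illustrative threshold, not from the paper
) -> bool:
    """Compare a run without the rule (baseline) to a run with it (patched)."""
    if patched.terminated_early:
        return False
    # Require clearly more activity than the evasive baseline run...
    more_api = patched.api_calls >= min_api_growth * max(baseline.api_calls, 1)
    # ...plus at least one new payload-level indicator.
    new_files = patched.dropped_files > baseline.dropped_files
    new_c2 = patched.c2_connections > baseline.c2_connections
    return more_api and (new_files or new_c2)


baseline = RunReport(api_calls=40, dropped_files=0, c2_connections=0, terminated_early=True)
patched = RunReport(api_calls=400, dropped_files=2, c2_connections=1, terminated_early=False)
print(is_functional_bypass(baseline, patched))
```

Requiring a payload-level indicator in addition to raw API volume is one way to guard against the measurement artifacts the referee worries about, since a rule that merely keeps the process alive would not by itself count as a bypass.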

minor comments (2)

[Abstract] Abstract: The phrase 'exposes previously hidden behaviors' is not quantified; a concrete metric (e.g., additional API calls or network connections observed) would strengthen the comparison to existing platforms.
[Evaluation] Sample selection: The criteria for choosing the 334 malware samples and the distribution across families or evasion techniques are not detailed, making it difficult to assess generalizability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. The comments highlight important aspects of clarity in evaluation and validation of rule quality. We respond to each major comment below and will incorporate revisions to address them in the next version of the paper.

Referee: [Evaluation] Evaluation section: The headline 79% bypass success rate (and the 29.5% contribution from iterative refinement) is defined as the malware samples exhibiting malicious behavior rather than terminating early after rule application. However, the manuscript does not specify the sandbox platform used, the precise integration mechanism for applying the generated YARA rules (a static pattern-matching tool) to neutralize runtime environment checks performed by the malware, or the concrete success criteria (e.g., increased API call volume, dropped files, or C2 traffic). This leaves open whether the metric reflects genuine functional bypass or measurement artifacts.

Authors: We agree that greater specificity is required to substantiate the evaluation metric. In the revised manuscript, we will explicitly describe the sandbox platform used for trace collection and rule testing, detail the integration process by which the generated YARA rules are applied within the sandbox to intercept environment checks, and clarify the success criteria, including observable indicators such as increased API call volume, dropped files, and C2 traffic. These additions will confirm that the 79% rate measures functional bypass rather than artifacts.
Revision: yes
Referee: [Methods] Methods and § on rule generation: The claim that ABLE produces rules that 'reliably bypass evasion checks' rests on the assumption that LLM-generated YARA patterns, after sanitization, correctly target and disable the specific environment-detection logic in the samples. No error analysis or case studies are provided showing that the rules address the actual evasion mechanisms (e.g., timing checks, hardware artifacts) rather than producing overly permissive or irrelevant patterns.

Authors: We acknowledge the value of qualitative validation alongside quantitative results. While the success rate across 334 samples supports the overall approach, we agree that error analysis and case studies would strengthen claims about rule targeting. In the revision, we will add a dedicated subsection with error analysis and representative case studies. These will show how specific sanitized and refined YARA rules address evasion mechanisms such as timing checks or hardware artifacts, including before-and-after behavioral comparisons.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical success rates measured on external malware samples

The paper presents an empirical pipeline: execution traces from real malware samples are fed to LLMs to produce YARA rules, followed by sanitization and iterative refinement, then success is measured by whether the rules allow the 334 samples to exhibit malicious behavior in a sandbox. No equations, fitted parameters, or self-referential definitions appear in the abstract or described method. The 79% bypass rate and 29.5% refinement contribution are reported as direct experimental outcomes on external samples rather than quantities derived from the method itself by construction. No self-citation load-bearing steps or uniqueness theorems are invoked to justify core claims. The derivation chain remains independent of its measured outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The approach depends on the premise that LLMs can produce usable security rules from traces and that the sanitization/refinement loop corrects errors without systematic bias.

axioms (1)

Domain assumption: LLM-generated YARA rules can be made syntactically valid and functionally effective via post-processing and iteration.
Invoked to justify the auto-sanitization pipeline and feedback loop as sufficient for practical use.

invented entities (1)

ABLE framework (no independent evidence)
Purpose: End-to-end system for trace-to-bypass-rule generation using LLMs.
New named system introduced to organize the described components and evaluation.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear


We propose ABLE, which analyzes execution traces from malware terminated due to potentially evasive behavior and employs multiple reasoning strategies to generate targeted bypass rules.
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear


ABLE achieves a 79% bypass success rate, with iterative refinement contributing 29.5% of successful cases.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

107 extracted references · 107 canonical work pages · 6 internal anchors

[1] ANY.RUN: Interactive online malware sandbox. https://any.run

[2] Cloud-based interactive malware analysis platform.

[3] abuse.ch. MalwareBazaar malware sample. https://bazaar.abuse.ch/, 2023. SHA256: caf00150589120b59ea0145206e2aacad383d3cc18431674fd58cc84f49b0e25

[4] abuse.ch. abuse.ch: Fighting malware and botnets. https://abuse.ch/

[5] Non-profit threat intelligence organization.

[6] Amir Afianian, Salman Niksefat, Babak Sadeghiyan, and David Baptiste. Malware dynamic analysis evasion techniques: A survey. ACM Computing Surveys (CSUR), 52(6):1–28, 2019.

[7] Hojjat Aghakhani, Fabio Gritti, Francesco Mecca, Martina Lindorfer, Stefano Ortolani, Davide Balzarotti, Giovanni Vigna, and Christopher Kruegel. When malware is packin' heat; limits of machine learning classifiers based on static analysis features. In Network and Distributed System Security Symposium. Internet Society, 2020.

[8] Jamal Al-Karaki, Muhammad Al-Zafar Khan, and Marwan Omar. Exploring llms for malware detection: Review, framework design, and countermeasure approaches. arXiv preprint arXiv:2409.07587, 2024.

[9] Aisha Ali-Gombe, Irfan Ahmed, Golden G Richard III, and Vassil Roussev. Opseq: Android malware fingerprinting. In Proceedings of the 5th Program Protection and Reverse Engineering Workshop, pages 1–12, 2015.

[10] Aisha Ali-Gombe, Irfan Ahmed, Golden G Richard III, and Vassil Roussev. Aspectdroid: Android app analysis system. In Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, pages 145–147, 2016.
[11] Aisha I Ali-Gombe, Brendan Saltaformaggio, Dongyan Xu, Golden G Richard III, et al. Toward a more dependable hybrid analysis of android malware using aspect-oriented programming. Computers & Security, 73:235–248, 2018.

[12] Ammar Almomani, Samer Aoudi, Ahmad Al-Qerem, Amjad Aldweesh, and Mouhammd Alkasassbeh. Behavioral analysis of ai-generated malware: New frontiers in threat detection. In Examining Cybersecurity Risks Produced by Generative AI, pages 211–234. IGI Global Scientific Publishing, 2025.

[13] Hyrum S Anderson, Anant Kharkar, Bobby Filar, and Phil Roth. Evading machine learning malware detection. Black Hat, 2017:1–6, 2017.

[14] Anthropic. Claude opus 4.1. https://www.anthropic.com/news/claude-opus-4-1, 2025.

[15] Ankit Anubhav and Dileep Jallepalli. Hancitor (aka chanitor) observed using multiple attack approaches. Mandiant, Google Cloud, 2016. URL: https://cloud.google.com/blog/topics/threat-intelligence/hancitor-aka-chanit/

[16] ANY.RUN. FormBook malware trend analysis. https://any.run/malware-trends/formbook, 2023.

[17] ANY.RUN. Malware analysis report: e536afc7f63611d1bbea4305f958661e.exe (MD5: E536afc7f63611d1bbea4305f958661e), 2023. URL: https://app.any.run/tasks/de97abb5-3aaf-40cc-b4d5-2d4a78997f09/

[18] Khaled Bakour, H Murat Ünver, and Razan Ghanem. The android malware static analysis: techniques, limitations, and open challenges. In 2018 3rd International Conference on Computer Science and Engineering (UBMK), pages 586–593. IEEE, 2018.

[19] Charles R Barone IV, Robert Serafin, Ilya Shavrov, Ibrahim Baggili, Aisha Ali-Gombe, Golden G Richard III, and Andrew Case. A reverse engineering education needs analysis survey. arXiv preprint arXiv:2212.07531, 2022.

[20] Pierre Le Bourhis, Quentin Bourgue, and Sekoia TDR. Stealc: A copycat of vidar and raccoon infostealers gaining in popularity – part 2. Sekoia.io Blog, 2023. URL: https://blog.sekoia.io/stealc-a-copycat-of-vidar-and-raccoon-infostealers-gaining-in-popularity-part-2/
[21] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.

[22] Alexei Bulazel and Bülent Yener. A survey on automated dynamic malware analysis evasion and counter-evasion: Pc, mobile, and web. In Proceedings of the 1st Reversing and Offensive-oriented Trends Symposium, pages 1–21, 2017.

[23] Alexander Cameron, Abu Alam, Nasreen Anjum, Javed Ali Khan, and Alexios Mylonas. Statos: A portable tool for secure malware analysis and sample acquisition in low resource environments. Array, 26:100391, 2025.

[24] Daipeng Cao and W Jun. Llm-cloudsec: Large language model empowered automatic and deep vulnerability analysis for intelligent clouds. In IEEE INFOCOM 2024-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pages 1–6. IEEE, 2024.

[25] Ping Chen, Christophe Huygens, Lieven Desmet, and Wouter Joosen. Advanced or not? a comparative study of the use of anti-debugging and anti-vm techniques in generic and targeted malware. In IFIP International Conference on ICT Systems Security and Privacy Protection, pages 323–336. Springer, 2016.

[26] Yuning Cui, Yi Sun, and Zhaowen Lin. Droidhook: a novel api-hook based android malware dynamic analysis sandbox. Automated Software Engineering, 30(1):10, 2023.

[27] Francisco Handrick da Costa, Ismael Medeiros, Thales Menezes, João Victor da Silva, Ingrid Lorraine da Silva, Rodrigo Bonifácio, Krishna Narasimhan, and Márcio Ribeiro. Exploring the use of static and dynamic analysis to improve the performance of the mining sandbox approach for android malware identification. Journal of Systems and Software, 183:111092, 2022.

[28] Daniel Frank, Mary Zhao and Assaf Dahan. A bazar of tricks: Following team9’s development cycles. Cybereason Nocturnus. URL: https://www.cybereason.com/blog/research/a-bazar-of-tricks-following-team9s-development-cycles

[29] DeepSeek-AI, Daya Guo, Qihao Liu, Zhenda Fan, Borong Liang, Aixin Huang, Zhewen Ruan, Wangding Shang, Zhaowei Zhao, Wangsheng Ren, et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025. URL: https://arxiv.org/abs/2501.12948

[30] NHS England Digital. Egregor ransomware the raas successor to maze, 2021. URL: https://digital.nhs.uk/cyber-alerts/2020/cc-3681
[31] Amir Djenna, Ahmed Bouridane, Saddaf Rubab, and Ibrahim Moussa Marou. Artificial intelligence-based malware detection, analysis, and mitigation. Symmetry, 15(3):677, 2023.

[32] Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024. URL: https://arxiv.org/abs/2407.21783

[33] Manuel Egele, Theodoor Scholte, Engin Kirda, and Christopher Kruegel. A survey on automated dynamic malware-analysis techniques and tools. ACM Computing Surveys, 44(2):1–42, 2012. doi:10.1145/2089125.2089126

[34] eSentire Threat Response Unit. Malware analysis: Raccoon stealer v2.0. eSentire Threat Intelligence, September 2022. URL: https://www.esentire.com/blog/esentire-threat-intelligence-malware-analysis-raccoon-stealer-v2-0

[35] Ruirui Feng, Hui Chen, Shuo Wang, Md Monjurul Karim, and Qingshan Jiang. Llm-maldetect: A large language model-based method for android malware detection. IEEE Access, 2025.

[36] Hossein Fereidooni, Mauro Conti, Danfeng Yao, and Alessandro Sperduti. Anastasia: Android malware detection using static analysis of applications. In 2016 8th IFIP International Conference on New Technologies, Mobility and Security (NTMS), pages 1–5. IEEE, 2016.

[37] Matthew Gaber, Mohiuddin Ahmed, and Helge Janicke. Defeating evasive malware with peekaboo: Extracting authentic malware behavior with dynamic binary instrumentation. Journal of Information Security and Applications, 95:104290, 2025.

[38] Nicola Galloro, Mario Polino, Michele Carminati, Andrea Continella, and Stefano Zanero. A systematical and longitudinal study of evasive behaviors in windows malware. Computers & Security, 113:102550, 2022.

[39] Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, et al. Gemma 2: Improving open language models at a practical size. arXiv preprint arXiv:2408.00118, 2024. URL: https://arxiv.org/abs/2408.00118

[40] Floris Gorter, Cristiano Giuffrida, and Erik Van Der Kouwe. Enviral: Fuzzing the environment for evasive malware analysis. In Proceedings of the 16th European Workshop on System Security, pages 8–14, 2023.
[41] Claudio Guarnieri, Alessio Tanasi, Jurriaan Bremer, and Mark Schloesser. Cuckoo sandbox: open source automated malware analysis. Black Hat USA, 2013. URL: https://media.blackhat.com/us-13/US-13-Bremer-Mo-Malware-Mo-Problems-Cuckoo-Sandbox-WP.pdf

[42] Hatching. Triage: Automated malware analysis sandbox. https://tria.ge/, 2024. Cloud-based malware analysis platform.

[43] Yiling He, Hongyu She, Xingzhi Qian, Xinran Zheng, Zhuo Chen, Zhan Qin, and Lorenzo Cavallaro. On benchmarking code llms for android malware analysis. In Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 153–160, 2025.

[44] Adamu Hussaini, Bassam Zahran, and Aisha Ali-Gombe. Object allocation pattern as an indicator for maliciousness-an exploratory analysis. In Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy, pages 313–315, 2021.

[45] Mülhem İbrahim, Bayan Issa, and Muhammed Basheer Jasser. A method for automatic android malware detection based on static analysis and deep learning. IEEE Access, 10:117334–117352, 2022.

[46] Intezer. Intezer analyze: Genetic malware analysis. https://www.intezer.com/, 2024. Code similarity and malware analysis platform.

[47] Jueun Jeon, Jong Hyuk Park, and Young-Sik Jeong. Dynamic analysis for iot malware detection with convolution neural network model. IEEE Access, 8:96899–96911, 2020.

[48] Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona Diab, and Bernhard Schölkopf. Can large language models infer causation from correlation? arXiv preprint arXiv:2306.05836, 2023.

[49] Joe Security LLC. Joe Sandbox: Deep malware analysis. https://www.joesecurity.org, 2024. Commercial malware analysis sandbox.

[50] Kaspersky Lab. From shamoon to stonedrill: Wipers attacking saudi organizations and beyond. Kaspersky, 2017. URL: https://media.kasperskycontenthub.com/wp-content/uploads/sites/43/2018/03/07180722/Report_Shamoon_StoneDrill_final.pdf
[51]

Sama: A comprehensive smart automated malware analyzer empowered by chatgpt integration

Mahmoud A Khalifa, Iman Almomani, and Walid El-Shafai. Sama: A comprehensive smart automated malware analyzer empowered by chatgpt integration. In2024 IEEE 30th International Conference on Telecommunications (ICT), pages 1–6. IEEE, 2024

work page 2024
[52]

Androbyte: Llm-driven privacy analysis through byte- code summarization and dynamic dataflow call graph generation

Mst Eshita Khatun, Lamine Noureddine, Zhiyong Sui, and Aisha Ali-Gombe. Androbyte: Llm-driven privacy analysis through byte- code summarization and dynamic dataflow call graph generation. arXiv preprint arXiv:2510.15112, 2025

work page arXiv 2025
[53]

Logs in, patches out: Automated vulnerability repair via{Tree-of- Thought}{LLM}analysis

Youngjoon Kim, Sunguk Shin, Hyoungshick Kim, and Jiwon Yoon. Logs in, patches out: Automated vulnerability repair via{Tree-of- Thought}{LLM}analysis. In34th USENIX Security Symposium (USENIX Security 25), pages 4401–4419, 2025

work page 2025
[54] Dhilung Kirat and Giovanni Vigna. Malgene: Automatic extraction of malware analysis evasion signature. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 769–780, 2015.
[55] Christopher Kruegel. Evasive malware exposed and deconstructed. In RSA Conference USA, 2015. Session CRWD-T08.
[56] Ummay Kulsum, Haotian Zhu, Bowen Xu, and Marcelo d’Amorim. A case study of LLM for automated vulnerability repair: Assessing impact of reasoning and patch validation feedback. In Proceedings of the 1st ACM International Conference on AI-Powered Software, pages 103–111, 2024.
[57] Darrin Lea, James Ghawaly, Golden Richard, Aisha Ali-Gombe, and Andrew Case. Rex86: A local large language model for assisting in x86 assembly reverse engineering. In 2025 IEEE Annual Computer Security Applications Conference (ACSAC), pages 108–122. IEEE, 2025.
[58] Ce Li, Zijun Cheng, He Zhu, Leiqi Wang, Qiujian Lv, Yan Wang, Ning Li, and Degang Sun. Dmalnet: Dynamic malware analysis based on API feature engineering and graph learning. Computers & Security, 122:102872, 2022.
[59] Hongping Li and Li Shan. LLM-based vulnerability detection. In 2023 International Conference on Human-Centered Cognitive Systems (HCCS), pages 1–4. IEEE, 2023.
[60] Fang Liu, Yang Liu, Lin Shi, Houkun Huang, Ruifeng Wang, Zhen Yang, Li Zhang, Zhongqi Li, and Yuchi Ma. Exploring and evaluating hallucinations in LLM-powered code generation. arXiv preprint arXiv:2404.00971, 2024.
[61] Wei Ma, Shangqing Liu, Zhihao Lin, Wenhan Wang, Qiang Hu, Ye Liu, Cen Zhang, Liming Nie, Li Li, and Yang Liu. LMs: Understanding code syntax and semantics for code analysis. arXiv preprint arXiv:2305.12138, 2023.
[62] Rasmi-Vlad Mahmoud, Marios Anagnostopoulos, Sergio Pastrana, and Jens Myrup Pedersen. Redefining malware sandboxing: Enhancing analysis through Sysmon and ELK integration. IEEE Access, 12:68624–68636, 2024.
[63] Mandiant. capa: The FLARE team’s open-source tool to identify capabilities in executable files. https://github.com/mandiant/capa. Open-source malware capability detection tool.
[65] Najmeh Miramirkhani, Mahathi Priya Appini, Nick Nikiforakis, and Michalis Polychronakis. Spotless sandboxes: Evading malware analysis systems using wear-and-tear artifacts. In 2017 IEEE Symposium on Security and Privacy (SP), pages 1009–1024. IEEE, 2017.
[66] Borja Molina-Coronado, Antonio Ruggia, Usue Mori, Alessio Merlo, Alexander Mendiburu, and Jose Miguel-Alonso. Light up that droid! On the effectiveness of static analysis features against app obfuscation for Android malware detection. Journal of Network and Computer Applications, 235:104094, 2025.
[67] Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, and Brad Myers. Using an LLM to help with code understanding. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pages 1–13, 2024.
[68] Mehwish Naseer, Farhan Ullah, Samia Ijaz, Hamad Naeem, Amjad Alsirhani, Ghadah Naif Alwakid, and Abdullah Alomari. Obfuscated malware detection and classification in network traffic leveraging hybrid large language models and synthetic data. Sensors (Basel, Switzerland), 25(1):202, 2025.
[69] Muzzamil Noor, Haider Abbas, and Waleed Bin Shahid. Countering cyber threats for industrial applications: An automated approach for malware evasion detection and analysis. Journal of Network and Computer Applications, 103:249–261, 2018.
[70] Lord Noteworthy. al-khaser: Public malware techniques used in the wild: Virtual machine, emulation, debuggers, sandbox detection. https://github.com/LordNoteworthy/al-khaser, 2016. Open-source evasion toolkit, accessed: 2026.
[71] Kevin O’Reilly. capemon: The monitor DLL for CAPE. https://github.com/kevoreilly/capemon, 2024. Accessed: 2026-02-05.
[72] Kevin O’Reilly and CAPE Contributors. CAPE Sandbox: Malware configuration and payload extraction. https://github.com/kevoreilly/CAPEv2, 2024. Open-source malware sandbox, accessed: 2026.
[73] Arnold Osipov. Mars Stealer: Exclusive new threat research. Morphisec Labs, March 2022. URL: https://www.morphisec.com/blog/threat-research-mars-stealer
[74] Palo Alto Networks Unit 42. Spear phishing attacks target organizations in Ukraine, payloads include the document stealer OutSteel and the downloader SaintBot. URL: https://unit42.paloaltonetworks.com/ukraine-targeted-outsteel-saintbot/
[75] Juhyun Park, Yun-Hwan Jang, Soohwa Hong, and Yongsu Park. Automatic detection and bypassing of anti-debugging techniques for Microsoft Windows environments. Advances in Electrical and Computer Engineering, 19(2):23–28, 2019.
[76] Shruti Patil, Vijayakumar Varadarajan, Devika Walimbe, Siddharth Gulechha, Sushant Shenoy, Aditya Raina, and Ketan Kotecha. Improving the robustness of AI-based malware detection using adversarial machine learning. Algorithms, 14(10):297, 2021.
[77] Constantinos Patsakis, Fran Casino, and Nikolaos Lykousas. Assessing LLMs in malicious code deobfuscation of real-world malware campaigns. Expert Systems with Applications, 256:124912, 2024.
[78] Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. Red teaming language models with language models, 2022. URL: https://arxiv.org/abs/2202.03286
[79] Nicholas Phillips and Aisha Ali-Gombe. Spvexec and spvluexec - a novel realtime defensive tool for stealthy malware infection. International Journal On Advances in Security, pages 72–85, 2023.
[80] Nicholas Phillips and A Ali Gombe. Longitudinal study of persistence vectors (PVs) in Windows malware: Evolution, complexity, and stealthiness. SECURWARE 2022, The Sixteenth International Conference on Emerging Security Information, Systems and Technologies, pages 28–34, 2022.


