A Large Language Model Approach to Generating Bypass Rules for Malware Evasion in Analysis Sandbox
Pith reviewed 2026-05-22 08:23 UTC · model grok-4.3
The pith
Large language models can generate YARA rules that bypass malware evasion checks in sandboxes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that their ABLE system uses large language models to analyze malware execution traces and generate bypass YARA rules. Evaluated on 334 samples with four open-weight models, it reportedly achieves a 79% bypass success rate, with iterative refinement accounting for 29.5% of the successful cases, and yields 47% more malware family classifications than existing platforms while uncovering previously hidden behaviors.
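To give a sense of scale, the headline percentages can be turned into approximate sample counts. This is a back-of-the-envelope sketch, not figures from the paper: it assumes the 79% rate is taken over the 334 samples (rather than over sample-model pairs) and that the 29.5% refinement share is taken over the successful cases; the variable names are illustrative.

```python
# Rough counts implied by the reported percentages (assumptions stated above).
samples = 334
success_rate = 0.79        # reported bypass success rate
refinement_share = 0.295   # share of successes credited to iterative refinement

successes = round(samples * success_rate)             # 264
via_refinement = round(successes * refinement_share)  # 78
first_pass = successes - via_refinement               # 186
failures = samples - successes                        # 70

print(f"successes: {successes}, via refinement: {via_refinement}")
print(f"first-pass successes: {first_pass} ({first_pass / samples:.1%} of all samples)")
print(f"not bypassed: {failures}")
```

Under these assumptions, roughly 56% of samples would be bypassed without refinement, which is the baseline against which the refinement step should be judged.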
What carries the argument
The ABLE pipeline, which combines LLM reasoning on execution traces with an auto-sanitization pipeline and feedback-driven iterative refinement to produce functional bypass rules.
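The control flow described here (generate, sanitize, test, feed the result back) can be sketched as a small loop. This is a minimal illustration under assumptions, not the authors' implementation: `refine_until_bypass`, the `Attempt` record, the three callables, and the `max_rounds` limit are all hypothetical names, and the LLM and sandbox are replaced by stubs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    rule: str       # rule text from this round (raw output if sanitization failed)
    bypassed: bool  # whether the sample ran past the evasion check
    feedback: str   # observation passed to the next generation round


def refine_until_bypass(
    trace: str,
    generate: Callable[[str, Optional[str]], str],   # hypothetical LLM interface
    sanitize: Callable[[str], Optional[str]],        # returns None if unusable
    evaluate: Callable[[str], Tuple[bool, str]],     # hypothetical sandbox interface
    max_rounds: int = 3,
) -> List[Attempt]:
    """Generate, sanitize, and test rules until one bypasses or rounds run out."""
    history: List[Attempt] = []
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        raw = generate(trace, feedback)
        rule = sanitize(raw)
        if rule is None:
            feedback = "rule failed sanitization"
            history.append(Attempt(rule=raw, bypassed=False, feedback=feedback))
            continue
        bypassed, feedback = evaluate(rule)
        history.append(Attempt(rule=rule, bypassed=bypassed, feedback=feedback))
        if bypassed:
            break
    return history


# Stubs standing in for the model and the sandbox: the second attempt succeeds.
def fake_generate(trace: str, feedback: Optional[str]) -> str:
    return "rule v2" if feedback else "rule v1"


def fake_evaluate(rule: str) -> Tuple[bool, str]:
    ok = rule == "rule v2"
    return ok, ("payload observed" if ok else "sample still exits early")


history = refine_until_bypass("trace", fake_generate, lambda r: r, fake_evaluate)
print([(a.rule, a.bypassed) for a in history])
```

Passing the model and sandbox in as callables keeps the loop testable without either; a success that appears only after the first round is what the page counts as a refinement-assisted case.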
If this is right
- Sandboxes can process more malware samples effectively without custom manual rules for each evasion technique.
- Analysts gain visibility into malware families that were previously misclassified or undetected.
- Hidden malicious behaviors become observable in sandbox reports for a larger portion of samples.
Where Pith is reading between the lines
- The method could be adapted to generate rules for other types of security analysis tools that face similar evasion issues.
- Testing across additional model types might reveal whether success rates hold beyond the open-weight models evaluated here.
- Combining this generation process with existing signature databases could reduce the need for repeated refinements over time.
Load-bearing premise
The generated rules from the language models will consistently and reliably bypass the specific evasion checks in actual sandbox environments without introducing errors or missing critical evasion mechanisms.
What would settle it
Applying the generated YARA rules to the malware samples in a real analysis sandbox and observing whether the malicious payloads execute as expected rather than being suppressed by the samples' own evasion checks.
Original abstract
Sandbox evasion remains a critical challenge for automated malware analysis, as modern malware employs environment checks to detect analysis platforms and suppress malicious behavior. Existing approaches rely on manually crafted bypass rules that require deep reverse engineering of each evasion mechanism, an approach that cannot scale against rapidly evolving evasion techniques. In this paper, we leverage large language models (LLMs) to automatically generate YARA rules that bypass evasion checks in sandbox environments. We propose ABLE, which analyzes execution traces from malware terminated due to potentially evasive behavior and employs multiple reasoning strategies to generate targeted bypass rules. To address syntactic errors and improve the efficacy of the bypass rules in the LLM outputs, we introduce an auto-sanitization pipeline and feedback-driven iterative refinement. We evaluate ABLE on 334 real-world malware samples across four open-weight LLMs. ABLE achieves a 79% bypass success rate, with iterative refinement contributing 29.5% of successful cases. Compared to existing analysis platforms, ABLE identifies 47% more malware family classifications and exposes previously hidden behaviors.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ABLE, an LLM-based system that analyzes malware execution traces to automatically generate YARA rules for bypassing sandbox evasion checks. It introduces auto-sanitization and iterative refinement to improve rule quality, and evaluates the approach on 334 real-world malware samples using four open-weight LLMs, reporting a 79% bypass success rate (with 29.5% of successes attributed to refinement) along with 47% more family classifications and exposure of hidden behaviors compared to existing platforms.
Significance. If the empirical results hold under rigorous validation, the work could offer a scalable alternative to manual reverse engineering for generating evasion bypasses, potentially improving automated malware analysis throughput and behavioral visibility. The combination of LLMs with feedback-driven refinement represents a practical application of generative models to a domain traditionally reliant on expert-crafted rules.
major comments (2)
- [Evaluation] Evaluation section: The headline 79% bypass success rate (and the 29.5% contribution from iterative refinement) is defined as the malware samples exhibiting malicious behavior rather than terminating early after rule application. However, the manuscript does not specify the sandbox platform used, the precise integration mechanism for applying the generated YARA rules (a static pattern-matching tool) to neutralize runtime environment checks performed by the malware, or the concrete success criteria (e.g., increased API call volume, dropped files, or C2 traffic). This leaves open whether the metric reflects genuine functional bypass or measurement artifacts.
- [Methods] Methods and § on rule generation: The claim that ABLE produces rules that 'reliably bypass evasion checks' rests on the assumption that LLM-generated YARA patterns, after sanitization, correctly target and disable the specific environment-detection logic in the samples. No error analysis or case studies are provided showing that the rules address the actual evasion mechanisms (e.g., timing checks, hardware artifacts) rather than producing overly permissive or irrelevant patterns.
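The first major comment asks for concrete success criteria. One way such a criterion could be written down is sketched here, comparing a baseline run with a run after rule application. Everything in it is an assumption for illustration: the `RunSummary` fields, the `is_functional_bypass` name, and the `min_api_growth` threshold are hypothetical and are not taken from the manuscript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSummary:
    api_calls: int        # number of API calls recorded in the report
    dropped_files: int    # number of files written by the sample
    c2_connections: int   # outbound connections judged to be C2 traffic
    exited_early: bool    # sample terminated before doing meaningful work


def is_functional_bypass(
    before: RunSummary, after: RunSummary, min_api_growth: float = 2.0
) -> bool:
    """Require more activity AND at least one concrete payload indicator."""
    if after.exited_early:
        return False
    api_grew = after.api_calls >= min_api_growth * max(before.api_calls, 1)
    new_files = after.dropped_files > before.dropped_files
    new_c2 = after.c2_connections > before.c2_connections
    # API growth alone could be a measurement artifact, so it is not sufficient.
    return api_grew and (new_files or new_c2)


baseline = RunSummary(api_calls=40, dropped_files=0, c2_connections=0, exited_early=True)
with_rule = RunSummary(api_calls=900, dropped_files=2, c2_connections=1, exited_early=False)
noisy = RunSummary(api_calls=500, dropped_files=0, c2_connections=0, exited_early=False)

print(is_functional_bypass(baseline, with_rule))  # True
print(is_functional_bypass(baseline, noisy))      # False
```

Requiring a payload indicator in addition to raw API volume is one way to separate genuine bypass from the measurement artifacts the referee is concerned about; the right indicators and thresholds would depend on the sandbox actually used.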
minor comments (2)
- [Abstract] Abstract: The phrase 'exposes previously hidden behaviors' is not quantified; a concrete metric (e.g., additional API calls or network connections observed) would strengthen the comparison to existing platforms.
- [Evaluation] Sample selection: The criteria for choosing the 334 malware samples and the distribution across families or evasion techniques are not detailed, making it difficult to assess generalizability.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback on our manuscript. The comments highlight important aspects of clarity in evaluation and validation of rule quality. We respond to each major comment below and will incorporate revisions to address them in the next version of the paper.
- Referee: [Evaluation] Evaluation section: The headline 79% bypass success rate (and the 29.5% contribution from iterative refinement) is defined as the malware samples exhibiting malicious behavior rather than terminating early after rule application. However, the manuscript does not specify the sandbox platform used, the precise integration mechanism for applying the generated YARA rules (a static pattern-matching tool) to neutralize runtime environment checks performed by the malware, or the concrete success criteria (e.g., increased API call volume, dropped files, or C2 traffic). This leaves open whether the metric reflects genuine functional bypass or measurement artifacts.
Authors: We agree that greater specificity is required to substantiate the evaluation metric. In the revised manuscript, we will explicitly describe the sandbox platform used for trace collection and rule testing, detail the integration process by which the generated YARA rules are applied within the sandbox to intercept environment checks, and clarify the success criteria, including observable indicators such as increased API call volume, dropped files, and C2 traffic. These additions will confirm that the 79% rate measures functional bypass rather than artifacts. revision: yes
- Referee: [Methods] Methods and § on rule generation: The claim that ABLE produces rules that 'reliably bypass evasion checks' rests on the assumption that LLM-generated YARA patterns, after sanitization, correctly target and disable the specific environment-detection logic in the samples. No error analysis or case studies are provided showing that the rules address the actual evasion mechanisms (e.g., timing checks, hardware artifacts) rather than producing overly permissive or irrelevant patterns.
Authors: We acknowledge the value of qualitative validation alongside quantitative results. While the success rate across 334 samples supports the overall approach, we agree that error analysis and case studies would strengthen claims about rule targeting. In the revision, we will add a dedicated subsection with error analysis and representative case studies. These will show how specific sanitized and refined YARA rules address evasion mechanisms such as timing checks or hardware artifacts, including before-and-after behavioral comparisons. revision: yes
Circularity Check
No circularity: empirical success rates measured on external malware samples
The paper presents an empirical pipeline: execution traces from real malware samples are fed to LLMs to produce YARA rules, followed by sanitization and iterative refinement, then success is measured by whether the rules allow the 334 samples to exhibit malicious behavior in a sandbox. No equations, fitted parameters, or self-referential definitions appear in the abstract or described method. The 79% bypass rate and 29.5% refinement contribution are reported as direct experimental outcomes on external samples rather than quantities derived from the method itself by construction. No self-citation load-bearing steps or uniqueness theorems are invoked to justify core claims. The derivation chain remains independent of its measured outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM-generated YARA rules can be made syntactically valid and functionally effective via post-processing and iteration
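To illustrate the kind of post-processing this assumption refers to, here is a minimal sketch of a structural clean-up step for raw model output. It is not the paper's auto-sanitization pipeline, whose details are not given on this page: `extract_rule` and its regular expressions are hypothetical, it only checks structure (a rule header, balanced braces, a `condition:` section), and it does not compile the rule with YARA.

```python
import re
from typing import Optional

TICKS = "`" * 3  # markdown code fence that models often wrap around their output
FENCE = re.compile(TICKS + r"[a-zA-Z]*\n(.*?)" + TICKS, re.DOTALL)
# Heuristic header match: "rule Name {" or "rule Name : tag1 tag2 {".
RULE_HEAD = re.compile(r"\brule\s+[A-Za-z_]\w*\s*(?::\s*[\w\s]+?)?\s*\{")


def extract_rule(llm_output: str) -> Optional[str]:
    """Return the first brace-balanced rule block that has a condition, else None."""
    fenced = FENCE.search(llm_output)
    text = fenced.group(1) if fenced else llm_output
    head = RULE_HEAD.search(text)
    if head is None:
        return None
    depth = 0
    # Start at the opening brace of the rule body and walk to its matching close.
    for i in range(head.end() - 1, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                block = text[head.start(): i + 1]
                return block if "condition:" in block else None
    return None  # braces never balanced: treat the output as unusable


raw = (
    "Here is the rule:\n" + TICKS + "yara\n"
    "rule Demo_Check {\n  strings:\n    $a = { 0F 31 }\n  condition:\n    $a\n}\n"
    + TICKS + "\nHope this helps."
)
clean = extract_rule(raw)
print(clean)
```

Brace counting copes with balanced hex strings such as `{ 0F 31 }`, but an unbalanced brace inside a quoted text string would defeat it; a production sanitizer would more plausibly hand the candidate to a real YARA compiler and treat compile errors as feedback.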
invented entities (1)
- ABLE framework: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We propose ABLE, which analyzes execution traces from malware terminated due to potentially evasive behavior and employs multiple reasoning strategies to generate targeted bypass rules."
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "ABLE achieves a 79% bypass success rate, with iterative refinement contributing 29.5% of successful cases."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.