RansomTrack: A Hybrid Behavioral Analysis Framework for Ransomware Detection
Pith reviewed 2026-05-10 17:00 UTC · model grok-4.3
The pith
A hybrid static-dynamic analysis framework detects ransomware with up to 96% accuracy in under 9.2 seconds.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RansomTrack is a hybrid behavioral analysis framework. It extracts static features using Radare2 and captures dynamic behaviors such as memory protection changes, mutex creation, registry access, and network activity using the Frida toolkit. On a publicly released dataset covering 165 ransomware and benign software families, ensemble classifiers including XGBoost and Soft Voting achieve up to 96% accuracy and a ROC-AUC of 0.99. Analysis takes about 9.1 seconds per sample, including modular behavioral logging and SHAP-based feature-importance explanations, and the framework reports detection in under 9.2 seconds.
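To make the two headline metrics concrete, here is a minimal sketch of unweighted soft voting (averaging each model's predicted ransomware probability) and of ROC-AUC computed as a pairwise ranking statistic. The toy scores, the names `model_a` and `model_b`, and the 0.5 decision threshold are illustrative assumptions; the paper does not state its voting weights or threshold, and this is not the authors' implementation.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-sample P(ransomware) across models (unweighted soft voting)."""
    n_samples = len(prob_sets[0])
    if any(len(p) != n_samples for p in prob_sets):
        raise ValueError("every model must score the same samples")
    return [sum(p[i] for p in prob_sets) / len(prob_sets) for i in range(n_samples)]


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = ransomware, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.6, 0.4, 0.3, 0.2, 0.5]  # stand-in for one classifier's probabilities
model_b = [0.8, 0.7, 0.7, 0.1, 0.4, 0.3]  # stand-in for a second classifier
ensemble = soft_vote([model_a, model_b])

print("model_a accuracy:", accuracy(labels, model_a))
print("ensemble accuracy:", accuracy(labels, ensemble))
print("ensemble ROC-AUC:", roc_auc(labels, ensemble))
```

Accuracy depends on the chosen threshold while ROC-AUC depends only on the ranking of scores, which is why the two figures can diverge (96% versus 0.99 in the reported results).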
What carries the argument
The RansomTrack hybrid framework integrating Radare2 static feature extraction and Frida dynamic runtime instrumentation for machine learning-based classification of ransomware.
Load-bearing premise
The hybrid features from Radare2 and Frida, when used to train models on the 165-family dataset, will generalize effectively to new ransomware samples and real-world environments.
What would settle it
A test showing substantially lower accuracy or longer detection times when applying the models to ransomware samples from families not represented in the training dataset.
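One way such a test could be scored is sketched below: predictions are bucketed by whether the sample's family was present in training, and accuracy is reported for each bucket. The record layout, the family names, and the function name `accuracy_by_exposure` are hypothetical; nothing here comes from the paper's released code.

```python
from typing import Dict, Iterable, Set, Tuple

# (family, true_label, predicted_label), with 1 = ransomware and 0 = benign.
Record = Tuple[str, int, int]


def accuracy_by_exposure(records: Iterable[Record], train_families: Set[str]) -> Dict[str, float]:
    """Accuracy on families seen during training versus families never seen."""
    hits = {"seen": 0, "unseen": 0}
    totals = {"seen": 0, "unseen": 0}
    for family, truth, pred in records:
        key = "seen" if family in train_families else "unseen"
        totals[key] += 1
        hits[key] += int(truth == pred)
    # Skip a bucket entirely if it received no samples.
    return {k: hits[k] / totals[k] for k in totals if totals[k] > 0}


# Hypothetical predictions on four families, two of which were in training.
records = [
    ("fam_a", 1, 1), ("fam_a", 1, 1),
    ("fam_b", 0, 0), ("fam_b", 0, 0),
    ("fam_x", 1, 0), ("fam_x", 1, 1),
    ("fam_y", 0, 0), ("fam_y", 0, 1),
]
result = accuracy_by_exposure(records, train_families={"fam_a", "fam_b"})
gap = result["seen"] - result["unseen"]
print(result, "gap:", gap)
```

A large positive gap on real data would be the kind of evidence described above; a gap near zero would support the generalization premise.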
Original abstract
Ransomware poses a serious and fast-acting threat to critical systems, often encrypting files within seconds of execution. Research indicates that ransomware is the most reported cybercrime in terms of financial damage, highlighting the urgent need for early-stage detection before encryption is complete. In this paper, we present RansomTrack, a hybrid behavioral analysis framework to eliminate the limitations of using static and dynamic detection methods separately. Static features are extracted using the Radare2 sandbox, while dynamic behaviors such as memory protection changes, mutex creation, registry access and network activity are obtained using the Frida toolkit. Our dataset of 165 different ransomware and benign software families is publicly released, offering the highest family-to-sample ratio known in the literature. Experimental evaluation using machine learning models shows that ensemble classifiers such as XGBoost and Soft Voting achieve up to 96% accuracy and a ROC-AUC score of 0.99. Each sample analyzed in 9.1 seconds includes modular behavioral logging, runtime instrumentation, and SHAP-based interpretability to highlight the most influential features. Additionally, RansomTrack framework is able to detect ransomware under 9.2 seconds. Overall, RansomTrack offers a scalable, low-latency, and explainable solution for real-time ransomware detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents RansomTrack, a hybrid behavioral analysis framework for ransomware detection that combines static features extracted via Radare2 with dynamic behaviors (memory protection changes, mutex creation, registry access, network activity) captured using Frida. It releases a public dataset of 165 ransomware and benign software families and reports that ensemble classifiers such as XGBoost and Soft Voting achieve up to 96% accuracy and 0.99 ROC-AUC, with each sample analyzed in 9.1 seconds and ransomware detected in under 9.2 seconds. SHAP-based interpretability is included to highlight influential features.
Significance. If the performance holds under proper validation, the hybrid approach and public dataset release would provide a practical, low-latency, explainable system for early ransomware detection that addresses gaps in purely static or dynamic methods. The high family-to-sample ratio in the released dataset is a concrete strength that could support reproducible follow-on work.
major comments (2)
- [Experimental evaluation] Experimental evaluation (abstract and results section): The claims of 96% accuracy and 0.99 AUC provide no details on train-test split strategy, cross-validation method, or family-disjoint partitioning. Without family-disjoint or temporal splits, the metrics may reflect intra-family leakage from static strings or mutex names rather than generalizable hybrid behavioral discrimination, directly undermining the generalization and early-detection claims.
- [Methods] Feature combination (methods section): No discussion of correlation handling, multicollinearity checks, or leakage between the Radare2 static feature set and Frida dynamic behaviors is provided. If static and dynamic features are naively concatenated without mitigation, the reported ensemble performance could be inflated by redundant signals.
minor comments (2)
- [Abstract] The abstract asserts the dataset has 'the highest family-to-sample ratio known in the literature' without a supporting comparison table or citations to prior ransomware datasets.
- [Dataset description] Clarify the exact breakdown of the 165 families (ransomware vs. benign counts and samples per family) to support reproducibility claims.
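The breakdown requested in the last minor comment could be tabulated with a few lines of code. The sketch below assumes each sample is recorded as a (family, class) pair and reads "family-to-sample ratio" as the number of distinct families divided by the number of samples; that reading, the class labels, and the toy family names are assumptions, since the paper's exact definition is not given here.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def dataset_breakdown(samples: Iterable[Tuple[str, str]]) -> dict:
    """Summarize a dataset given (family, class_label) pairs."""
    samples = list(samples)
    if not samples:
        raise ValueError("dataset is empty")
    per_family = Counter(family for family, _ in samples)
    families_by_class: Dict[str, Set[str]] = {}
    for family, label in samples:
        families_by_class.setdefault(label, set()).add(family)
    return {
        "n_samples": len(samples),
        "n_families": len(per_family),
        "families_per_class": {c: len(f) for c, f in families_by_class.items()},
        "samples_per_family": dict(per_family),
        # Assumed definition: distinct families divided by total samples.
        "family_to_sample_ratio": len(per_family) / len(samples),
    }


# Hypothetical toy dataset: two ransomware families and one benign family.
toy = [
    ("fam_a", "ransomware"), ("fam_a", "ransomware"), ("fam_b", "ransomware"),
    ("app_c", "benign"), ("app_c", "benign"), ("app_c", "benign"),
]
info = dataset_breakdown(toy)
print(info)
```

Publishing a table like this alongside the dataset would let readers check both the 165-family count and the ratio claim against prior datasets.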
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review of our manuscript. We have carefully addressed each major comment below and describe the revisions we will incorporate to strengthen the paper.
Point-by-point responses
Referee: [Experimental evaluation] Experimental evaluation (abstract and results section): The claims of 96% accuracy and 0.99 AUC provide no details on train-test split strategy, cross-validation method, or family-disjoint partitioning. Without family-disjoint or temporal splits, the metrics may reflect intra-family leakage from static strings or mutex names rather than generalizable hybrid behavioral discrimination, directly undermining the generalization and early-detection claims.
Authors: We agree that the experimental evaluation section lacked sufficient detail on the data partitioning strategy, which is essential for supporting the generalization claims. In the revised manuscript, we will add an explicit subsection describing our approach: an 80/20 family-disjoint train-test split (ensuring no samples from the same ransomware or benign family appear in both sets) combined with 5-fold cross-validation using family-disjoint folds. This partitioning prevents leakage from family-specific static artifacts such as strings or mutex names and directly bolsters the early-detection claims by demonstrating performance on unseen families. We will also report the exact family counts in each split and any additional validation steps.
revision: yes
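The partitioning described in this response can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the authors' procedure: the 80/20 fraction is applied to families rather than samples, families are assigned to folds round-robin after a seeded shuffle, and class balance is not enforced. scikit-learn's `GroupShuffleSplit` and `GroupKFold` offer library equivalents of the same idea.

```python
import random
from typing import List, Sequence, Tuple

Split = Tuple[List[int], List[int]]


def family_disjoint_split(families: Sequence[str], test_fraction: float = 0.2, seed: int = 0) -> Split:
    """Return (train_idx, test_idx) so that no family appears on both sides.

    The fraction applies to families, so sample counts are only roughly 80/20.
    """
    unique = sorted(set(families))
    random.Random(seed).shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    test_families = set(unique[:n_test])
    train_idx = [i for i, f in enumerate(families) if f not in test_families]
    test_idx = [i for i, f in enumerate(families) if f in test_families]
    return train_idx, test_idx


def family_disjoint_folds(families: Sequence[str], k: int = 5, seed: int = 0) -> List[Split]:
    """Assign whole families to k folds; return one (train_idx, val_idx) pair per fold."""
    unique = sorted(set(families))
    if len(unique) < k:
        raise ValueError("need at least k distinct families")
    random.Random(seed).shuffle(unique)
    fold_of = {family: j % k for j, family in enumerate(unique)}
    folds = []
    for fold in range(k):
        val_idx = [i for i, f in enumerate(families) if fold_of[f] == fold]
        train_idx = [i for i, f in enumerate(families) if fold_of[f] != fold]
        folds.append((train_idx, val_idx))
    return folds


# Hypothetical dataset: 10 families with 4 samples each.
families = [f"fam_{i % 10}" for i in range(40)]
train_idx, test_idx = family_disjoint_split(families)
folds = family_disjoint_folds(families, k=5)
print(len(train_idx), len(test_idx), [len(val) for _, val in folds])
```

Because whole families move together, family-specific strings or mutex names seen in training cannot reappear in the held-out portion, which is the leakage path the referee raised.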
Referee: [Methods] Feature combination (methods section): No discussion of correlation handling, multicollinearity checks, or leakage between the Radare2 static feature set and Frida dynamic behaviors is provided. If static and dynamic features are naively concatenated without mitigation, the reported ensemble performance could be inflated by redundant signals.
Authors: We thank the referee for raising this methodological concern. The static Radare2 features and Frida dynamic behaviors are intended to be complementary, with static analysis operating on the binary without execution and dynamic analysis capturing runtime events. However, we acknowledge that the methods section did not discuss correlation handling or multicollinearity. In the revised version, we will include a new subsection on feature preprocessing that details Pearson correlation thresholding (removing pairs above 0.8) and variance inflation factor (VIF) analysis to mitigate multicollinearity. We will also explain that potential leakage is limited because the two feature categories originate from fundamentally different analysis stages and that any residual overlap is addressed via the ensemble models and SHAP interpretability.
revision: yes
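As an illustration of the Pearson thresholding step mentioned in this response, the sketch below keeps features in their given order and drops any later feature whose absolute correlation with an already kept feature exceeds 0.8. Dropping one member of each correlated pair (rather than both) is an assumption about what "removing pairs" means, the feature names are hypothetical, and VIF analysis is not covered here.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns: Dict[str, Sequence[float]], threshold: float = 0.8) -> List[str]:
    """Greedy filter: keep a feature only if |r| <= threshold against every kept feature."""
    kept: List[str] = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns: one static feature, a rescaled copy of it,
# and one dynamic feature that is only weakly correlated with the first.
columns = {
    "static_entropy": [1, 2, 3, 4, 5],
    "static_entropy_x2": [2, 4, 6, 8, 10],
    "dyn_mutex_count": [3, 1, 4, 1, 5],
}
kept = drop_correlated(columns)
print(kept)
```

The greedy order matters: whichever member of a correlated pair appears first is the one retained, so a revised manuscript would need to state how that order is chosen.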
Circularity Check
No circularity: empirical ML evaluation on released dataset is self-contained
full rationale
The paper presents a hybrid ransomware detection framework that extracts static features via Radare2 and dynamic behaviors via Frida, trains standard ensemble classifiers (XGBoost, Soft Voting) on a publicly released 165-family dataset, and reports empirical accuracy/AUC/detection-time metrics. No mathematical derivation chain, equations, or first-principles predictions exist that could reduce to inputs by construction. Performance numbers are obtained via conventional supervised learning on (presumably) held-out samples; any hyperparameter fitting is standard ML practice and does not create self-definitional or fitted-input-called-prediction circularity. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked. The central claims remain externally falsifiable against the released dataset and real-world samples.
Axiom & Free-Parameter Ledger
free parameters (1)
- XGBoost and ensemble hyperparameters
axioms (1)
- domain assumption: Static features from Radare2 combined with dynamic Frida traces are sufficient to distinguish ransomware from benign software across the collected families.
Caliskan B. et al.: Preprint submitted to Elsevier. RansomTrack for Ransomware Detection.
Busra Caliskan received the B.S. degree in biomedical engineering from Yeditepe University, Istanbul, Turkiye, in 2020, and the M.S. degree in computer engin...
... His current research interests include information security, malware analysis, and machine learning applications. He has been a navy officer in the Turkish Naval Forces since 2010. His current position in the Navy is at the National Defense University in Istanbul.
H. Hakan Kilinc received his B.S. degree in Mathematics and Computer Science from Ege University, Izmir, Turkey, in 1997. H...