Unraveling the Key of Machine Learning-based Android Malware Detection

Fabio Pierazzi; Jiahao Liu; Jun Zeng; Lorenzo Cavallaro; Zhenkai Liang; Ziqi Yang

arXiv: 2402.02953 · v2 · submitted 2024-02-05 · cs.CR · cs.LG

Jiahao Liu, Jun Zeng, Fabio Pierazzi, Ziqi Yang, Lorenzo Cavallaro, Zhenkai Liang

Pith reviewed 2026-05-24 03:29 UTC · model grok-4.3

keywords Android malware detection · machine learning · adversarial robustness · malware evolution · malware semantics · detection taxonomy · empirical evaluation

The pith

ML-based Android malware detectors remain vulnerable to evolving threats and adversarial attacks because they fail to capture semantic information that characterizes malicious behaviors from APK features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper organizes prior ML-based Android malware detection work into a unified taxonomy based on app representations and modeling pipelines, then builds a general framework to re-implement 12 representative approaches from software engineering, security, and machine learning communities. It evaluates these systems across detection effectiveness, robustness to malware evolution and adversarial attacks, and efficiency on large-scale tests. The central finding is that even high-performing detectors are limited by their inability to leverage malware semantics, leaving them exposed in realistic conditions. The work concludes with insights and recommendations for addressing this gap.

Core claim

Through its taxonomy and re-implementation of 12 approaches, the paper shows that existing ML-based Android malware detectors achieve encouraging results in standard settings yet remain vulnerable to malware evolution and adversarial attacks. It attributes these limitations to insufficient capture and use of malware semantics, defined as semantic information that characterizes malicious behaviors and is derived from APK features.
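To make the adversarial side of this claim concrete, here is a minimal sketch of one generic attack style: an add-only, feature-space evasion against a linear detector over binary features. It is not the attack model evaluated in the paper, which this page does not specify. The feature names, weights, bias, and the `evade_by_addition` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def score(weights: Dict[str, float], bias: float, feats: Set[str]) -> float:
    """Linear score over the binary features present; score > 0 means 'malware'."""
    return bias + sum(weights.get(f, 0.0) for f in feats)


def evade_by_addition(
    weights: Dict[str, float], bias: float, feats: Set[str], budget: int
) -> Tuple[Set[str], bool]:
    """Greedily add absent, benign-looking (negative-weight) features.

    Existing features are never removed, so the app's original behavior is
    untouched. At most `budget` features are added.
    """
    current = set(feats)
    candidates = sorted(
        (f for f in weights if f not in current and weights[f] < 0),
        key=lambda f: weights[f],  # most negative weight first
    )
    for f in candidates[:budget]:
        if score(weights, bias, current) <= 0:
            break  # already classified as goodware
        current.add(f)
    return current, score(weights, bias, current) <= 0


# Hypothetical weights for a toy linear detector (not taken from any real system).
weights = {
    "api:sendTextMessage": 1.5,
    "perm:SEND_SMS": 1.0,
    "perm:VIBRATE": -0.9,
    "api:Log.d": -0.8,
    "api:Toast.show": -0.6,
}
bias = -0.5
malware = {"api:sendTextMessage", "perm:SEND_SMS"}

modified, evaded = evade_by_addition(weights, bias, malware, budget=3)
_, evaded_small_budget = evade_by_addition(weights, bias, malware, budget=2)
```

In this toy setting the malicious features stay in place and only benign-looking ones are added, yet the score crosses the decision boundary once three features are added. That is the kind of weakness the claim describes for detectors that weigh surface features rather than behavior.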

What carries the argument

A general-purpose framework that unifies Android app representations and the ML modeling pipeline to enable consistent re-implementation and cross-dimensional evaluation of detection approaches.
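One way to picture such a framework is as a pair of interchangeable steps: a representation step that turns an app into features, and a modeling step that turns features into a verdict. The sketch below is an assumption about the general shape, not the paper's actual design. `DetectorPipeline`, `permission_vector`, `linear_model`, and the toy apps are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical types: an "app" is a dict of raw facts already extracted from an APK.
App = Dict[str, List[str]]
Vector = List[float]


@dataclass
class DetectorPipeline:
    """One detector = a representation step followed by a model step."""
    name: str
    represent: Callable[[App], Vector]  # app -> feature vector
    predict: Callable[[Vector], int]    # feature vector -> 1 (malware) or 0 (goodware)

    def classify(self, app: App) -> int:
        return self.predict(self.represent(app))


def permission_vector(vocab: Sequence[str]) -> Callable[[App], Vector]:
    """Toy representation: one binary slot per permission in `vocab`."""
    def represent(app: App) -> Vector:
        present = set(app.get("permissions", []))
        return [1.0 if p in present else 0.0 for p in vocab]
    return represent


def linear_model(weights: Sequence[float], bias: float) -> Callable[[Vector], int]:
    """Toy model: thresholded dot product."""
    def predict(x: Vector) -> int:
        total = sum(w * v for w, v in zip(weights, x)) + bias
        return 1 if total > 0 else 0
    return predict


vocab = ["SEND_SMS", "INTERNET", "READ_CONTACTS"]
toy = DetectorPipeline(
    name="toy-permissions",
    represent=permission_vector(vocab),
    predict=linear_model(weights=[2.0, 0.1, 1.0], bias=-1.5),
)

sms_app = {"permissions": ["SEND_SMS", "INTERNET"]}
plain_app = {"permissions": ["INTERNET"]}
labels = [toy.classify(sms_app), toy.classify(plain_app)]
```

Because every detector exposes the same `classify` entry point, the same evaluation harness can be run over all of them. That uniformity is what makes cross-detector comparison meaningful.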

If this is right

Improving the capture of malware semantics should directly increase robustness to evolution and attacks.
Current detectors trade off effectiveness for efficiency in ways that limit semantic depth.
A taxonomy organized by representations and pipelines allows systematic identification of gaps across research communities.
Recommendations for future work center on designing features and models that better encode malicious behavior semantics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The evaluation setup could be extended to test whether newer representation learning methods overcome the identified semantic shortfall.
If semantics are the key missing element, then hybrid systems combining static and dynamic behavioral traces may close the robustness gap faster than pure ML refinements.
The taxonomy provides a reusable structure for classifying and comparing any future Android malware detector without redoing the full re-implementation effort.

Load-bearing premise

The twelve re-implemented approaches accurately reproduce the original published methods and the chosen datasets, metrics, and attack models reflect real-world Android malware detection conditions.

What would settle it

A single detector that maintains high accuracy against both unseen malware families over time and adversarial perturbations while explicitly deriving and using semantic behavioral information from APK features.
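The "over time" half of that test can be sketched as a time-aware evaluation: train only on apps up to a cutoff, then score each later period separately. This is a simplified illustration under assumed data, not the paper's evaluation procedure. The sample layout, the keyword-style toy detector, and the feature names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical sample layout: (year, feature set, label), with label 1 = malware.
Sample = Tuple[int, FrozenSet[str], int]


def f1_score(pairs: Iterable[Tuple[int, int]]) -> float:
    """F1 for the malware class over (true label, predicted label) pairs."""
    tp = fp = fn = 0
    for y, p in pairs:
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 1 and p == 0:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def train_keyword_detector(train: List[Sample]) -> Callable[[FrozenSet[str]], int]:
    """Toy detector: flag any feature seen in training malware but never in goodware."""
    mal = set().union(*(f for _, f, y in train if y == 1))
    good = set().union(*(f for _, f, y in train if y == 0))
    indicators = mal - good
    return lambda feats: 1 if feats & indicators else 0


def f1_by_period(samples: List[Sample], cutoff_year: int) -> Dict[int, float]:
    """Train on samples up to `cutoff_year`, then report F1 per later year."""
    train = [s for s in samples if s[0] <= cutoff_year]
    detector = train_keyword_detector(train)
    buckets: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for year, feats, label in samples:
        if year > cutoff_year:
            buckets[year].append((label, detector(feats)))
    return {year: f1_score(pairs) for year, pairs in sorted(buckets.items())}


samples: List[Sample] = [
    (2019, frozenset({"sms_send", "net"}), 1),
    (2019, frozenset({"net"}), 0),
    (2020, frozenset({"sms_send"}), 1),
    (2020, frozenset({"net", "camera"}), 0),
    (2021, frozenset({"reflection_load", "net"}), 1),  # evolved sample: no known indicator
    (2021, frozenset({"camera"}), 0),
]
scores = f1_by_period(samples, cutoff_year=2019)
```

Here the detector is perfect on the year right after training and misses the later sample that shares no surface indicator with the training malware. A detector that would settle the question should keep the per-period score flat as the gap from the cutoff grows.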

Figures

Figures reproduced from arXiv: 2402.02953 by Fabio Pierazzi, Jiahao Liu, Jun Zeng, Lorenzo Cavallaro, Zhenkai Liang, Ziqi Yang.

**Figure 2.** The relationships between Feature Representations

**Figure 3.** The architecture of FrameDroid. This analysis aims to estimate and unravel the effects of various real-world scenarios, such as different data sizes, goodware-to-malware ratios, and the presence of adversarial attacks, on the performance of these methods. Such experiments offer valuable insights into the current state of ML-based Android malware detection. Specifically, within this section, we re-implement…

**Figure 4.** Effectiveness of the selected approaches using dif

**Figure 5.** The performance of the selected techniques against diverse malware evolution periods. Columns display the absolute

**Figure 6.** The efficiency of feature transformation of the se

**Figure 7.** The overall distribution of investigated approaches

**Figure 8.** The rolling algorithm for evaluating the Android

Original abstract

With the rapid advancement of machine learning (ML), ML-based Android malware detection has gained significant popularity due to its ability to automatically learn malicious patterns from Android apps. However, the lack of an in-depth and systematic analysis of existing research makes it difficult to obtain a holistic understanding of the state of the art in this field. In this work, we present the most comprehensive investigation to date of ML-based Android malware detection systems, combining both empirical and quantitative analyses. We first organize prior work into a unified taxonomy based on Android app representations and the ML modeling pipeline. Building on this taxonomy, we design a general-purpose framework for ML-based Android malware detection and re-implement 12 representative approaches from three research communities -- software engineering, security, and machine learning. Using this framework, we conduct a large-scale evaluation across three key dimensions: detection effectiveness, robustness to real-world challenges, and efficiency. Despite extensive research efforts and encouraging results, our findings reveal that existing learning-based Android malware detectors still face significant challenges, including vulnerability to malware evolution and susceptibility to adversarial attacks. We attribute these limitations to the detectors' limited ability to capture and leverage malware semantics, defined as semantic information that characterizes malicious behaviors derived from APK features. Finally, we summarize our key insights and provide actionable recommendations to guide future research in this domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper re-implements 12 Android malware detectors in one framework and tests their robustness to evolution and attacks, attributing shortfalls to weak semantic capture.

This paper organizes prior ML-based Android malware work into a taxonomy by app representation and modeling pipeline, then re-implements 12 systems from software engineering, security, and ML communities inside a shared framework. It evaluates them on detection performance, robustness to real-world evolution and adversarial examples, and efficiency, concluding that the detectors remain vulnerable because they fail to leverage semantic information about malicious behavior.

Referee Report

2 major / 2 minor

Summary. The paper organizes prior ML-based Android malware detection work into a taxonomy based on app representations and the ML modeling pipeline, designs a general-purpose framework, re-implements 12 representative detectors from SE, security, and ML communities, and evaluates them at scale on detection effectiveness, robustness to malware evolution and adversarial attacks, and efficiency. It concludes that existing detectors remain vulnerable to evolution and adversarial examples because they fail to capture and leverage malware semantics (defined as semantic information characterizing malicious behaviors from APK features), and offers insights plus recommendations for future work.

Significance. If the central empirical claims hold after verification of reproduction fidelity, the work would be a significant contribution as the largest-scale comparative study in this area, providing a reusable framework and concrete evidence of persistent limitations that could steer the community toward semantics-aware approaches. The explicit taxonomy and unified re-implementation effort are strengths that enable direct comparability across communities.

major comments (2)

[§4, Re-implementation section] The manuscript does not report quantitative fidelity metrics (e.g., side-by-side F1 or accuracy on the exact dataset splits used in the original publications) for any of the 12 re-implemented detectors. Because the central attribution of failure modes to lack of semantic capture rests entirely on these reproductions, the absence of such checks leaves open the possibility that observed vulnerabilities are artifacts of implementation differences rather than intrinsic properties of the original methods.
[§5.2–5.3, Evolution and adversarial evaluation] The paper attributes poor performance on evolved malware and adversarial examples to insufficient semantic capture, yet provides no ablation or feature-importance analysis showing that the detectors' learned representations indeed lack the semantic properties defined in the introduction. Without such evidence, the causal link between the observed failures and the semantics hypothesis remains correlational.
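The fidelity check requested in the first major comment amounts to a small table: for each detector, the F1 reported in the original publication next to the F1 of the re-implementation on the same split, with the gap flagged when it exceeds a tolerance. The sketch below shows that computation. The detector names, counts, reported scores, and the 0.02 tolerance are placeholders invented for illustration and are not results from the paper.

```python
from typing import Dict, List, NamedTuple, Tuple


class FidelityRow(NamedTuple):
    detector: str
    reported_f1: float
    reproduced_f1: float
    gap: float
    within_tolerance: bool


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fidelity_table(
    reported: Dict[str, float],
    reproduced_counts: Dict[str, Tuple[int, int, int]],
    tolerance: float = 0.02,  # assumed threshold, not a value from the paper
) -> List[FidelityRow]:
    """Compare reproduced F1 (from TP/FP/FN counts) against the reported F1."""
    rows = []
    for name in sorted(reported):
        tp, fp, fn = reproduced_counts[name]
        ours = f1_from_counts(tp, fp, fn)
        gap = ours - reported[name]
        rows.append(FidelityRow(name, reported[name], ours, gap, abs(gap) <= tolerance))
    return rows


# Placeholder inputs: "detector_a" and "detector_b" are not real systems.
reported = {"detector_a": 0.95, "detector_b": 0.90}
reproduced_counts = {"detector_a": (90, 5, 5), "detector_b": (70, 20, 10)}

rows = fidelity_table(reported, reproduced_counts)
for row in rows:
    flag = "ok" if row.within_tolerance else "CHECK"
    print(f"{row.detector}: reported={row.reported_f1:.3f} "
          f"reproduced={row.reproduced_f1:.3f} gap={row.gap:+.3f} {flag}")
```

A row flagged for checking does not by itself mean the re-implementation is wrong, since datasets and splits often differ from the original. It does mark where a robustness finding needs a caveat.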

minor comments (2)

[Abstract / §1] The abstract and introduction repeatedly use the phrase 'most comprehensive investigation to date' without a supporting citation or explicit comparison table against prior surveys; a brief related-work paragraph quantifying coverage would strengthen this claim.
[§3] Notation for the unified framework (e.g., how APK features are mapped to the taxonomy categories) is introduced in §3 but not summarized in a single table; adding such a table would improve readability for readers comparing the 12 approaches.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment point-by-point below.

Referee: [§4, Re-implementation section] The manuscript does not report quantitative fidelity metrics (e.g., side-by-side F1 or accuracy on the exact dataset splits used in the original publications) for any of the 12 re-implemented detectors. Because the central attribution of failure modes to lack of semantic capture rests entirely on these reproductions, the absence of such checks leaves open the possibility that observed vulnerabilities are artifacts of implementation differences rather than intrinsic properties of the original methods.

Authors: We agree this is a valid concern for strengthening the reproduction claims. Our re-implementations followed the original papers as closely as possible within the unified framework, and overall trends align with published results. We will add a table in the revised §4 reporting side-by-side F1/accuracy comparisons against original publications on their reported dataset splits where those splits and data are available and reproducible.
Revision: yes

Referee: [§5.2–5.3, Evolution and adversarial evaluation] The paper attributes poor performance on evolved malware and adversarial examples to insufficient semantic capture, yet provides no ablation or feature-importance analysis showing that the detectors' learned representations indeed lack the semantic properties defined in the introduction. Without such evidence, the causal link between the observed failures and the semantics hypothesis remains correlational.

Authors: The attribution rests on the taxonomy in §3, which classifies each detector by its feature representations and explicitly identifies which rely on syntactic rather than semantic properties (as defined in the introduction). The uniform vulnerability pattern across non-semantic detectors provides supporting evidence. We acknowledge the absence of explicit ablation or feature-importance studies. We will expand the discussion in §5.2–5.3 to more directly connect results to the taxonomy classifications; adding full ablations would require new experiments beyond the current scope.
Revision: partial
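For readers unfamiliar with what the requested ablation would involve, the sketch below masks one feature group at a time and records how much F1 drops. It is a generic illustration, not an experiment from the paper. The `perm:` and `api:` prefixes, the toy weights, and the four samples are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical sample layout: (feature set, label), with label 1 = malware.
Sample = Tuple[FrozenSet[str], int]


def f1(pairs: Iterable[Tuple[int, int]]) -> float:
    """F1 for the malware class over (true label, predicted label) pairs."""
    tp = fp = fn = 0
    for y, p in pairs:
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 1:
            fp += 1
        elif y == 1 and p == 0:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ablate(
    detector: Callable[[FrozenSet[str]], int],
    samples: List[Sample],
    groups: List[str],
) -> Dict[str, float]:
    """Return the F1 drop observed when each feature group is masked at test time."""
    base = f1((y, detector(x)) for x, y in samples)
    drops = {}
    for group in groups:
        prefix = group + ":"
        masked = [(frozenset(f for f in x if not f.startswith(prefix)), y)
                  for x, y in samples]
        drops[group] = base - f1((y, detector(x)) for x, y in masked)
    return drops


# Toy detector with made-up weights; flags an app when the summed weight reaches 1.0.
WEIGHTS = {"perm:SEND_SMS": 1.0, "api:sendTextMessage": 1.0, "api:getDeviceId": 0.5}


def toy_detector(feats: FrozenSet[str]) -> int:
    return 1 if sum(WEIGHTS.get(f, 0.0) for f in feats) >= 1.0 else 0


samples: List[Sample] = [
    (frozenset({"perm:SEND_SMS"}), 1),
    (frozenset({"perm:SEND_SMS", "api:sendTextMessage"}), 1),
    (frozenset({"perm:INTERNET"}), 0),
    (frozenset({"api:getDeviceId"}), 0),
]
drops = ablate(toy_detector, samples, ["perm", "api"])
```

An ablation of this kind shows which feature groups a detector depends on. It does not by itself establish whether those features carry behavioral semantics, so it would complement the taxonomy-based argument rather than replace it.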

Circularity Check

0 steps flagged

No circularity; empirical re-implementations and evaluations are independent of the paper's own inputs.

The paper organizes prior work into a taxonomy, re-implements 12 detectors in a general framework, and evaluates them empirically on detection effectiveness, robustness to evolution/adversarial attacks, and efficiency. Claims about limitations and attribution to 'malware semantics' (defined as semantic information characterizing malicious behaviors from APK features) follow from these new comparisons rather than reducing by construction to fitted parameters, self-definitions, or self-citation chains. No equations, predictions, or uniqueness theorems are present that equate outputs to inputs. The work is self-contained against external benchmarks via the re-implementations and large-scale evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work relies on standard domain assumptions from ML security research without introducing new free parameters or invented entities.

axioms (1)

Domain assumption: standard ML evaluation assumptions hold, including that benchmark datasets are representative of real-world Android malware distributions. The large-scale evaluation implicitly depends on this background assumption, which is common to the field.


[1] [1]

[n. d.]. Androguard. https://github.com/androguard/

work page

[2] [2]

[n. d.]. Angr. https://angr.io/

work page

[3] [3]

[n. d.]. Apktool. https://ibotpeaches.github.io/Apktool/

work page

[4] [4]

[n. d.]. BackSmali. https://github.com/JesusFreke/smali

work page

[5] [5]

[n. d.]. Harly: another Trojan subscriber on Google Play. https://www.kaspersky. com/blog/harly-trojan-subscriber/45573

work page

[6] [6]

[n. d.]. How Many Apps In Google Play Store? https://www.bankmycell.com/ blog/number-of-google-play-store-apps

work page

[7] [7]

[n. d.]. IDA Pro. https://hex-rays.com/ida-pro/

work page

[8] [8]

[n. d.]. Kharon project. https://cidre.gitlabpages.inria.fr/malware/malware- website/dataset/malware_DroidKungFu1.html

work page

[9] [9]

[n. d.]. LibRadar. https://github.com/pkumza/LibRadar

work page

[10] [10]

[n. d.]. PyTorch. https://pytorch.org/

work page

[11] [11]

[n. d.]. Share of Android OS of global smartphone shipments. https://www.statista.com/statistics/236027/global-smartphone-os-market- share-of-android

work page

[12] [12]

[n. d.]. The mobile malware threat landscape in 2022. https://securelist.com/ mobile-threat-report-2022/108844

work page 2022

[13] [13]

[n. d.]. VirusTotal. https://www.virustotal.com

work page

[14] [14]

Yousra Aafer, Wenliang Du, and Heng Yin. 2013. Droidapiminer: Mining api- level features for robust malware detection in android. In International ICST Conference, SecureComm

work page 2013

[15] [15]

Kevin Allix, Tegawendé F Bissyandé, Jacques Klein, and Yves Le Traon. 2016. Androzoo: Collecting millions of android apps for the research community. In MSR

work page 2016

[16] [16]

Muhammad Amin, Babar Shah, Aizaz Sharif, Tamleek Ali, Ki-Il Kim, and Sajid Anwar. 2022. Android malware detection through generative adversarial net- works. Emerging Telecommunications Technologies (2022)

work page 2022

[17] [17]

Simone Aonzo, Gabriel Claudiu Georgiu, Luca Verderame, and Alessio Merlo

work page

[18] [18]

SoftwareX (2020)

Obfuscapk: An open-source black-box obfuscation tool for Android apps. SoftwareX (2020)

work page 2020

[19] [19]

Daniel Arp, Michael Spreitzenbarth, Malte Hubner, Hugo Gascon, Konrad Rieck, and CERT Siemens. 2014. Drebin: Effective and explainable detection of android malware in your pocket.. In NDSS

work page 2014

[20] [20]

Anish Athalye, Nicholas Carlini, and David Wagner. 2018. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In ICML

work page 2018

[21] [21]

Kathy Wain Yee Au, Yi Fan Zhou, Zhen Huang, and David Lie. 2012. Pscout: analyzing the android permission specification. In CCS

work page 2012

[22] [22]

Michael Backes, Sven Bugiel, Erik Derr, Patrick McDaniel, Damien Octeau, and Sebastian Weisgerber. 2016. On demystifying the android application framework:{Re-Visiting} android permission specification analysis. InSecurity

work page 2016

[23] [23]

Federico Barbero, Feargus Pendlebury, Fabio Pierazzi, and Lorenzo Cavallaro

work page

[24] [24]

Transcending transcend: Revisiting malware classification in the presence of concept drift. In SP

work page

[25] [25]

Arjun Nitin Bhagoji, Daniel Cullina, Chawin Sitawarin, and Prateek Mittal. 2018. Enhancing robustness of machine learning systems via data transformations. In CISS

work page 2018

[26] [26]

Haipeng Cai. 2020. Assessing and improving malware detection sustainability through app evolution studies. TOSEM (2020)

work page 2020

[27] [27]

Nicholas Carlini and David Wagner. 2017. Towards evaluating the robustness of neural networks. In SP

work page 2017

[28] [28]

Fabrício Ceschin, Marcus Botacin, Albert Bifet, Bernhard Pfahringer, Luiz S Oliveira, Heitor Murilo Gomes, and André Grégio. 2020. Machine learning (in) security: A stream of problems. Digital Threats: Research and Practice (2020)

work page 2020

[29] [29]

Ngoc-Tu Chau and Souhwan Jung. 2018. Dynamic analysis with Android container: Challenges and opportunities. Digital Investigation (2018)

work page 2018

[30] [30]

Simin Chen, Soroush Bateni, Sampath Grandhi, Xiaodi Li, Cong Liu, and Wei Yang. 2020. DENAS: automated rule generation by knowledge extraction from neural networks. In ESEC/FSE

work page 2020

[31] [31]

Xiao Chen, Chaoran Li, Derui Wang, Sheng Wen, Jun Zhang, Surya Nepal, Yang Xiang, and Kui Ren. 2019. Android HIV: A study of repackaging malware for evading machine-learning detection. TIFS (2019)

[32]

Yizheng Chen, Zhoujie Ding, and David Wagner. 2023. Continuous Learning for Android Malware Detection. arXiv preprint arXiv:2302.04332 (2023)

[33]

Francisco Handrick da Costa, Ismael Medeiros, Thales Menezes, João Victor da Silva, Ingrid Lorraine da Silva, Rodrigo Bonifácio, Krishna Narasimhan, and Márcio Ribeiro. 2022. Exploring the use of static and dynamic analysis to improve the performance of the mining sandbox approach for android malware identification. Journal of Systems and Software (2022)

[34]

Nadia Daoudi, Jordan Samhi, Abdoul Kader Kabore, Kevin Allix, Tegawendé F Bissyandé, and Jacques Klein. 2021. Dexray: a simple, yet effective deep learning approach to android malware detection based on image representation of bytecode. In DMLSD

[35]

Yuxin Ding, Xiao Zhang, Jieke Hu, and Wenting Xu. 2020. Android malware detection method based on bytecode image. AIHC (2020)

[36]

William Enck, Machigar Ongtang, and Patrick McDaniel. 2009. On lightweight mobile phone application certification. In CCS

[37]

Yujie Fan, Mingxuan Ju, Shifu Hou, Yanfang Ye, Wenqiang Wan, Kui Wang, Yinming Mei, and Qi Xiong. 2021. Heterogeneous temporal graph transformer: An intelligent system for evolving android malware detection. In KDD

[38]

Parvez Faruki, Ammar Bharmal, Vijay Laxmi, Vijay Ganmoor, Manoj Singh Gaur, Mauro Conti, and Muttukrishnan Rajarajan. 2014. Android security: a survey of issues, malware penetration, and defenses. IEEE communications surveys & tutorials (2014)

[39]

Adrienne Porter Felt, Erika Chin, Steve Hanna, Dawn Song, and David Wagner. 2011. Android permissions demystified. In CCS

[41]

Ruitao Feng, Sen Chen, Xiaofei Xie, Lei Ma, Guozhu Meng, Yang Liu, and Shang-Wei Lin. 2019. Mobidroid: A performance-sensitive malware detection system on mobile platform. In ICECCS

[42]

Ruitao Feng, Sen Chen, Xiaofei Xie, Guozhu Meng, Shang-Wei Lin, and Yang Liu. 2020. A performance-sensitive malware detection system using deep learning on mobile devices. TIFS (2020)

[44]

Han Gao, Shaoyin Cheng, and Weiming Zhang. 2021. GDroid: Android malware detection and classification with graph convolutional network. Computers & Security (2021)

[45]

Joshua Garcia, Mahmoud Hammad, and Sam Malek. 2018. Lightweight, obfuscation-resilient detection and family identification of android malware. TOSEM (2018)

[46]

Ross Girshick. 2015. Fast r-cnn. In ICCV

[47]

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In NIPS

[48]

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. In ICLR

[49]

Kathrin Grosse, Nicolas Papernot, Praveen Manoharan, Michael Backes, and Patrick McDaniel. 2017. Adversarial examples for malware detection. In ESORICS

[50]

Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. 2017. Mask r-cnn. In ICCV

[51]

Ke He and Dong-Seong Kim. 2019. Malware detection with malware images using deep learning techniques. In TrustCom

[52]

Ping He, Yifan Xia, Xuhong Zhang, and Shouling Ji. 2023. Efficient Query-Based Attack against ML-Based Android Malware Detection under Zero Knowledge Setting. In CCS

[53]

Yiling He, Yiping Liu, Lei Wu, Ziqi Yang, Kui Ren, and Zhan Qin. 2022. MsDroid: Identifying Malicious Snippets for Android Malware Detection. In TDSC

[54]

Geoffrey Hinton. 2009. Deep belief networks. Scholarpedia (2009)

[55]

Shifu Hou, Yanfang Ye, Yangqiu Song, and Melih Abdulhayoglu. 2017. Hindroid: An intelligent android malware detection system based on structured heterogeneous information network. In KDD

[56]

TonTon Hsien-De Huang and Hung-Yu Kao. 2018. R2-d2: Color-inspired convolutional neural network (CNN)-based android malware detections. In BigData

[57]

Na Huang, Ming Xu, Ning Zheng, Tong Qiao, and Kim-Kwang Raymond Choo. 2019. Deep android malware classification with API-based feature graph. In TrustCom/BigDataSE

[59]

Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. 2018. Black-box adversarial attacks with limited queries and information. In ICML

[60]

Roberto Jordaney, Kumar Sharad, Santanu K Dash, Zhi Wang, Davide Papini, Ilia Nouretdinov, and Lorenzo Cavallaro. 2017. Transcend: Detecting concept drift in malware classification models. In Security

[61]

ElMouatez Billah Karbab and Mourad Debbabi. 2021. Petadroid: adaptive android malware detection using deep learning. In DIMVA

[62]

ElMouatez Billah Karbab, Mourad Debbabi, Abdelouahid Derhab, and Djedjiga Mouheb. 2018. MalDozer: Automatic framework for android malware detection using deep learning. Digital Investigation (2018)

[63]

TaeGuen Kim, BooJoong Kang, Mina Rho, Sakir Sezer, and Eul Gyu Im. 2018. A multimodal deep learning method for android malware detection using various features. In TIFS

[64]

Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)

[65]

Tao Lei, Zhan Qin, Zhibo Wang, Qi Li, and Dengpan Ye. 2019. EveDroid: Event-aware Android malware detection against model degrading for IoT devices. IoTJ (2019)

[66]

Heng Li, Zhang Cheng, Bang Wu, Liheng Yuan, Cuiying Gao, Wei Yuan, and Xiapu Luo. 2023. Black-box Adversarial Example Attack towards FCG Based Android Malware Detection under Incomplete Feature Information. In Security

[67]

Heng Li, ShiYao Zhou, Wei Yuan, Jiahuan Li, and Henry Leung. 2019. Adversarial-example attacks toward android malware detection system. IEEE Systems Journal (2019)

[68]

Heng Li, Shiyao Zhou, Wei Yuan, Xiapu Luo, Cuiying Gao, and Shuiyan Chen. 2021. Robust android malware detection against adversarial example attacks. In WWW

[70]

Li Li, Tegawendé F Bissyandé, Mike Papadakis, Siegfried Rasthofer, Alexandre Bartel, Damien Octeau, Jacques Klein, and Le Traon. 2017. Static analysis of android apps: A systematic literature review. Information and Software Technology (2017)

[71]

Xuezixiang Li, Yu Qu, and Heng Yin. 2021. Palmtree: Learning an assembly language model for instruction embedding. In CCS

[72]

Yuping Li, Jiyong Jang, Xin Hu, and Xinming Ou. 2017. Android malware clustering through malicious payload mining. In Research in Attacks, Intrusions, and Defenses: 20th International Symposium, RAID 2017, Atlanta, GA, USA, September 18–20, 2017, Proceedings

[73]

Kaijun Liu, Shengwei Xu, Guoai Xu, Miao Zhang, Dawei Sun, and Haifeng Liu. 2020. A review of android malware detection approaches based on machine learning. IEEE Access (2020)

[75]

Yue Liu, Chakkrit Tantithamthavorn, Li Li, and Yepang Liu. 2022. Deep learning for android malware defenses: a systematic literature review. JACM (2022)

[76]

Enrico Mariconti, Lucky Onwuzurike, Panagiotis Andriotis, Emiliano De Cristofaro, Gordon Ross, and Gianluca Stringhini. 2017. Mamadroid: Detecting android malware by building markov chains of behavioral models. In NDSS

[77]

Alejandro Martín, Félix Fuentes-Hurtado, Valery Naranjo, and David Camacho. 2017. Evolving deep neural networks architectures for android malware classification. In CEC

[78]

Niall McLaughlin, Jesus Martinez del Rincon, BooJoong Kang, Suleiman Yerima, Paul Miller, Sakir Sezer, Yeganeh Safaei, Erik Trickel, Ziming Zhao, Adam Doupé, et al. 2017. Deep android malware detection. In CODASPY

[79]

Larry R Medsker and LC Jain. 2001. Recurrent neural networks. Design and Applications (2001)

[80]

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In NIPS
