
arxiv: 1906.09456 · v1 · pith:U3AWKTTOnew · submitted 2019-06-22 · 💻 cs.CR

Andro-Simnet: Android Malware Family Classification Using Social Network Analysis

Pith reviewed 2026-05-25 18:04 UTC · model grok-4.3

classification 💻 cs.CR
keywords android malware · malware family classification · social network analysis · community detection · behavioral similarity · feature weighting · malware visualization

The pith

A network built from behavioral similarities classifies Android malware families at 97 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a system that constructs networks of malware samples connected by a similarity measure on shared behavioral features. It optimizes the weights of those features and applies community detection to tighten groups that correspond to known families. This is presented as a way to classify malware and its variants even when they avoid signature-based detection. A reader would care because accurate family grouping could support prediction of new samples based on observed attack patterns rather than exact code matches. The work also includes a visualization step to show how samples cluster by tactical characteristics.

Core claim

The central claim is that malware family classification can be performed with a similarity network over samples. The system measures similarity between samples using carefully chosen behavioral features that commonly appear within the same family, derives optimal feature weights through an explicit process, builds a network from those similarities, and runs a community detection algorithm to increase modularity within each family. On a real malware dataset this yields 97 percent classification accuracy and, under K-fold cross-validation, 95 percent prediction accuracy.
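
For readers who want the quantity behind "increase modularity": the standard modularity of a partition, which Louvain-style community detection (the algorithm named in the Figure 3 caption) is designed to increase, is usually written as below. The notation is the conventional one and is not taken from the paper: A_ij is the edge weight between samples i and j, k_i is the total edge weight at sample i, m is the total edge weight in the network, and c_i is the community assigned to sample i.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),
\qquad k_i = \sum_{j} A_{ij},
\qquad m = \frac{1}{2} \sum_{i,j} A_{ij}
```

Q grows when edges fall inside communities more often than a random graph with the same degrees would predict, which is the sense in which tighter family clusters score higher.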

What carries the argument

The weighted similarity measure on behavioral characteristics that is used to construct malware networks for subsequent community detection.
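
A minimal sketch of how such a measure could feed network construction, following the flow labeled in Figure 1 (per-feature similarity, multiply weights, connect two samples when the final similarity exceeds a threshold). The Jaccard index, the feature names `api_calls` and `hosts`, the sample names, the weights, and the threshold are all illustrative assumptions; this page does not list the paper's four features or its per-feature similarity functions.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Hypothetical schema: feature name -> set of observed values for one sample.
Features = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index; an assumed stand-in for the paper's per-feature similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_similarity(x: Features, y: Features, weights: Dict[str, float]) -> float:
    """Weighted average of per-feature similarities, normalized to [0, 1]."""
    total = sum(weights.values())
    score = sum(w * jaccard(x[name], y[name]) for name, w in weights.items())
    return score / total


def build_network(
    samples: Dict[str, Features], weights: Dict[str, float], threshold: float
) -> List[Tuple[str, str]]:
    """Connect two samples when their final similarity exceeds the threshold."""
    edges = []
    for a, b in combinations(sorted(samples), 2):
        if weighted_similarity(samples[a], samples[b], weights) > threshold:
            edges.append((a, b))
    return edges


# Toy samples (invented for illustration, not from the paper's dataset).
SAMPLES: Dict[str, Features] = {
    "m1": {"api_calls": {"sendSMS", "getDeviceId"}, "hosts": {"a.example"}},
    "m2": {"api_calls": {"sendSMS", "getDeviceId", "readContacts"}, "hosts": {"a.example"}},
    "m3": {"api_calls": {"loadLibrary"}, "hosts": {"b.example"}},
}
WEIGHTS = {"api_calls": 0.5, "hosts": 0.5}  # assumed, not the paper's optimized weights
EDGES = build_network(SAMPLES, WEIGHTS, threshold=0.5)
print(EDGES)  # [('m1', 'm2')]
```

The resulting edge list is what a community detection step would consume; changing the weights changes which pairs clear the threshold, which is why the weight-derivation step carries so much of the argument.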

If this is right

  • Malware samples group according to attack behavior patterns and tactical characteristics.
  • A process exists to identify which features matter most for measuring similarity within families.
  • Graph visualizations of the networks reveal the distribution and likeness among samples.
  • The same pipeline supports both classification of known families and prediction of new samples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If behavioral features shift with new malware tactics, the weight-optimization step would need to be repeated on updated data.
  • The network approach could be tested for scalability by increasing the number of samples or families beyond the original dataset size.
  • Analysts might use the generated graphs to spot emerging clusters that do not match existing families.

Load-bearing premise

The chosen behavioral features and the process for deriving their optimal weights produce a generalizable grouping of malware families when combined with community detection.

What would settle it

Testing the trained system on a fresh collection of Android malware samples gathered after the original dataset and finding classification accuracy below 90 percent.

Figures

Figures reproduced from arXiv: 1906.09456 by Huy Kang Kim, Hye Min Kim, Hyun Min Song, Jae Woo Seo.

Figure 1: Overall process of Andro-Simnet. Diagram labels recovered from the image: Scheduler; Analyzer (static info., dynamic info.); analyzed features; per-feature similarity (Feature 1, Feature 2, ...); multiply weights; final similarity > threshold (yes: connect M1 and M2; no: do not connect); Network Graph Generator.
Figure 2: The process of Analyzer and Network Graph Generato
Figure 2: Feature selection. There are various features extracted from the analysis system (e.g., network packet, strings that used in malware, a method call). When we select four features, we consider that both malware behaviors and the signatures of the attacker are useful to represent the similar-relation of malware. The chosen features and the details of the reason why we select that feature are described below: …
Figure 3: Louvain community detection algorithm
Figure 4: Louvain community detection algorithm. 4.3 Process of Experiment: We insert them into the task queue of the Analyzer through the Scheduler. The Analyzer performs static and dynamic analysis for each sample for 10 minutes. All analysis results are stored in the database so that they can be retrieved when necessary. While the Analyzer is running, the similarities of each feature between malware samples are cal…
Figure 5: The change of accuracy according to executions by t
Figure 6: The final accuracy of the experiment according to th
Figure 7: The change of accuracy according to Kth iteration
Figure 8: K-fold cross-validation accuracy according to th
Figure 9: The final network graph. Text spilled from a confusion-matrix image follows this caption (axes: Actual, Predicted; families Bankun, FakeBank, FakeInst, Gepew, Gidix, Misosms, smsSpy, Telman). Diagonal entries: Bankun 91/91, FakeBank 179/192, FakeInst 43/46, Gepew 114/114, Gidix 49/49, Misosms 48/48, smsSpy 108/118, Telman 19/24. Off-diagonal entries: FakeBank row, 2 under smsSpy; smsSpy row, 3 under FakeBank; Telman row, 1 under FakeBank. Unlabeled row as extracted: 11 0 0 7 0 4 3.
Figure 10: Confusion matrix of K-fold cross-validation
Figure 11: Confusion matrix of K-fold cross-validation
Original abstract

While the rapid adaptation of mobile devices changes our daily life more conveniently, the threat derived from malware is also increased. There are lots of research to detect malware to protect mobile devices, but most of them adopt only signature-based malware detection method that can be easily bypassed by polymorphic and metamorphic malware. To detect malware and its variants, it is essential to adopt behavior-based detection for efficient malware classification. This paper presents a system that classifies malware by using common behavioral characteristics along with malware families. We measure the similarity between malware families with carefully chosen features commonly appeared in the same family. With the proposed similarity measure, we can classify malware by malware's attack behavior pattern and tactical characteristics. Also, we apply a community detection algorithm to increase the modularity within each malware family network aggregation. To maintain high classification accuracy, we propose a process to derive the optimal weights of the selected features in the proposed similarity measure. During this process, we find out which features are significant for representing the similarity between malware samples. Finally, we provide an intuitive graph visualization of malware samples which is helpful to understand the distribution and likeness of the malware networks. In the experiment, the proposed system achieved 97% accuracy for malware classification and 95% accuracy for prediction by K-fold cross-validation using the real malware dataset.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents Andro-Simnet, a system for Android malware family classification that selects common behavioral features, defines a similarity measure incorporating optimized feature weights, applies community detection to improve modularity in malware networks, and reports 97% classification accuracy together with 95% prediction accuracy via K-fold cross-validation on a real malware dataset; it also includes graph-based visualization of sample distributions.

Significance. If the reported accuracies are shown to be free of leakage, the combination of behavioral similarity, weight optimization for feature significance, and community detection could provide a useful, interpretable framework for grouping malware variants by tactical patterns, extending signature-based methods. The visualization component adds practical value for analysts.

major comments (3)
  1. [Abstract and §4] Abstract and §4 (experiment description): the 95% K-fold prediction accuracy rests on a similarity function whose feature weights are derived by an optimization process that simultaneously identifies 'significant' features; no explicit statement confirms that this tuning occurs strictly inside each training fold rather than on the full labeled set, so the cross-validation result may not demonstrate generalization.
  2. [§3] §3 (similarity measure and weight derivation): the procedure for obtaining optimal weights is described only at a high level without pseudocode, objective function, or separation from the K-fold splits; this leaves the central accuracy claim dependent on an unspecified step that could introduce circularity between weight selection and evaluation.
  3. [§4 and Table 1] §4 and Table 1 (dataset and results): dataset size, family distribution, and exact feature list are not quantified, preventing assessment of whether the 97% classification accuracy reflects balanced classes or overfitting to dominant families.
minor comments (2)
  1. [§3] Notation for the similarity formula is introduced without an equation number, making it hard to trace how weights enter the measure.
  2. [§3] The community-detection step is mentioned but lacks a specific algorithm name or modularity formula reference.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below and will revise the manuscript to improve clarity and methodological transparency.

Point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (experiment description): the 95% K-fold prediction accuracy rests on a similarity function whose feature weights are derived by an optimization process that simultaneously identifies 'significant' features; no explicit statement confirms that this tuning occurs strictly inside each training fold rather than on the full labeled set, so the cross-validation result may not demonstrate generalization.

    Authors: The referee is correct that the manuscript does not explicitly state the placement of the weight optimization relative to the K-fold splits. In the original experiments the optimization was performed inside each training fold, but this was not documented. We will revise §4 to describe a nested cross-validation procedure that confines weight tuning to the training portion of each fold. revision: yes

  2. Referee: [§3] §3 (similarity measure and weight derivation): the procedure for obtaining optimal weights is described only at a high level without pseudocode, objective function, or separation from the K-fold splits; this leaves the central accuracy claim dependent on an unspecified step that could introduce circularity between weight selection and evaluation.

    Authors: We agree that §3 provides only a high-level description. We will expand the section to include the precise objective function minimized during weight optimization, pseudocode for the full procedure, and an explicit statement that the optimization step is isolated from the held-out test folds. revision: yes

  3. Referee: [§4 and Table 1] §4 and Table 1 (dataset and results): dataset size, family distribution, and exact feature list are not quantified, preventing assessment of whether the 97% classification accuracy reflects balanced classes or overfitting to dominant families.

    Authors: We will augment §4 and Table 1 with the total number of samples, the per-family counts, and the complete enumerated list of behavioral features. These additions will enable readers to evaluate class balance and potential bias toward dominant families. revision: yes
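
The per-family tallies that survive in the Figure 9 text on this page give a rough way to look at class balance. The sketch below assumes each `x/y` entry means "x correctly assigned out of y samples in that family"; that reading is an inference from the extracted text, not something the page states, and the helper names are hypothetical.

```python
from typing import Dict, Tuple

# (correct, total) per family, copied from the x/y entries in the Figure 9 text.
# Interpreting them as correct/total is an assumption.
COUNTS: Dict[str, Tuple[int, int]] = {
    "Bankun": (91, 91),
    "FakeBank": (179, 192),
    "FakeInst": (43, 46),
    "Gepew": (114, 114),
    "Gidix": (49, 49),
    "Misosms": (48, 48),
    "smsSpy": (108, 118),
    "Telman": (19, 24),
}


def overall_accuracy(counts: Dict[str, Tuple[int, int]]) -> float:
    """Micro-average: every sample counts equally, so large families dominate."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return correct / total


def macro_accuracy(counts: Dict[str, Tuple[int, int]]) -> float:
    """Macro-average: every family counts equally, regardless of size."""
    return sum(c / t for c, t in counts.values()) / len(counts)


def weakest_family(counts: Dict[str, Tuple[int, int]]) -> str:
    """Family with the lowest per-family accuracy."""
    return min(counts, key=lambda name: counts[name][0] / counts[name][1])


OVERALL = overall_accuracy(COUNTS)
MACRO = macro_accuracy(COUNTS)
WEAKEST = weakest_family(COUNTS)
print(f"overall={OVERALL:.3f} macro={MACRO:.3f} weakest={WEAKEST}")
```

Under that reading, the families range from 24 to 192 samples, and comparing the micro and macro averages is one quick check on whether the headline figure is carried by the largest families.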

Circularity Check

1 steps flagged

Feature weight optimization process risks reducing K-fold prediction accuracy to a fitted quantity

specific steps
  1. fitted input called prediction [Abstract]
    "To maintain high classification accuracy, we propose a process to derive the optimal weights of the selected features in the proposed similarity measure. During this process, we find out which features are significant for representing the similarity between malware samples. ... the proposed system achieved 97% accuracy for malware classification and 95% accuracy for prediction by K-fold cross-validation using the real malware dataset."

    The optimization process is presented as the mechanism that produces high accuracy, yet the same accuracy numbers are then reported as the outcome of K-fold cross-validation. Without an explicit statement that weight derivation is confined to training folds only, the 'prediction' accuracy reduces to a quantity fitted on the evaluation data by construction.

full rationale

The paper's central accuracy claims (97% classification, 95% prediction via K-fold) rest on a similarity measure whose feature weights are derived via an optimization process explicitly aimed at maintaining high accuracy and identifying significant features. The abstract describes this process without stating that weight tuning occurs strictly inside each training fold or on a held-out partition separate from the K-fold evaluation. This matches the fitted_input_called_prediction pattern: the weights are tuned on data that overlaps with (or is the same as) the data used to report the prediction accuracy, making the reported performance statistically forced rather than a true out-of-sample result. No other circularity patterns (self-citation chains, self-definitional equations, or imported uniqueness theorems) are evident from the provided text.
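
To make the concern concrete, here is a minimal sketch of what "weight tuning strictly inside each training fold" could look like. Everything in it is an assumption for illustration: the two-feature samples, the Jaccard similarity, the weight grid, and the nearest-neighbor label assignment, which stands in for the paper's community-detection step. It is not the authors' procedure.

```python
from typing import List, Sequence, Set, Tuple

Sample = Tuple[Set[str], Set[str], str]  # (feature A, feature B, family label)


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(x: Sample, y: Sample, w: float) -> float:
    """Weight w on feature A, 1 - w on feature B."""
    return w * jaccard(x[0], y[0]) + (1.0 - w) * jaccard(x[1], y[1])


def predict(query: Sample, pool: List[Sample], w: float) -> str:
    """Label of the most similar pool sample (stand-in for community assignment)."""
    return max(pool, key=lambda s: similarity(query, s, w))[2]


def tune_weight(train: List[Sample], grid: Sequence[float]) -> float:
    """Pick the weight with the best leave-one-out accuracy on the training data only."""
    best_w, best_acc = grid[0], -1.0
    for w in grid:
        hits = sum(
            predict(s, train[:i] + train[i + 1:], w) == s[2]
            for i, s in enumerate(train)
        )
        acc = hits / len(train)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w


def nested_cv(samples: List[Sample], k: int, grid: Sequence[float]):
    """K-fold evaluation in which the held-out fold never influences the weights."""
    hits, chosen = 0, []
    for fold in range(k):
        test = [s for i, s in enumerate(samples) if i % k == fold]
        train = [s for i, s in enumerate(samples) if i % k != fold]
        w = tune_weight(train, grid)  # tuning sees the training portion only
        chosen.append(w)
        hits += sum(predict(s, train, w) == s[2] for s in test)
    return hits / len(samples), chosen


# Toy data: feature A identifies the family; feature B is shared across families.
TOY: List[Sample] = []
for i in range(4):
    TOY.append(({"p"}, {f"n{i}"}, "X"))
    TOY.append(({"q"}, {f"n{i}"}, "Y"))

ACCURACY, WEIGHTS_PER_FOLD = nested_cv(TOY, k=4, grid=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0))
print(ACCURACY, WEIGHTS_PER_FOLD)
```

The point of the design is the single line where `tune_weight` receives `train` rather than `samples`: if the weights were instead chosen once on the full labeled set, the per-fold accuracy would no longer be an out-of-sample estimate.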

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that behavioral features distinguish families and on fitted weights for the similarity measure; no new entities are postulated.

free parameters (1)
  • feature weights = optimized on data
    Optimal weights are derived via a process to maintain high classification accuracy, making them fitted parameters.
axioms (1)
  • domain assumption: Common behavioral characteristics can reliably indicate membership in the same malware family
    Invoked to justify the similarity measure and feature selection

pith-pipeline@v0.9.0 · 5765 in / 1108 out tokens · 30560 ms · 2026-05-25T18:04:20.377945+00:00 · methodology

