arxiv: 2604.19438 · v1 · submitted 2026-04-21 · 💻 cs.CR · cs.SE

Malicious ML Model Detection by Learning Dynamic Behaviors

Pith reviewed 2026-05-10 02:41 UTC · model grok-4.3

classification 💻 cs.CR cs.SE
keywords malicious pre-trained models · dynamic analysis · one-class SVM · model hub security · ML supply chain · runtime behavior detection

The pith

DynaHug detects malicious pre-trained ML models by training a one-class SVM solely on the runtime behaviors of benign models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Pre-trained models distributed through hubs can contain hidden malicious code that executes upon loading in a user's environment. Current detectors rely on static rules or analysis that either overlook sophisticated attacks or incorrectly flag safe models. DynaHug instead collects execution traces from many benign models for a given task and uses those traces to train an OCSVM that learns the normal behavior distribution. New models are tested under the same dynamic conditions and flagged if their behavior deviates. Evaluation across more than 25,000 models shows this yields up to 44 percent higher F1-score than existing static, dynamic, and LLM-based detectors.

Core claim

DynaHug trains an OCSVM exclusively on runtime behavior traces gathered by executing task-specific benign PTMs. It then runs a candidate model under identical conditions and classifies it as malicious when its observed behavior falls outside the learned benign distribution. Ablation experiments confirm that dynamic analysis, the choice of OCSVM, and clustering each improve performance, while large-scale tests on Hugging Face and MalHug data demonstrate gains of up to 44 percent in F1-score over prior detectors.

What carries the argument

One-class SVM trained on runtime behavior traces collected through dynamic analysis of benign PTMs for a specific task
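
A minimal sketch of this mechanism, assuming scikit-learn's `OneClassSVM` and synthetic stand-in feature vectors. The four-column feature layout, the `nu` and `gamma` values, and the scaling step are illustrative assumptions, not the paper's settings, which are not stated here.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Synthetic stand-in for per-model feature vectors (e.g. system-call counts).
# Real vectors would come from sandboxed model loading; these numbers are invented.
rng = np.random.default_rng(0)
benign_train = rng.normal(
    loc=[50.0, 20.0, 5.0, 0.0], scale=[5.0, 3.0, 1.0, 0.1], size=(200, 4)
)

# Fit on benign behavior only: the detector never sees a malicious example.
scaler = StandardScaler().fit(benign_train)
detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")  # assumed values
detector.fit(scaler.transform(benign_train))


def classify(features):
    """Return 'benign' or 'malicious' for one feature vector."""
    label = detector.predict(scaler.transform([features]))[0]  # +1 inlier, -1 outlier
    return "benign" if label == 1 else "malicious"


typical_model = [50.0, 20.0, 5.0, 0.0]      # centre of the benign cloud
suspicious_model = [50.0, 20.0, 5.0, 40.0]  # one behavior far outside the benign range

print(classify(typical_model))
print(classify(suspicious_model))
```

Because the boundary is fitted to benign data alone, anything the benign set does not cover is reported as malicious, which is why the breadth of that set matters so much.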

If this is right

  • Malicious code that activates only at runtime can be caught even when static code patterns evade rule-based scanners.
  • Task-specific training sets allow the detector to adapt normal behavior profiles to different model uses such as image or text tasks.
  • Clustering within the benign set produces tighter normal profiles and reduces false alarms on legitimate variation.
  • Dynamic analysis complements rather than replaces static methods, lowering both missed attacks and incorrect benign flags.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Model hubs could embed lightweight runtime checks during upload or first-load verification to block supply-chain attacks before users download the model.
  • Collecting behavior traces across multiple tasks might reduce the need for strictly task-specific training sets while preserving detection accuracy.
  • The same dynamic-feature approach could be applied to detect poisoned or backdoored models that alter outputs without obvious code injection.

Load-bearing premise

Runtime behaviors observed from a set of benign PTMs form a tight enough distribution that an OCSVM can reliably separate malicious models without producing excessive false positives on previously unseen benign models.

What would settle it

A large collection of benign PTMs drawn from new tasks or sources that the trained OCSVM consistently labels as outliers, producing high false-positive rates.
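
The check described above amounts to measuring the false-positive rate on benign models the detector was not trained on. A small sketch, assuming detector outputs are already available as labels; the group names and counts below are hypothetical and are not results from the paper.

```python
def false_positive_rate(predictions):
    """Fraction of known-benign models that the detector flagged as malicious."""
    if not predictions:
        raise ValueError("need at least one benign model")
    flagged = sum(1 for label in predictions if label == "malicious")
    return flagged / len(predictions)


# Hypothetical detector outputs for benign models, grouped by origin.
held_out_benign = {
    "same task as training": ["benign"] * 97 + ["malicious"] * 3,
    "new task": ["benign"] * 80 + ["malicious"] * 20,
}

for group, predictions in held_out_benign.items():
    print(f"{group}: FPR = {false_positive_rate(predictions):.2f}")
```

A large gap between the two groups would indicate that the learned benign profile does not transfer beyond the training distribution.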

Figures

Figures reproduced from arXiv: 2604.19438 by Dhruv Pradhan, Ezekiel Soremekun, Sarang Nambiar.

Figure 1: Workflow of DynaHug
Adjacent text from the paper (truncated): … Face analysis, only the specialised scanners detect the malicious PTM datasets collected in this work and reported in previous works (MalHug). In particular, the three specialised PTM scanners such as PickleScan and JFrog are more effective at detecting malicious PTMs than traditional scanners (see …
Figure 2: Inter-cluster and intra-cluster analysis of the system calls with the …
Figure 4: Strace log snippet from deserialization
Adjacent text from the paper (truncated): … [85] to isolate any processes running within it from the host machine, preventing any damage from potentially malicious models, e.g., Remote Code Execution (RCE) on the host machine [26]. Moreover, since the output logs from strace are non-deterministic and vary with the environment on which the PyTorch model is deserialized and the hardware resources available at …
Figure 5: Top-10 most important features that explain DynaHug’s predictions on malicious models
Adjacent text from the paper (truncated): Feature Selection: Results show that the default feature setting of DynaHug (presence and frequency of system calls) contributes positively to its effectiveness and outperforms alternative feature settings. More importantly, …
Figure 6: DynaHug Algorithm
  Input: TAG, modelToAnalyze
  Output: Malicious or Benign
  // Phase 1: Crawling
  M_tag ← fetchModels("PyTorch", TAG)
  // Phase 2: Dynamic Analysis
  traces ← ∅
  for M_i ∈ M_tag do                ▷ Model deserialization
      executeModelInSandbox(M_i)
      traces ← traces ∪ getTraces(M_i)
  end for
  // Phase 3: Model Training
  X ← ∅                             ▷ Training dataset
  for trace_i ∈ traces do
      d_i ← dataProcessin…
Figure 7: Top-10 important features that explain DynaHug’s predictions on benign models
Figure 8: System prompt and LLM response
Prompt: You are analyzing strace logs to detect malicious activity in PyTorch files. Look for RED FLAGS like these:
  - Process creation (execve,fork,clone,...)
  - Network connections (socket, connect, bind, ...)
  - File writes outside normal directories
  - Access to sensitive files (/etc/passwd, SSH keys,...)
  - Essentially unusual system calls during deserialization
IMPORTANT: Yo…
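
The captions for Figures 4 and 5 mention strace logs and a default feature setting based on the presence and frequency of system calls. The sketch below shows one way such features could be derived from a log. The regular expression, the five-entry vocabulary, and the sample log are all hypothetical illustrations, not the paper's implementation.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a line such as
#   openat(AT_FDCWD, "x", O_RDONLY) = 3
# optionally preceded by a "[pid 123]" or bare PID prefix.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscall_counts(strace_text):
    """Count how often each system call appears in a strace log."""
    counts = Counter()
    for line in strace_text.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match:  # signal and exit lines ("+++ exited with 0 +++") do not match
            counts[match.group(1)] += 1
    return counts


def feature_vector(counts, vocabulary):
    """Concatenate presence flags (0/1) and raw frequencies over a fixed vocabulary."""
    presence = [1 if counts[name] > 0 else 0 for name in vocabulary]
    frequency = [counts[name] for name in vocabulary]
    return presence + frequency


# Invented log for illustration only.
SAMPLE_LOG = """
openat(AT_FDCWD, "model.bin", O_RDONLY|O_CLOEXEC) = 3
read(3, "PK"..., 4096) = 4096
read(3, "..."..., 4096) = 1024
close(3) = 0
[pid 4242] execve("/bin/sh", ["sh", "-c", "id"], 0x7ffd0 /* 12 vars */) = 0
+++ exited with 0 +++
"""

VOCABULARY = ["openat", "read", "close", "execve", "connect"]  # assumed, tiny
counts = syscall_counts(SAMPLE_LOG)
print(feature_vector(counts, VOCABULARY))
```

Fixing the vocabulary up front keeps every model's vector the same length, which a one-class classifier needs.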
Original abstract

Pre-trained machine learning models (PTMs) are commonly provided via Model Hubs (e.g., Hugging Face) in standard formats like Pickles to facilitate accessibility and reuse. However, this ML supply chain setting is susceptible to malicious attacks that are capable of executing arbitrary code on trusted user environments, e.g., during model loading. To detect malicious PTMs, state-of-the-art detectors (e.g., PickleScan) rely on rules, heuristics, or static analysis, but ignore runtime model behaviors. Consequently, they either miss malicious models due to under-approximation (blacklisting) or miscategorize benign models due to over-approximation (static analysis or whitelisting). To address this challenge, we propose a novel technique (DynaHug) which detects malicious PTMs by learning the behavior of benign PTMs using dynamic analysis and machine learning (ML). DynaHug trains an ML classifier (one-class SVM (OCSVM)) on the runtime behaviours of task-specific benign models. We evaluate DynaHug using over 25,000 benign and malicious PTMs from different sources including Hugging Face and MalHug. We also compare DynaHug to several state-of-the-art detectors including static, dynamic and LLM-based detectors. Results show that DynaHug is up to 44% more effective than existing baselines in terms of F1-score. Our ablation study demonstrates that our design decisions (dynamic analysis, OCSVM, clustering) contribute positively to DynaHug's effectiveness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes DynaHug, a technique that detects malicious pre-trained ML models (PTMs) by performing dynamic analysis on task-specific benign PTMs, extracting runtime behaviors, and training a one-class SVM (OCSVM) to identify outliers as malicious. It evaluates the approach on a corpus of over 25,000 benign and malicious PTMs sourced from Hugging Face and MalHug, reports up to 44% higher F1-score than static, dynamic, and LLM-based baselines, and includes an ablation study attributing gains to dynamic analysis, OCSVM, and clustering.

Significance. If the performance claims hold under proper controls, the work would be significant for ML supply-chain security by shifting from static heuristics to behavioral modeling of benign PTMs, potentially catching attacks that evade rule-based or static detectors like PickleScan. The scale of the evaluation corpus (>25k models) and the ablation study are explicit strengths that support reproducibility and design validation if the missing methodological details are supplied.

major comments (2)
  1. [Evaluation] Evaluation section: the manuscript reports quantitative F1 gains on >25k models and an ablation study but provides no details on the train/test split of benign PTMs for OCSVM training, the exact feature vectors extracted from dynamic traces, or any controls for distributional shift between training benign models and held-out benign PTMs. This is load-bearing for the central 44% F1 improvement claim, as the OCSVM decision boundary is fitted exclusively to the collected benign traces.
  2. [Method] Method and results: the OCSVM is described as trained only on runtime behaviors of task-specific benign PTMs, yet no experiment or analysis quantifies false-positive rates on unseen benign models (different fine-tunes, architectures, or slight task variations). Without this, the reported effectiveness cannot be separated from potential overfitting to the specific benign distribution used.
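
To make the two major comments concrete, here is a sketch of the kind of protocol they ask the authors to document: benign models are split into disjoint training and held-out sets, malicious models appear only at test time, and precision, recall, and F1 are computed on the combined test set. The 80/20 ratio, the model identifiers, and the predictions are hypothetical, not the paper's.

```python
import random


def split_benign(benign_ids, train_fraction=0.8, seed=0):
    """Shuffle benign model ids and split them into disjoint train / held-out lists."""
    ids = sorted(benign_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def precision_recall_f1(y_true, y_pred, positive="malicious"):
    """Compute precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


benign_ids = [f"benign-{i:03d}" for i in range(40)]  # hypothetical ids
train_ids, held_out_ids = split_benign(benign_ids)   # 32 for fitting, 8 held out

# Test set: held-out benign models plus malicious models never used for fitting.
y_true = ["benign"] * len(held_out_ids) + ["malicious"] * 4
# Hypothetical detector output: one benign model flagged, one malicious model missed.
y_pred = ["benign"] * 7 + ["malicious"] + ["malicious"] * 3 + ["benign"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting the split and the false positives on the held-out benign set separately would let readers tell detection quality apart from overfitting to the training distribution.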
minor comments (2)
  1. [Abstract] The abstract states that clustering contributes positively in the ablation study, but the main method description does not clarify how clustering is integrated with the OCSVM pipeline.
  2. [Method] Notation for the dynamic analysis features and OCSVM hyperparameters (kernel, nu, gamma) is not standardized across sections, making it difficult to replicate the exact experimental setup.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and for recognizing the potential significance of DynaHug for ML supply-chain security, as well as the strengths of our large-scale evaluation corpus and ablation study. We address each major comment below and will revise the manuscript accordingly to improve clarity and rigor.

point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the manuscript reports quantitative F1 gains on >25k models and an ablation study but provides no details on the train/test split of benign PTMs for OCSVM training, the exact feature vectors extracted from dynamic traces, or any controls for distributional shift between training benign models and held-out benign PTMs. This is load-bearing for the central 44% F1 improvement claim, as the OCSVM decision boundary is fitted exclusively to the collected benign traces.

    Authors: We agree that these details are essential for reproducibility and for substantiating the performance claims. The current manuscript provides a high-level overview of the evaluation but does not specify the exact train/test split ratios for benign PTMs, the precise feature vector construction from dynamic traces, or explicit controls for distributional shift. In the revised version, we will expand the Evaluation section to include: (1) the data partitioning procedure (e.g., the proportion of benign PTMs reserved exclusively for OCSVM training versus held-out testing), (2) the complete definition of the feature vectors extracted from runtime behaviors, and (3) analysis demonstrating controls for distributional shift, such as diversity across model sources, tasks, and architectures between training and test sets. These additions will directly support the reported F1-score gains.
    revision: yes

  2. Referee: [Method] Method and results: the OCSVM is described as trained only on runtime behaviors of task-specific benign PTMs, yet no experiment or analysis quantifies false-positive rates on unseen benign models (different fine-tunes, architectures, or slight task variations). Without this, the reported effectiveness cannot be separated from potential overfitting to the specific benign distribution used.

    Authors: We concur that quantifying false-positive rates on unseen benign models with variations is important to demonstrate generalization and mitigate concerns about overfitting. While the existing ablation study and overall evaluation on over 25,000 models provide supporting evidence, the manuscript does not include a dedicated experiment isolating FPR on held-out benign PTMs that differ in fine-tunes, architectures, or task variations. In the revised manuscript, we will add a new analysis or subsection that evaluates the OCSVM on such additional unseen benign models and reports the corresponding false-positive rates to confirm robustness beyond the training distribution.
    revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical ML detection with no derivation chain

full rationale

The paper presents DynaHug as an empirical technique that collects runtime behaviors from task-specific benign PTMs, trains a standard one-class SVM on those traces, and flags outliers as malicious. No mathematical derivation, first-principles result, or prediction is claimed; the central result is an experimental F1-score comparison on >25k models. The OCSVM decision boundary is fitted directly to the collected benign data, but the paper does not rename or re-present any fitted quantity as an independent prediction. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing steps. The method is therefore self-contained against external benchmarks and exhibits no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that benign PTM runtime behaviors are learnable and separable from malicious ones; no free parameters or invented entities are explicitly introduced in the abstract.

free parameters (1)
  • OCSVM hyperparameters (kernel, nu, gamma)
    Standard OCSVM parameters that must be chosen or tuned on the benign training traces; their specific values are not stated.
axioms (1)
  • domain assumption: Runtime behaviors of task-specific benign PTMs form a coherent distribution that an OCSVM can model without high false-positive rates on other benign models.
    Invoked when the method trains exclusively on benign traces and treats deviations as malicious.
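
For orientation, this is where the listed hyperparameters enter the standard ν-formulation of the one-class SVM (Schölkopf et al.). Whether DynaHug uses exactly this formulation, and whether its kernel is the RBF kernel shown, is an assumption here; the page does not state it.

```latex
\begin{aligned}
\min_{w,\;\xi,\;\rho}\quad & \tfrac{1}{2}\lVert w\rVert^{2}
    + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho \\
\text{s.t.}\quad & \langle w, \Phi(x_i)\rangle \ge \rho - \xi_i,
    \qquad \xi_i \ge 0, \quad i = 1,\dots,n \\[4pt]
f(x) &= \operatorname{sign}\bigl(\langle w, \Phi(x)\rangle - \rho\bigr),
    \qquad
k(x, x') = \langle \Phi(x), \Phi(x')\rangle
    = \exp\bigl(-\gamma\lVert x - x'\rVert^{2}\bigr)
\end{aligned}
```

Here the x_i are the n benign feature vectors and f(x) = -1 marks a model as an outlier. The parameter ν upper-bounds the fraction of training points left outside the boundary, so it directly limits the false-positive rate on the training set, while γ controls how tightly the boundary wraps the benign data.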

pith-pipeline@v0.9.0 · 5581 in / 1306 out tokens · 42962 ms · 2026-05-10T02:41:40.862827+00:00 · methodology

Reference graph

Works this paper leans on

87 extracted references · 87 canonical work pages

  1. [1]

    https://www.virustotal.com/gui/file/ e7191fdaf9bc9b19266e8bcf348788a4493cdbe03baadf67a483b7093783aa 4b , [Accessed 16-04-2026]

    admko/evil_test - virustotal. https://www.virustotal.com/gui/file/ e7191fdaf9bc9b19266e8bcf348788a4493cdbe03baadf67a483b7093783aa 4b , [Accessed 16-04-2026]

  2. [2]

    drhyrum/bert-tiny-torch-picklebomb-virustotal.https://www.virustotal.com/gui/file/ c8354611d7d2f2437aa2ce1af05760ae2a4acbfc21e2b3a4cbe77024130bf8df, [Accessed 16-04-2026]

  3. [3]

    Farishijazi/totally-harmless-model-virustotal.https://www.virustotal.com/gui/file/ 65a011d547d25bd600580861f8d47447b6fedc4fb11d4c01ed348ef58e510b3b, [Accessed 16-04-2026]

  4. [4]

    https://www.virustotal.com/gui/file/ 5949e26161da6e53cf2771aa7de4212b70d5aa065baab49e600bb295f11fad11, [Accessed 16-04-2026]

    Narsil/totallysafe - virustotal. https://www.virustotal.com/gui/file/ 5949e26161da6e53cf2771aa7de4212b70d5aa065baab49e600bb295f11fad11, [Accessed 16-04-2026]

  5. [5]

    https://kili-technology.com/blog/9-open-sourced-datasets-for-training-large-language-models, [Accessed 16-04-2026]

    Open-sourced training datasets for large language models (llms). https://kili-technology.com/blog/9-open-sourced-datasets-for-training-large-language-models, [Accessed 16-04-2026]

  6. [6]

    https://www.virustotal.com/gui/home/ upload, [Accessed 16-04-2026]

    Virustotal - virustotal.com. https://www.virustotal.com/gui/home/ upload, [Accessed 16-04-2026]

  7. [7]

    https://www.virustotal.com/gui/file/ 2337248d2e27e905e483ce8d1740035a98fd915d01c575fb5235d60016b8a56a, [Accessed 16-04-2026]

    zpbrent/reuse - virustotal. https://www.virustotal.com/gui/file/ 2337248d2e27e905e483ce8d1740035a98fd915d01c575fb5235d60016b8a56a, [Accessed 16-04-2026]

  8. [8]

    https: //huggingface.co/gabejabe/bsidesSF-gordon-ramsey (2024), [Accessed 11-11-2025]

    gabejabe/bsidesSF-gordon-ramsey·Hugging Face — huggingface.co. https: //huggingface.co/gabejabe/bsidesSF-gordon-ramsey (2024), [Accessed 11-11-2025]

  9. [9]

    colab.google.https://colab.google/(2025), [Accessed 14-11-2025]

  10. [10]

    https://github.com/ s2e-lab/hf-model-analyzer.git(2025), [Accessed 11-11-2025]

    GitHub - s2e-lab/hf-model-analyzer — github.com. https://github.com/ s2e-lab/hf-model-analyzer.git(2025), [Accessed 11-11-2025]

  11. [11]

    https://www.deeplearning.ai/the-batch/meta-releases-llamafirewall-an-open-source-defense-against-ai-hijacking/ (2025), [Accessed 12-11-2025]

    Meta releases LlamaFirewall, an open-source defense against AI hijacking — deeplearning.ai. https://www.deeplearning.ai/the-batch/meta-releases-llamafirewall-an-open-source-defense-against-ai-hijacking/ (2025), [Accessed 12-11-2025]

  12. [12]

    What is few shot prompting? https://www.ibm.com/think/topics/ few-shot-prompting(2025), [Accessed 11-11-2025]

  13. [13]

    https://pypi.org/project/slutterprime/ (2026), [Accessed 28-01-2026]

    Slutterprime — pypi.org. https://pypi.org/project/slutterprime/ (2026), [Accessed 28-01-2026]

  14. [14]

    https://www.revshells.com/ (2021), [Accessed 09-04-2026]

    0dayCTF: Online - reverse shell generator. https://www.revshells.com/ (2021), [Accessed 09-04-2026]

  15. [15]

    https://openai.com/index/ introducing-gpt-5-2/(2025), [Accessed 29-01-2026]

    AI, O.: Introducing GPT-5.2 — openai.com. https://openai.com/index/ introducing-gpt-5-2/(2025), [Accessed 29-01-2026]

  16. [16]

    https: //github.com/protectai/modelscan (2025), gitHub repository, accessed: 2025-10-11

    AI, P.: Modelscan: Protection against model serialization attacks. https: //github.com/protectai/modelscan (2025), gitHub repository, accessed: 2025-10-11

  17. [17]

    of Bits, T.: Fickling: A python pickling decompiler and static analyzer.https: //github.com/trailofbits/fickling (2025), gitHub repository, accessed: 2025-10-11

  18. [18]

    In: 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI)

    Brown, P., Brown, A., Gupta, M., Abdelsalam, M.: Online malware classification with system-wide system calls in cloud iaas. In: 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI). pp. 146–151. IEEE (2022)

  19. [19]

    Canonical: Strace man page, https://manpages.ubuntu.com/manpages/ jammy/man1/strace

  20. [20]

    Casey, B., Santos, J.C.S., Mirakhorli, M.: A large-scale exploit instrumentation study of ai/ml supply chain attacks in hugging face models (2024),https:// arxiv.org/abs/2410.04490

  21. [21]

    chrisw: Execute – pypi.org.https://pypi.org/project/execute/ (2010), [Accessed 10-11-2025]

  22. [22]

    https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging- face-ml-models-with-silent-backdoor/ (2024), [Accessed 27-10-2025]

    Cohen, D.: Data scientists targeted by malicious hugging face ml models with silent backdoor. https://jfrog.com/blog/data-scientists-targeted-by-malicious-hugging- face-ml-models-with-silent-backdoor/ (2024), [Accessed 27-10-2025]

  23. [23]

    Contributors, P.: weights_only_unpickler.py – pytorch.https://github.com/ pytorch/pytorch/blob/main/torch/_weights_only_unpickler.py (2025), github repository, Accessed: 2025-11-10

  24. [24]

    https://www.cve.org/CVERecord/SearchResults?query= pickle(2025), [Accessed 09-11-2025]

    CVE: cve.org. https://www.cve.org/CVERecord/SearchResults?query= pickle(2025), [Accessed 09-11-2025]

  25. [25]

    com/python/cpython, gitHub repository, accessed 2025-10-12

    Developers, P.:CPython: The python programming language.https://github. com/python/cpython, gitHub repository, accessed 2025-10-12

  26. [26]

    com/resources/what-container/(2025), [Accessed 11-11-2025]

    Docker: What is a Container? | Docker — docker.com.https://www.docker. com/resources/what-container/(2025), [Accessed 11-11-2025]

  27. [27]

    https://www.docker.com/(2025), [Accessed 06-11-2025]

    Docker, I.: Docker: Accelerated Container Application Development — docker.com. https://www.docker.com/(2025), [Accessed 06-11-2025]

  28. [28]

    https://huggingface.co/ docs/huggingface_hub/index(2025), [Accessed 07-11-2025]

    Face, H.: Hub client library — huggingface.co. https://huggingface.co/ docs/huggingface_hub/index(2025), [Accessed 07-11-2025]

  29. [29]

    https:// huggingface.co/(2025), [Accessed 07-11-2025]

    Face, H.: Hugging face – the ai community building the future. https:// huggingface.co/(2025), [Accessed 07-11-2025]

  30. [30]

    https://github.com/ huggingface/transformers/blob/main/src/transformers/modeling_ utils.py(2025), [Accessed 13-10-2025]

    Face, H.: huggingface/transformers github. https://github.com/ huggingface/transformers/blob/main/src/transformers/modeling_ utils.py(2025), [Accessed 13-10-2025]

  31. [31]

    https://huggingface.co/ docs/hub/en/security-pickle#what-we-have-now (2025), hugging Face documentation, accessed: 2025-10-11

    Face, H.: Pickle scanning (hub documentation). https://huggingface.co/ docs/hub/en/security-pickle#what-we-have-now (2025), hugging Face documentation, accessed: 2025-10-11

  32. [32]

    https://huggingface.co/docs/hub/ en/security-jfrog(2025), hugging Face documentation, accessed: 2025-10-11

    Face, H.: Third-party scanner: Jfrog. https://huggingface.co/docs/hub/ en/security-jfrog(2025), hugging Face documentation, accessed: 2025-10-11

  33. [33]

    https://huggingface.co/docs/ hub/en/security-protectai (2025), hugging Face documentation, accessed: 2025-10-11

    Face, H.: Third-party scanner: Protect ai. https://huggingface.co/docs/ hub/en/security-protectai (2025), hugging Face documentation, accessed: 2025-10-11

  34. [34]

    SMU Data Science Review7(2), 4 (2023)

    George, D., Mauldin, A., Mitchell, J., Mohammed, S., Slater, R.: Static malware family clustering via structural and functional characteristics. SMU Data Science Review7(2), 4 (2023)

  35. [35]

    GitHub: Github.https://github.com/(2025), [Accessed 07-11-2025]

  36. [36]

    Google: Cloud storage.https://cloud.google.com/storage?hl=en (2025), [Accessed 11-11-2025]

  37. [37]

    https: //docs.cloud.google.com/python/docs/reference/storage/latest (2025), [Accessed 11-11-2025]

    Google: Python client libraries | Google Cloud Documentation. https: //docs.cloud.google.com/python/docs/reference/storage/latest (2025), [Accessed 11-11-2025]

  38. [38]

    In: Proceedings of the 36th international conference on software engineering

    Gorla, A., Tavecchia, I., Gross, F., Zeller, A.: Checking app behavior against app descriptions. In: Proceedings of the 36th international conference on software engineering. pp. 1025–1035 (2014)

  39. [39]

    totally-harmless-model

    Hijazi, F.: “totally-harmless-model” – model card. https://huggingface.co/FarisHijazi/totally-harmless-model (2025), accessed: 2025-11-10

  40. [40]

    https://huggingface.co/blog/huggingface/ state-of-os-hf-spring-2026(2026), [Accessed 09-04-2026]

    HuggingFace: State of open source on hugging face: Spring 2026 — huggingface.co. https://huggingface.co/blog/huggingface/ state-of-os-hf-spring-2026(2026), [Accessed 09-04-2026]

  41. [41]

    https://huggingface.co/jossefharush/gpt2-rs/tree/main (2023), [Accessed 28-01-2026]

    jossefharush: jossefharush/gpt2-rs at main — huggingface.co. https://huggingface.co/jossefharush/gpt2-rs/tree/main (2023), [Accessed 28-01-2026]

  42. [42]

    Kaggle: Kaggle: Your Machine Learning and Data Science Community. https://www.kaggle.com/ (2025), [Accessed 07-11-2025]

  43. [43]

    arXiv e-prints pp

    Kassianik, P., Saglam, B., Chen, A., Nelson, B., Vellore, A., Aufiero, M., Burch, F., Kedia, D., Zohary, A., Weerawardhena, S., et al.: Llama-3.1-foundationai- securityllm-base-8b technical report. arXiv e-prints pp. arXiv–2504 (2025)

  44. [44]

    Kellas, A.D., Christou, N., Jiang, W., Li, P., Simon, L., David, Y., Kemerlis, V.P., Davis, J.C., Yang, J.: Pickleball: Secure deserialization of pickle-based machine learning models (extended report) (2025), https://arxiv.org/abs/2508.15987

  45. [45]

    https: //man7.org/linux/man-pages/man2/syscalls.2.html#:~:text=/usr/ include/asm/unistd.h(2025), [Accessed 30-10-2025]

    Kerrisk, M.: syscalls(2) - Linux manual page — man7.org. https: //man7.org/linux/man-pages/man2/syscalls.2.html#:~:text=/usr/ include/asm/unistd.h(2025), [Accessed 30-10-2025]

  46. [46]

    https://www.reversinglabs.com/blog/rl-identifies-malware-ml-model-hosted-on-hugging-face (2025), [Accessed 09-11-2025]

    Labs, R.: Malicious ml models discovered on hugging face platform | reversinglabs. https://www.reversinglabs.com/blog/rl-identifies-malware-ml-model-hosted-on-hugging-face (2025), [Accessed 09-11-2025]

  47. [47]

    https://scikit-learn.org/ stable/user_guide.html(2025), [Accessed 07-11-2025]

    scikit learn: User Guide — scikit-learn.org. https://scikit-learn.org/ stable/user_guide.html(2025), [Accessed 07-11-2025]

  48. [48]

    Liu, Y., Cao, J., Liu, C., Ding, K., Jin, L.: Datasets for large language models: A comprehensive survey (2024),https://arxiv.org/abs/2402.18041

  49. [49]

    Curran Associates, Inc

    Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. Curran Associates, Inc. (2017), http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf

  50. [50]

    https://ai.meta.com/blog/meta-llama-3-1/ (2024), [Accessed 09-11-2025]

    Meta: Llama 3.1 release. https://ai.meta.com/blog/meta-llama-3-1/ (2024), [Accessed 09-11-2025]

  51. [51]

    https://huggingface.co/ meta-llama/Llama-3.1-8B-Instruct(2025), [Accessed 09-11-2025]

    Meta: meta-llama/llama-3.1-8b-instruct. https://huggingface.co/ meta-llama/Llama-3.1-8B-Instruct(2025), [Accessed 09-11-2025]

  52. [52]

    https://github.com/mmaitre314/picklescan (2025), gitHub repository, accessed: 2025-10-11

    mmaitre314: picklescan: Security scanner detecting python pickle files performing suspicious actions. https://github.com/mmaitre314/picklescan (2025), gitHub repository, accessed: 2025-10-11

  53. [53]

    https://modelscope.cn/home (2025), [Accessed 07-11-2025]

    ModelScope: Modelscope. https://modelscope.cn/home (2025), [Accessed 07-11-2025]

  54. [54]

    Montalbano, E.: (https://www.darkreading.com/application-security/hugging-face- ai-platform-100-malicious-code-execution-models) (Feb 2024)

  55. [55]

    https://thehackernews.com/2024/02/ new-hugging-face-vulnerability-exposes.html (2024), [Accessed 09-11-2025]

    News, T.H.: New Hugging Face Vulnerability Exposes AI Models to Supply Chain Attacks. https://thehackernews.com/2024/02/ new-hugging-face-vulnerability-exposes.html (2024), [Accessed 09-11-2025]

  56. [56]

    https://thehackernews.com/2024/03/over-100-malicious-aiml-models-found-on.html (2024), [Accessed 09-11-2025]

    News, T.H.: Over 100 Malicious AI/ML Models Found on Hugging Face Platform. https://thehackernews.com/2024/03/over-100-malicious-aiml-models-found-on.html (2024), [Accessed 09-11-2025]

  57. [57]

    News, T.H.: Malicious ML Models on Hugging Face Leverage Broken Pickle Format to Evade Detection. https://thehackernews.com/2025/02/malicious-ml-models-found-on-hugging.html (2025), [Accessed 09-11-2025]

  58. [58]

    Nmap Project: Ncat — nmap’s netcat replacement.https://nmap.org/ncat/ (2025), accessed: 2025-11-10

  59. [59]

    Numpy: Numpy documentation; numpy v2.3 manual — numpy.org.https:// numpy.org/doc/stable/(2025), [Accessed 07-11-2025]

  60. [60]

    OpenCSG: Opencsg.https://opencsg.com/(2025), [Accessed 07-11-2025]

  61. [61]

    https://www.architecture-performance.fr/ap_blog/loading-data-into-a-pandas-dataframe-a-performance-study/ (2019), [Accessed 14-10-2025]

    PACULL, F.: Loading data into a Pandas DataFrame – a performance study. https://www.architecture-performance.fr/ap_blog/loading-data-into-a-pandas-dataframe-a-performance-study/ (2019), [Accessed 14-10-2025]

  62. [62]

    https://pandas.pydata.org/docs/(2025), [Accessed 07-11-2025]

    Pandas: pandas documentation; pandas 2.3.3 documentation — pandas.pydata.org. https://pandas.pydata.org/docs/(2025), [Accessed 07-11-2025]

  63. [63]

    https://blog.tensorflow.org/2022/09/colabs-pay-as- you-go-offers-more-access-to-powerful-nvidia-compute-for-machine-learning.html (2022), [Accessed 13-11-2025]

    Perry, C.: Colab Nvidia GPU. https://blog.tensorflow.org/2022/09/colabs-pay-as- you-go-offers-more-access-to-powerful-nvidia-compute-for-machine-learning.html (2022), [Accessed 13-11-2025]

  64. [64]

    Digital Threats5(1) (Mar 2024)

    Pimenta, T.S.R., Ceschin, F., Gregio, A.: Androidgyny: Reviewing clustering techniques for android malware family classification. Digital Threats5(1) (Mar 2024). https://doi.org/10.1145/3587471, https://doi.org/10.1145/ 3587471

  65. [65]

    https://www.infosecurity-magazine.com/news/malicious-ai-models-hugging-face/ (2025), [Accessed 09-11-2025]

    Poireault, K.: Malicious AI Models on Hugging Face Exploit Novel Attack Technique. https://www.infosecurity-magazine.com/news/malicious-ai-models-hugging-face/ (2025), [Accessed 09-11-2025]

  66. [66]

    In: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)

    Prusa, J., Khoshgoftaar, T.M., Seliya, N.: The effect of dataset size on training tweet sentiment classifiers. In: 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA). pp. 96–102 (2015).https://doi.org/10. 1109/ICMLA.2015.22

  67. [67]

    PyPI: Execute Search — pypi.org.https://pypi.org/search/?q=execute (2025), [Accessed 11-11-2025]

  68. [68]

    PyPI: PyPI · The Python Package Index — pypi.org. https://pypi.org (2025), [Accessed 10-11-2025]

  69. [69]

    https://docs.pytorch.org/docs/stable/index.html (2025), [Accessed 07-11-2025]

    PyTorch: Pytorch documentation; pytorch 2.9 documentation — docs.pytorch.org. https://docs.pytorch.org/docs/stable/index.html (2025), [Accessed 07-11-2025]

  70. [70]

    https: //huggingface.co/pytorch/Phi-4-mini-instruct-FP8/tree/main (2025), [Accessed 10-11-2025]

    PyTorch: pytorch/phi-4-mini-instruct-fp8 at main — huggingface.co. https: //huggingface.co/pytorch/Phi-4-mini-instruct-FP8/tree/main (2025), [Accessed 10-11-2025]

  71. [71]

    BMC Bioinformatics24(1), 48 (Feb 2023)

    Rajput, D., Wang, W.J., Chen, C.C.: Evaluation of a decided sample size in machine learning applications. BMC Bioinformatics24(1), 48 (Feb 2023)

  72. [72]

    Remote Sens

    Ramezan, C.A., Warner, T.A., Maxwell, A.E., Price, B.S.: Effects of training set size on supervised machine-learning land-cover classification of large-area high-resolution remotely sensed data. Remote Sens. (Basel)13(3), 368 (Jan 2021)

  73. [73]

    https://www.federalregister.gov/documents/ 2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of- artificial-intelligence (2023), [Accessed 09-11-2025]

    Register, F.: Federal register :: Request access. https://www.federalregister.gov/documents/ 2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of- artificial-intelligence (2023), [Accessed 09-11-2025]

  74. [74]

    ACM Transactions on Internet Technology

    Rondanini, C., Carminati, B., Ferrari, E., Kundu, A., Gaudiano, A.: Malware detection at the edge with lightweight llms: A performance evaluation. ACM Transactions on Internet Technology

  75. [75]

    Ronik: (2024), https://weam.ai/blog/guide/huggingface-statistics/

  76. [76]

    Electronics10(13) (2021)

    Senanayake, J., Kalutarage, H., Al-Kadri, M.O.: Android mobile malware detection using machine learning: A systematic review. Electronics10(13) (2021). https://doi.org/10.3390/electronics10131606, https://www. mdpi.com/2079-9292/10/13/1606

  77. [77]

    SparkNLP: Spark nlp.https://sparknlp.org/(2025), [Accessed 07-11-2025]

  78. [78]

    llm stacking: llm-stacking/G_learn_depth at main — huggingface.co. https://huggingface.co/llm-stacking/G_learn_depth/tree/main (2024), [Accessed 11-11-2025]

  79. [79]

    https://huggingface.co/star23/round2/tree/ main(2023), [Accessed 09-11-2025]

    Star23: star23/round2. https://huggingface.co/star23/round2/tree/ main(2023), [Accessed 09-11-2025]

  80. [80]

    Starovoitov, A., Borkmann, D.: ebpf - introduction, tutorials & community resources — ebpf.io.https://ebpf.io/(2025), [Accessed 12-10-2025]

Showing first 80 references.