Privacy-Aware Machine Unlearning with SISA for Reinforcement Learning-Based Ransomware Detection
Pith reviewed 2026-05-10 07:48 UTC · model grok-4.3
The pith
SISA training lets RL ransomware detectors delete specific samples by retraining only one shard, with at most 0.05 percent F1-score loss.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Partitioning data into five shards with majority-vote aggregation enables unlearning by deleting 5 percent of samples from one shard and retraining only that shard; both DQN and DDQN agents retain near-identical in-distribution performance, with DDQN showing slightly lower utility loss and greater stability.
What carries the argument
SISA training with five shards and majority-vote aggregation, which isolates the effect of any deleted sample to a single shard that can be retrained independently.
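The shard-and-vote mechanism can be sketched as follows. This is a minimal illustration rather than the paper's implementation: `make_shards`, `majority_vote`, `unlearn_from_shard`, and the toy learner `train_threshold_model` are hypothetical names, the data is synthetic, and the threshold learner merely stands in for a DQN or DDQN agent trained on one shard.

```python
from collections import Counter
from statistics import mean

NUM_SHARDS = 5  # shard count reported in the paper


def make_shards(samples, num_shards=NUM_SHARDS):
    """Round-robin partition: every sample lands in exactly one shard."""
    return [list(samples[i::num_shards]) for i in range(num_shards)]


def train_threshold_model(shard):
    """Toy stand-in for training one RL agent on one shard (assumption).

    Each sample is (score, label). The 'model' flags ransomware (1) when the
    score exceeds the midpoint between the two class means.
    """
    benign = [x for x, y in shard if y == 0]
    ransom = [x for x, y in shard if y == 1]
    threshold = (mean(benign) + mean(ransom)) / 2
    return lambda x: int(x > threshold)


def majority_vote(models, x):
    """Aggregate discrete shard decisions; five binary voters cannot tie."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


def unlearn_from_shard(shards, models, shard_idx, to_delete, train_fn):
    """Drop the requested samples and retrain only the affected shard."""
    new_shards = list(shards)
    new_models = list(models)
    new_shards[shard_idx] = [s for s in shards[shard_idx] if s not in to_delete]
    new_models[shard_idx] = train_fn(new_shards[shard_idx])
    return new_shards, new_models


# Synthetic data for illustration only: 100 samples, 20 per shard.
samples = [(i / 100, int(i >= 50)) for i in range(100)]
shards = make_shards(samples)
models = [train_threshold_model(s) for s in shards]

# Deleting 5% of one 20-sample shard means removing a single sample.
shards_after, models_after = unlearn_from_shard(
    shards, models, shard_idx=0, to_delete={shards[0][0]},
    train_fn=train_threshold_model,
)
```

Only the model at index 0 is rebuilt; the other four are reused unchanged, which is where the retraining savings would come from.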
If this is right
- Data-deletion requests can be honored in deployed RL detectors without rebuilding the entire model.
- Retraining cost scales with shard size rather than full dataset size.
- DDQN exhibits marginally better stability than DQN under the same unlearning protocol.
- Continuous Q-score margin evaluation remains usable for ROC-AUC analysis after partial updates.
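The claim that retraining cost scales with shard size can be made concrete with a back-of-the-envelope calculation. The sketch below assumes equal-sized shards and a training cost proportional to the number of samples processed; both are simplifying assumptions, and `retrain_fraction` and the dataset size are hypothetical.

```python
def retrain_fraction(num_samples, num_shards=5, deleted_share_of_shard=0.05):
    """Share of full-retraining work needed when only one shard is retrained.

    Assumes equal shards and cost proportional to samples seen (assumption).
    """
    shard_size = num_samples / num_shards
    remaining_in_shard = shard_size * (1 - deleted_share_of_shard)
    remaining_total = num_samples - shard_size * deleted_share_of_shard
    return remaining_in_shard / remaining_total


# Hypothetical dataset size; the paper's actual size is not given here.
fraction = retrain_fraction(num_samples=10_000)
```

Under these assumptions, one shard with 5 percent of its samples removed costs about 0.19 of retraining all five shards, and the deleted samples amount to 1 percent of the whole dataset.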
Where Pith is reading between the lines
- The same shard-and-vote structure could be tested on other RL security tasks such as intrusion detection or malware family classification.
- Varying the number of shards or the deletion percentage would show the practical limits of the time-accuracy trade-off.
- Combining SISA with differential privacy mechanisms inside each shard might further strengthen formal privacy guarantees.
Load-bearing premise
Dividing the dataset into five shards and aggregating by majority vote after retraining only the affected shard does not introduce bias or instability into the reinforcement-learning agents.
What would settle it
A controlled repeat of the 5-percent deletion experiment in which F1 score drops more than 0.05 percent or detection variance rises measurably after shard retraining.
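A repeat experiment would need an explicit rule for checking the 0.05 percent bound. The summary does not say whether the figure is an absolute drop in percentage points or a drop relative to the original F1, so the sketch below reports both readings. The function names are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 from confusion counts: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_drop(before, after):
    """Return the drop as (percentage points, percent relative to 'before')."""
    points = 100 * (before - after)
    relative = 100 * (before - after) / before if before else 0.0
    return points, relative


def within_claim(before, after, limit=0.05):
    """Check the bound under both readings of '0.05 percent' (assumption)."""
    points, relative = f1_drop(before, after)
    return {
        "percentage_points": points <= limit,
        "relative_percent": relative <= limit,
    }
```

The two readings can disagree: a fall from 0.5 to 0.49965 is 0.035 percentage points but 0.07 percent in relative terms, so the replication protocol should fix one definition up front.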
Original abstract
Ransomware detection systems increasingly rely on behavior-based machine learning to address evolving attack strategies. However, emerging privacy compliance, data governance, and responsible AI deployment demand not only accurate detection but also the ability to efficiently remove the influence of specific training samples without retraining the models from scratch. In this study, we present a privacy-aware machine unlearning evaluation framework for reinforcement learning (RL)-based ransomware detection built on Sharded, Isolated, Sliced, and Aggregated (SISA) training. The framework enables efficient data deletion by retraining only the affected model shards rather than the entire detector, reducing the retraining cost while preserving detection performance. We conduct a controlled comparative study using value-based RL agents, including Deep Q-Network (DQN) and Double Deep Q-Network (DDQN), under identical experimental settings with a cost-sensitive reward design and 5-fold cross-validation on a Windows 11 ransomware dataset. Detection confidence is evaluated using a continuous Q-score margin, enabling ROC-AUC analysis beyond binary predictions. For unlearning, the dataset is partitioned into five shards with majority-vote aggregation, and a fast-unlearning path is evaluated by deleting 5% of the samples from a single shard and retraining only that shard. Results show that SISA-based unlearning incurs negligible utility degradation (≤ 0.05 percent F1 drop) while substantially reducing retraining time relative to full SISA retraining. DDQN exhibits slightly improved stability and lower utility loss than DQN, while both agents maintain near-identical in-distribution performance after unlearning. These findings indicate that SISA provides an efficient unlearning mechanism for RL-based ransomware detection, supporting privacy-aware deployment without compromising security effectiveness.
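The continuous Q-score margin mentioned in the abstract can be illustrated with a small sketch. The exact definition used in the paper is not given here, so the code assumes the margin is the Q-value of the "ransomware" action minus that of the "benign" action. `q_margin` and `roc_auc` are hypothetical helpers, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
def q_margin(q_values, positive_action=1, negative_action=0):
    """Assumed margin: Q(state, 'ransomware') - Q(state, 'benign')."""
    return q_values[positive_action] - q_values[negative_action]


def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Runs in O(P * N), fine for small examples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up Q-values as (Q_benign, Q_ransomware) pairs, for illustration only.
q_table = [(0.2, 0.9), (0.8, 0.1), (0.4, 0.5), (0.6, 0.3)]
labels = [1, 0, 1, 0]
margins = [q_margin(q) for q in q_table]
auc = roc_auc(margins, labels)
```

Because the margin is a real number, thresholds can be swept across it to trace an ROC curve, which a bare argmax action would not allow.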
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a SISA-based machine unlearning framework for reinforcement learning agents (DQN and DDQN) applied to ransomware detection on a Windows 11 dataset. Using 5-fold cross-validation and a cost-sensitive reward, it partitions data into five shards, deletes 5% of samples from one shard, retrains only that shard, and aggregates via majority vote. The central empirical claim is that this incurs at most 0.05% F1-score degradation while substantially reducing retraining time relative to full retraining, with DDQN showing marginally better stability and both agents preserving near-identical in-distribution performance; detection uses continuous Q-score margins for ROC-AUC evaluation.
Significance. If the results prove robust, the work offers a practical demonstration that SISA-style sharding can be adapted to value-based RL models in security domains to support efficient, privacy-compliant unlearning. This is relevant for regulated deployments where data deletion requests must be honored without full retraining. The emphasis on continuous Q-score margins rather than binary decisions is a constructive choice for evaluating nuanced detection behavior.
major comments (2)
- [Abstract] Abstract and experimental results: the headline claim of ≤0.05% F1 drop (and reduced retraining time) is presented without baselines, error bars, statistical tests, or reported variance across the 5-fold runs or multiple random seeds. The central performance assertion, which is load-bearing for the paper's contribution, is therefore unverifiable from the given evidence.
- [Methodology (SISA unlearning path)] Unlearning procedure and aggregation: majority-vote aggregation on discrete actions after single-shard retraining may not preserve the continuous Q-score margins used for ROC-AUC in stochastic value-based RL agents. RL policies are sensitive to data distribution shifts; the manuscript provides no analysis showing that post-unlearning Q-value distributions or decision thresholds remain stable.
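The second major comment can be illustrated with a toy example showing how a discrete vote and a continuous margin may diverge. This summary does not state how the paper combines Q-score margins across shards, so averaging them is an assumption here, and all names and numbers are hypothetical.

```python
from collections import Counter
from statistics import mean


def shard_votes(margins):
    """Each shard votes 'ransomware' (1) when its own margin is positive."""
    return [int(m > 0) for m in margins]


def vote_label(margins):
    """Discrete majority decision over the shard votes."""
    return Counter(shard_votes(margins)).most_common(1)[0][0]


def vote_share(margins):
    """Fraction of shards voting 1; takes only six values with five shards."""
    return sum(shard_votes(margins)) / len(margins)


def mean_margin(margins):
    """One possible continuous aggregate (assumption): average the margins."""
    return mean(margins)


# Made-up per-shard margins for two inputs.
sample_a = [0.1, 0.1, 0.1, -2.0, -2.0]  # three weak 'yes', two strong 'no'
sample_b = [0.9, 0.9, 0.9, 0.9, -0.1]   # four strong 'yes', one weak 'no'
```

Both inputs receive the majority label 1, yet the averaged margin for `sample_a` is negative. An ROC curve built on one aggregate may therefore not describe the decisions made by the other, and retraining a single shard can shift either quantity.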
minor comments (2)
- [Abstract] The abstract states 'near identical in-distribution performance' but does not specify the exact metrics, tables, or figures supporting this comparison.
- Experimental details such as shard sizes, exact deletion selection method, hyperparameter settings for DQN/DDQN, and full training curves are not summarized, limiting reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below and commit to revisions that strengthen the empirical presentation and analysis.
- Referee: [Abstract] Abstract and experimental results: the headline claim of ≤0.05% F1 drop (and reduced retraining time) is presented without baselines, error bars, statistical tests, or reported variance across the 5-fold runs or multiple random seeds. The central performance assertion, which is load-bearing for the paper's contribution, is therefore unverifiable from the given evidence.
  Authors: We agree that the abstract would benefit from greater statistical transparency to make the ≤0.05% F1 claim more verifiable. The manuscript already averages results over 5-fold cross-validation, but we will revise the abstract, results section, and tables to include error bars, standard deviations across folds and random seeds, statistical significance tests against full-retraining baselines, and explicit variance reporting. These additions will not change the reported findings but will improve rigor and address the load-bearing nature of the claim.
  Revision: yes
- Referee: [Methodology (SISA unlearning path)] Unlearning procedure and aggregation: majority-vote aggregation on discrete actions after single-shard retraining may not preserve the continuous Q-score margins used for ROC-AUC in stochastic value-based RL agents. RL policies are sensitive to data distribution shifts; the manuscript provides no analysis showing that post-unlearning Q-value distributions or decision thresholds remain stable.
  Authors: The referee correctly identifies a potential mismatch between discrete majority-vote aggregation and the continuous Q-score margins used for ROC-AUC evaluation. While our results show near-identical in-distribution performance after unlearning, we did not include explicit analysis of post-unlearning Q-value distributions or threshold stability. We will add this analysis in the revised manuscript, including comparative histograms of Q-score margins, decision-threshold sensitivity, and ROC-AUC stability before and after single-shard retraining for both DQN and DDQN.
  Revision: yes
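The fold-level reporting promised in the first response could look roughly like the sketch below. The F1 values are placeholders invented for illustration and are not results from the paper. `summarize` and `paired_t_statistic` are hypothetical helpers, and a paired t-test is only one reasonable choice of significance test.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Mean and sample standard deviation across folds."""
    return mean(values), stdev(values)


def paired_t_statistic(before, after):
    """Paired t statistic on per-fold differences (before - after)."""
    diffs = [b - a for b, a in zip(before, after)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all fold differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(len(diffs)))


# Two-sided 5% critical value of Student's t with 4 degrees of freedom.
T_CRITICAL_DF4 = 2.776

# Placeholder per-fold F1 scores, NOT taken from the paper.
f1_full_retrain = [0.990, 0.985, 0.992, 0.988, 0.991]
f1_after_unlearn = [0.989, 0.985, 0.991, 0.988, 0.990]

mean_before, sd_before = summarize(f1_full_retrain)
mean_after, sd_after = summarize(f1_after_unlearn)
t_stat = paired_t_statistic(f1_full_retrain, f1_after_unlearn)
significant = abs(t_stat) > T_CRITICAL_DF4
```

With only five folds the test has little power, so reporting the per-fold differences alongside the means and standard deviations, and repeating over several random seeds, would be more informative than a single significance verdict.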
Circularity Check
No circularity: purely empirical results from controlled experiments
The manuscript reports direct experimental outcomes from applying the SISA framework to value-based RL agents (DQN/DDQN) on a ransomware dataset. It partitions data into five shards, retrains only the affected shard after 5% deletion, aggregates via majority vote, and measures F1 and ROC-AUC under 5-fold cross-validation. All reported quantities (utility degradation ≤0.05% F1, retraining time savings) are observed metrics from these runs, not derived quantities that reduce to fitted parameters or self-referential definitions. No equations, uniqueness theorems, or ansatzes are invoked that collapse the central claims to the inputs by construction. The work is self-contained against its own experimental benchmarks.