Machine Unlearning for the XGBoost Model with Network Intrusion Datasets
Pith reviewed 2026-06-26 21:14 UTC · model grok-4.3
The pith
XGBoost-Forget removes targeted data from network intrusion models without full retraining while keeping similar accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
XGBoost-Forget is an unlearning technique for the XGBoost model that removes the influence of chosen data points on tabular network intrusion datasets. When evaluated on IoT-23 and GeNIS, the method produces models whose predictive performance remains close to that of the original trained model, while the unlearning step completes in significantly less time than full retraining from scratch. The evaluation uses separate measures for overall model performance, the speed of the unlearning operation, and the quality of the forgetting achieved.
What carries the argument
XGBoost-Forget, an unlearning procedure for the XGBoost gradient boosting model that adjusts the trained model to remove the influence of targeted training records.
If this is right
- The same unlearning procedure can be applied to other tabular network intrusion datasets without changing the core approach.
- Operators gain a practical way to honor data removal requests while avoiding the full cost of retraining an XGBoost intrusion detector.
- Detection accuracy on unseen traffic stays nearly unchanged, so the operational value of the model is preserved after unlearning.
- The method scales to the size of typical IoT and network traffic logs where full retraining would be prohibitive.
Where Pith is reading between the lines
- Similar unlearning steps could be adapted for other tree ensemble models used in security monitoring beyond the two datasets tested here.
- Integration into live intrusion detection pipelines might allow periodic removal of outdated or compromised records with low overhead.
- The approach could lower the barrier for deploying gradient boosting models in regulated environments that require data erasure capabilities.
Load-bearing premise
The chosen metrics for performance, speed, and forgetting quality are enough to confirm that the influence of the removed data points has truly been eliminated rather than merely masked on the test set.
What would settle it
A verification step in which an auditor checks whether the unlearned model still produces outputs that encode information unique to the removed data points would directly test whether forgetting succeeded.
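One way to make such an audit concrete is a confidence-based membership check: if the unlearned model is systematically more confident on the removed records than on records it never saw, some influence has probably survived. The sketch below is an assumption about how an auditor might proceed, not the paper's procedure; the function name `membership_auc` and the sample confidences are hypothetical.

```python
def membership_auc(forget_scores, unseen_scores):
    """AUC of a threshold attack that guesses 'was trained on' from confidence.

    This is the Mann-Whitney U statistic normalised to [0, 1]: the
    probability that a random forget-set record scores higher than a
    random never-seen record (ties count as one half).
    """
    if not forget_scores or not unseen_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for f in forget_scores:
        for u in unseen_scores:
            if f > u:
                wins += 1.0
            elif f == u:
                wins += 0.5
    return wins / (len(forget_scores) * len(unseen_scores))


# Hypothetical confidences the unlearned model assigns to the true label.
forget = [0.91, 0.62, 0.78, 0.55]  # records that were supposed to be forgotten
unseen = [0.90, 0.60, 0.80, 0.57]  # held-out records never used in training

auc = membership_auc(forget, unseen)
print(f"membership AUC: {auc:.3f}")
```

An AUC near 0.5 means this simple attack cannot tell forgotten records from unseen ones; a value well above 0.5 would suggest residual influence. A result near 0.5 is supporting evidence only, since stronger attacks may still succeed.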
Original abstract
Machine Unlearning (MU) has emerged as an important technique for removing specific data points from trained models without requiring full retraining. However, most existing MU research focuses on deep learning and image data, leaving a gap in the domain of network intrusion detection, which relies heavily on tabular data. This work introduces XGBoost-Forget, an unlearning approach for the XGBoost model, to address this gap. The approach is evaluated on two tabular Network Intrusion (NI) datasets, IoT-23 and GeNIS, using multiple metrics to assess model performance, unlearning efficiency, and forgetting quality. The results show that XGBoost-Forget maintains predictive performance close to the original model while providing significantly faster unlearning, demonstrating its potential for MU in tabular NI settings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces XGBoost-Forget, an unlearning method for XGBoost models on tabular network intrusion datasets (IoT-23 and GeNIS). It evaluates the approach on metrics for model performance, unlearning efficiency, and forgetting quality, claiming that the method maintains predictive performance close to the original model while achieving significantly faster unlearning than full retraining.
Significance. If the central claim of verifiable forgetting holds, the work would address a relevant gap by extending machine unlearning to tree-based models on tabular security data. The empirical evaluation on two real-world network intrusion datasets provides practical grounding. However, the absence of quantitative results, baselines, and verification steps in the abstract limits how much of the contribution can be assessed at present.
major comments (2)
- [Abstract] Abstract: The abstract reports positive results on two datasets but provides no quantitative numbers, error bars, baseline comparisons, or details on how forgetting quality was measured. This makes the central empirical claims impossible to assess from the given information.
- [Evaluation] Evaluation section: The reported metrics (accuracy/F1 similarity to the original model and faster runtime) are compatible with the model retaining influence from the forget set via unchanged splits or leaf statistics. No comparison to exact retraining on the retain set or membership-inference/influence-function verification is described, which is load-bearing for establishing that targeted points have been removed.
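The comparison requested in the second major comment can be sketched as follows: treat a model retrained from scratch on the retain set as the reference, and measure how far the unlearned model's predicted class probabilities sit from it on the forget set. The choice of metrics (mean KL divergence and label agreement) and all names and numbers below are assumptions for illustration; the abstract does not specify the paper's own forgetting-quality metric.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def compare_to_retrained(unlearned_probs, retrained_probs):
    """Return (mean KL(retrained || unlearned), label agreement rate)."""
    if len(unlearned_probs) != len(retrained_probs) or not unlearned_probs:
        raise ValueError("need equal-length, non-empty prediction lists")
    n = len(unlearned_probs)
    pairs = list(zip(unlearned_probs, retrained_probs))
    mean_kl = sum(kl_divergence(r, u) for u, r in pairs) / n
    # Agreement: both models pick the same most-likely class.
    agree = sum(u.index(max(u)) == r.index(max(r)) for u, r in pairs)
    return mean_kl, agree / n


# Hypothetical per-record class probabilities on the forget set.
unlearned = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
retrained = [[0.9, 0.1], [0.3, 0.7], [0.4, 0.6]]

mean_kl, agreement = compare_to_retrained(unlearned, retrained)
print(f"mean KL: {mean_kl:.4f}, agreement: {agreement:.2f}")
```

A small divergence and high agreement would indicate that the unlearned model behaves like one that never saw the forget set, which is a stronger statement than matching the accuracy of the original model.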
minor comments (2)
- Clarify the exact definition and computation of the 'forgetting quality' metric, including any formulas or pseudocode.
- Add statistical significance tests or error bars to the performance comparisons to support the 'close to original' claim.
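As one possible way to supply the requested error bars, the sketch below computes a paired percentile-bootstrap confidence interval for the accuracy gap between the original and unlearned models. The function name, defaults, and toy labels are hypothetical, and the bootstrap is only one of several reasonable choices.

```python
import random


def bootstrap_accuracy_gap(y_true, pred_original, pred_unlearned,
                           n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy(original) - accuracy(unlearned).

    Test records are resampled with replacement, keeping both models'
    predictions paired on the same records.
    """
    n = len(y_true)
    if n == 0 or len(pred_original) != n or len(pred_unlearned) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    # Per-record contribution to the gap: +1, 0, or -1.
    diffs = [
        int(po == y) - int(pu == y)
        for y, po, pu in zip(y_true, pred_original, pred_unlearned)
    ]
    gaps = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_resamples))
    lo = gaps[int((alpha / 2) * n_resamples)]
    hi = gaps[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(diffs) / n, lo, hi


# Hypothetical labels and predictions on a 10-record test set.
y_true = [1] * 10
pred_original = [1] * 10
pred_unlearned = [1] * 8 + [0] * 2

point, lo, hi = bootstrap_accuracy_gap(y_true, pred_original, pred_unlearned)
print(f"accuracy gap: {point:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

If the interval is narrow and includes zero, the "close to original" claim has statistical support; a real evaluation would use the full test set and repeat the procedure across several forget sets.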
Simulated Author's Rebuttal
We thank the referee for their detailed review and constructive comments on our manuscript. We address each major comment below and plan to make revisions to strengthen the paper.
- Referee: [Abstract] Abstract: The abstract reports positive results on two datasets but provides no quantitative numbers, error bars, baseline comparisons, or details on how forgetting quality was measured. This makes the central empirical claims impossible to assess from the given information.
  Authors: We agree that the abstract would benefit from more specific details to allow readers to assess the claims. In the revised manuscript, we will update the abstract to include quantitative results from our experiments, such as the accuracy and F1 scores achieved, the speedup factors compared to retraining, and a brief mention of how forgetting quality was evaluated through performance similarity metrics.
  Revision: yes
- Referee: [Evaluation] Evaluation section: The reported metrics (accuracy/F1 similarity to the original model and faster runtime) are compatible with the model retaining influence from the forget set via unchanged splits or leaf statistics. No comparison to exact retraining on the retain set or membership-inference/influence-function verification is described, which is load-bearing for establishing that targeted points have been removed.
  Authors: This is a valid concern regarding the strength of evidence for actual unlearning. Our current approach evaluates by showing that the unlearned model maintains similar performance to the original while being faster than full retraining. To address this, we will add explicit comparisons to models retrained on the retain set only. For membership inference or influence function-based verification, these methods are not standard for XGBoost and tabular data; we will include a discussion of this limitation and why our metrics provide supporting evidence in this context.
  Revision: partial
Circularity Check
No derivation chain present; purely empirical evaluation
The manuscript is an applied empirical study introducing and benchmarking the XGBoost-Forget method on two tabular network-intrusion datasets. It reports performance, runtime, and forgetting-quality metrics but contains no equations, derivations, or first-principles claims that could reduce to their own inputs. Consequently there are no load-bearing steps of the enumerated circularity kinds. The central claim rests on experimental comparisons rather than any self-referential mathematical construction.