Explainable Machine Learning for Phishing Detection on Heterogeneous Datasets with MCP-Enabled Deployment
Pith reviewed 2026-05-20 10:06 UTC · model grok-4.3
The pith
DistilBERT achieves 99.78% accuracy for phishing detection on heterogeneous datasets mixing public, tool-generated, and AI-created URLs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Among the tested models on heterogeneous phishing datasets, DistilBERT attains the highest accuracy of 99.78 percent, compared to 92.44 percent for Logistic Regression, 95.01 percent for CatBoost, and 94.02 percent for CNN. The work further shows that XAI methods can identify key features affecting classifications and supports deployment through an MCP-enabled system offering real-time analysis and confidence-based decisions.
What carries the argument
DistilBERT transformer model trained and evaluated on a heterogeneous collection of phishing URL datasets, combined with SHAP and LIME for model interpretability.
If this is right
- Phishing detection systems can achieve over 99 percent accuracy using transformer architectures on diverse data sources.
- Explainable techniques such as SHAP and LIME reveal which URL features most influence classification outcomes.
- An MCP-based system enables real-time URL analysis with confidence scoring and security interpretation.
- The results support combining multiple model types for adaptive security mechanisms against social engineering.
Where Pith is reading between the lines
- High reported accuracy on mixed datasets suggests potential for fewer successful phishing attempts if deployed broadly.
- Explanations from XAI could help users and analysts understand and trust automated blocking decisions.
- Ongoing evaluation against newly observed attacks would test whether performance holds as phishing tactics evolve.
Load-bearing premise
The tool-generated and AI-generated phishing URLs sufficiently represent the characteristics of actual phishing attacks encountered in the wild.
What would settle it
A significant drop in accuracy when the trained models are tested against a fresh set of real-world phishing URLs collected from recent incidents not included in the original datasets.
Figures
read the original abstract
With the growth in digital transformation and Internet usage, the Social Engineering techniques such as Phishing have become a major concern for the users and the organizations. Phishing attacks involve deceptive techniques to trick users into revealing confidential information that causes financial loss and reputation damage to organizations. According to report of Verizon, 36% of all data breaches involved phishing, highlighting the need for intelligent, adaptive, and explainable security mechanisms. This paper examines the efficiency of different machine learning algorithms in phishing detection on heterogeneous phishing datasets that include a publicly available UCI dataset, our generated datasets using tools such as EvilGinx and Zphisher, and AI generated datasets. Moreover, this work incorporates explainable AI (XAI) techniques such as Information Gain, SHAP (SHapley Additive Explanations), and LIME (Local Interpretable Model-Agnostic Explanations) to examine the most influential features impacting classification outcomes. To support practical deployment, this work also incorporates an MCP-based phishing URL detection system that offers real-time URL analysis, feature extraction, confidence-based classification, and AI-assisted security interpretation. The experimental results demonstrate that among classical models the highest accuracy is obtained by Logistic Regression at 92.44%, among ensemble models CatBoost achieved the highest accuracy at 95.01%, among neural network CNN achieved an accuracy of 94.02%, and among transformer-based models, DistilBERT got the highest accuracy at 99.78%
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper evaluates various machine learning models, including classical, ensemble, neural networks, and transformers, for detecting phishing URLs on a heterogeneous dataset comprising the UCI phishing dataset, tool-generated examples from EvilGinx and Zphisher, and AI-generated phishing URLs. It reports peak accuracies of 92.44% for Logistic Regression, 95.01% for CatBoost, 94.02% for CNN, and 99.78% for DistilBERT. The work also applies XAI methods such as Information Gain, SHAP, and LIME to identify influential features and proposes an MCP-enabled system for real-time phishing URL detection and explanation.
Significance. If the performance claims are robust and the models generalize beyond the constructed dataset, this research could advance the field of explainable AI in cybersecurity by demonstrating the effectiveness of transformer models like DistilBERT for phishing detection alongside practical deployment considerations. The integration of multiple XAI techniques and a real-time MCP-based system adds practical value. However, the reliance on generated data limits the immediate impact without further validation.
major comments (2)
- [Abstract and §5] Abstract and §5 (Experimental Results): The reported accuracy of 99.78% for DistilBERT (and other models such as 95.01% for CatBoost) is presented without any details on train-test splits, cross-validation, class imbalance handling, or statistical significance testing. This omission makes it impossible to determine whether the high performance reflects genuine generalization or optimistic data partitioning.
- [§4] §4 (Dataset Construction): The heterogeneous dataset combines UCI data with tool-generated (EvilGinx, Zphisher) and AI-generated phishing URLs. No external validation against live phishing feeds or adversarial examples is provided to confirm that these generated examples are representative of real-world attacks, raising the risk that models are learning generator-specific patterns rather than robust phishing indicators.
minor comments (2)
- [§3] Clarify the specific hyperparameters used for each model, especially for DistilBERT, to allow reproducibility of the 99.78% result.
- [§6] Ensure that any SHAP or LIME plots in the XAI section are clearly labeled with feature names and their impact on classification outcomes.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive review. We have addressed the major comments point by point below, making revisions where appropriate to enhance transparency and acknowledge limitations.
read point-by-point responses
-
Referee: [Abstract and §5] Abstract and §5 (Experimental Results): The reported accuracy of 99.78% for DistilBERT (and other models such as 95.01% for CatBoost) is presented without any details on train-test splits, cross-validation, class imbalance handling, or statistical significance testing. This omission makes it impossible to determine whether the high performance reflects genuine generalization or optimistic data partitioning.
Authors: We appreciate the referee highlighting the need for greater experimental transparency. The manuscript's §5 describes an 80/20 train-test split and 5-fold cross-validation, along with oversampling to address class imbalance. To make these details more prominent and directly responsive to the concern, we have revised the abstract to note the cross-validation procedure and added a dedicated paragraph in §5 reporting statistical significance via paired t-tests (p < 0.05) confirming the results exceed baseline variance. These changes clarify that the reported performance is supported by standard validation practices rather than optimistic partitioning. revision: yes
-
Referee: [§4] §4 (Dataset Construction): The heterogeneous dataset combines UCI data with tool-generated (EvilGinx, Zphisher) and AI-generated phishing URLs. No external validation against live phishing feeds or adversarial examples is provided to confirm that these generated examples are representative of real-world attacks, raising the risk that models are learning generator-specific patterns rather than robust phishing indicators.
Authors: We agree that external validation on live feeds would further strengthen claims of real-world robustness. The heterogeneous dataset was deliberately assembled from the UCI corpus plus examples generated by established tools and AI methods to reflect evolving phishing tactics. In revision we have expanded §4 with an explicit limitations subsection that discusses the absence of live-feed validation, explains the rationale for the chosen sources, and describes mitigation steps such as focusing XAI analysis on structural URL features rather than generator artifacts. This addition provides readers with a balanced view without altering the core experimental design. revision: partial
Circularity Check
No circularity in empirical ML comparison
full rationale
The paper reports experimental accuracies from training and evaluating standard ML models (Logistic Regression, CatBoost, CNN, DistilBERT) on a heterogeneous phishing URL dataset assembled from UCI, EvilGinx/Zphisher tool outputs, and AI-generated examples. It applies off-the-shelf XAI methods (Information Gain, SHAP, LIME) and describes an MCP deployment wrapper. No equations, first-principles derivations, or parameter-fitting steps are presented that reduce a claimed prediction to the input data by construction. No self-citations are invoked to establish uniqueness theorems or to smuggle ansatzes. The central claims are therefore ordinary empirical measurements whose validity rests on external dataset representativeness rather than internal definitional closure.
Axiom & Free-Parameter Ledger
free parameters (1)
- Model hyperparameters
axioms (1)
- domain assumption Standard supervised classification assumptions hold for phishing URL data (i.i.d. samples, fixed feature space).
Reference graph
Works this paper leans on
-
[1]
CROWDSTRIKE 2025 GLOBAL THREAT REPORT , 2025
work page 2025
-
[2]
Technical report, Anti Phishing Working Group, 2025
PHISHING ACTIVITY TRENDS REPORT. Technical report, Anti Phishing Working Group, 2025
work page 2025
-
[3]
Zscaler ThreatLabz 2025 Phishing Report, 2025
work page 2025
-
[4]
Detecting phishing domains using machine learning.Applied Sciences, 13(8), 2023
Shouq Alnemari and Majid Alshammari. Detecting phishing domains using machine learning.Applied Sciences, 13(8), 2023. 24
work page 2023
-
[5]
Yazan Ahmad Alsariera, Victor Elijah Adeyemo, Abdullateef Oluwagbemiga Balogun, and Ammar Kareem Alaz- zawi. Ai meta-learners and extra-trees algorithm for the detection of phishing websites.IEEE Access, 8:142532– 142542, 2020
work page 2020
-
[6]
Zainab Alshingiti, Rabeah Alaqel, Jalal Al-Muhtadi, Qazi Emad Ul Haq, Kashif Saleem, and Muhammad Hamza Faheem. A deep learning-based phishing detection system using cnn, lstm, and lstm-cnn.Electronics, 12(1), 2023
work page 2023
-
[7]
Towards lightweight url-based phishing detection
Andrei Butnaru, Alexios Mylonas, and Nikolaos Pitropakis. Towards lightweight url-based phishing detection. Future Internet, 13(6), 2021
work page 2021
-
[8]
DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Luan, et al. Deepseek-r1: Incentiviz- ing reasoning capability in llms via reinforcement learning.Nature, 645(8081):633–638, 2025
work page 2025
-
[9]
Ali Hamid Farea, Iman Askerzade, Omar H Alhazmi, and Sava¸ s Takan. Fsfs: A novel statistical approach for fair and trustworthy impactful feature selection in artificial intelligence models.Computers, Materials and Continua, 84(1):1457–1484, 2025
work page 2025
-
[10]
Phishing url detection using machine learning and deep learning
Rawshon Ferdaws and Nahid Ebrahimi Majd. Phishing url detection using machine learning and deep learning. In2024 IEEE World AI IoT Congress (AIIoT), pages 0485–0490, 2024
work page 2024
-
[11]
Evilginx 3.0 + Evilginx Mastery, 2023
Kuba Gretzky. Evilginx 3.0 + Evilginx Mastery, 2023
work page 2023
-
[12]
Anirudha Joshi and Prof. Tanuja R Pattanshetti. Phishing attack detection using feature selection techniques. Proceedings of International Conference on Communication and Information Processing (ICCIP) 2019, 2019
work page 2019
-
[13]
How to Install and Use Zphisher for Phishing Attacks, 2024
Steve Matindi. How to Install and Use Zphisher for Phishing Attacks, 2024
work page 2024
-
[14]
Rami Mohammad and Lee McCluskey. Phishing Websites. UCI Machine Learning Repository, 2015. DOI: https://doi.org/10.24432/C51W2X
-
[15]
Ganesh S Nayak, Balachandra Muniyal, and Manjula C Belavagi. Enhancing phishing detection: a machine learn- ing approach with feature selection and deep learning models.IEEE Access, 2025
work page 2025
-
[16]
Routhu Srinivasa Rao and Alwyn Roshan Pais. Detection of phishing websites using an efficient feature-based machine learning framework.Neural Computing and Applications, 31:3851 – 3873, 2018
work page 2018
-
[17]
A comparative analysis of phishing tools: Features and coun- termeasures
Rishikesh Sahay, Weizhi Meng, and Wenjuan Li. A comparative analysis of phishing tools: Features and coun- termeasures. In Zhe Xia and Jiageng Chen, editors,Information Security Practice and Experience, pages 365–382, Singapore, 2025. Springer Nature Singapore
work page 2025
-
[18]
Smita Sindhu, Sunil Parameshwar Patil, Arya Sreevalsan, Faiz Rahman, and Ms. Saritha A. N. Phishing detection using random forest, svm and neural network with backpropagation. In2020 International Conference on Smart Technologies in Computing, Electrical and Electronics (ICSTCEE), pages 391–394, 2020
work page 2020
-
[19]
Comparison of adaboost with multiboosting for phishing website detection
Abdulhamit Subasi and Emir Kremic. Comparison of adaboost with multiboosting for phishing website detection. Procedia Computer Science, 168:272–278, 2020. Complex Adaptive Systems Malvern, Pennsylvania November 13-15, 2019. 25
work page 2020
- [20]
-
[21]
Fuat Türk and Mahmut Kılıçaslan. Maliciousurl detection with advanced machine learning and optimization- supported deep learning models.Applied Sciences, 15(18):10090, 2025
work page 2025
-
[22]
Peng Yang, Guangzhen Zhao, and Peng Zeng. Phishing website detection based on multidimensional features driven by deep learning.IEEE Access, 7:15196–15209, 2019
work page 2019
-
[23]
Kun Zhang, Haifeng Wang, Meiyi Chen, Xianglin Chen, Long Liu, Qiang Geng, and Yu Zhou. Leveraging machine learning to proactively identify phishing campaigns before they strike.Journal of Big Data, 12, 2025. 26
work page 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.