Novel Interpretable and Robust Web-based AI Platform for Phishing Email Detection
Pith reviewed 2026-05-24 00:41 UTC · model grok-4.3
The pith
A machine learning model classifies phishing emails with an F1 score of 0.99 on what the authors describe as the largest public dataset, and it runs inside a web application with built-in explainable AI.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a high-performance machine learning model trained on the largest public phishing dataset reaches an F1 score of 0.99, integrates explainable AI to improve user trust, and is realized as a practical web-based platform ready for real-world deployment.
What carries the argument
The machine learning email classifier together with explainable AI modules, packaged as a web application for real-time use.
If this is right
- The web application allows users to classify emails in real time without needing local software installation.
- Explainable AI outputs increase user trust by showing which email features drive each phishing label.
- High accuracy on the largest public dataset positions the tool as a practical addition to existing anti-phishing defenses.
- The approach moves research from proprietary data settings to openly reproducible, deployable systems.
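The second point above, that explanations show which email features drive a label, can be pictured with a minimal Python sketch. It uses leave-one-word-out attribution, a simpler relative of perturbation-based text explainers, and is not claimed to be the paper's XAI module. The scorer `toy_phishing_score` and its keyword weights are hypothetical stand-ins for a trained classifier.

```python
from typing import Callable, List, Tuple

# Hypothetical keyword weights standing in for a trained classifier.
TOY_WEIGHTS = {"verify": 0.4, "password": 0.3, "urgent": 0.2}


def toy_phishing_score(text: str) -> float:
    """Return a toy phishing score: the sum of weights of known keywords."""
    return sum(TOY_WEIGHTS.get(word, 0.0) for word in text.lower().split())


def word_attributions(
    text: str, score_fn: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score each word by how much the prediction drops when it is removed."""
    words = text.split()
    base = score_fn(text)
    contributions = []
    for i, word in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        contributions.append((word, base - score_fn(ablated)))
    # Largest drop first: these are the words that drive the label most.
    return sorted(contributions, key=lambda wc: wc[1], reverse=True)


if __name__ == "__main__":
    for word, delta in word_attributions(
        "urgent verify your password now", toy_phishing_score
    ):
        print(f"{word:10s} {delta:+.2f}")
```

Words whose removal lowers the score most are the ones a user would see highlighted. A real explainer would perturb many words at once and fit a local surrogate model, which this sketch deliberately omits.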
Where Pith is reading between the lines
- If the model generalizes beyond the training distribution, email providers could embed similar classifiers directly into client software for automatic filtering.
- The same dataset-plus-XAI pattern could be tested on related tasks such as detection of phishing websites or SMS messages.
- Long-term monitoring of live email streams would reveal how quickly new phishing tactics degrade the reported F1 score.
Load-bearing premise
Performance measured on the chosen public dataset will translate to acceptable accuracy on the distribution of emails users actually receive in the wild, including novel phishing campaigns not present in the training data.
What would settle it
Running the deployed model on a fresh collection of phishing emails gathered after the public dataset was assembled and checking whether the F1 score remains near 0.99.
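As a rough illustration of this check, the sketch below computes F1 for the phishing class from scratch and reports how far a fresh sample falls below a benchmark figure. The labels and predictions in the example are made up for illustration, and the helper names `f1_score` and `f1_drop` are assumptions, not part of the paper.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 for the positive (phishing) class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_drop(benchmark_f1: float, y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """How far the F1 on a fresh sample falls below the benchmark figure."""
    return benchmark_f1 - f1_score(y_true, y_pred)


if __name__ == "__main__":
    # Hypothetical fresh sample: 1 = phishing, 0 = legitimate.
    fresh_true = [1, 1, 1, 1, 0, 0, 0, 0]
    fresh_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print("fresh F1:", round(f1_score(fresh_true, fresh_pred), 3))
    print("drop vs 0.99:", round(f1_drop(0.99, fresh_true, fresh_pred), 3))
```

A drop near zero on emails collected after the dataset was assembled would support the deployment claim; a large drop would indicate distribution shift.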
Original abstract
Phishing emails continue to pose a significant threat, causing financial losses and security breaches. This study addresses limitations in existing research, such as reliance on proprietary datasets and lack of real-world application, by proposing a high-performance machine learning model for email classification. Utilizing a comprehensive and largest available public dataset, the model achieves an F1 score of 0.99 and is designed for deployment within relevant applications. Additionally, Explainable AI (XAI) is integrated to enhance user trust. This research offers a practical and highly accurate solution, contributing to the fight against phishing by empowering users with a real-time web-based application for phishing email detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a machine learning model for phishing email detection trained on the largest available public dataset, reports an F1 score of 0.99, integrates Explainable AI (XAI) techniques, and delivers a web-based platform intended for real-time deployment and user-facing application.
Significance. A verified high-accuracy, interpretable phishing detector with an accompanying deployable web application would be a useful practical contribution, particularly if accompanied by reproducible code or explicit evaluation protocols that future work could build upon.
major comments (2)
- [Abstract] The central claim of an F1 score of 0.99 is presented without any description of model architecture, feature set, training/validation split, class balance, or cross-validation procedure, rendering the numerical result impossible to assess or reproduce.
- [Abstract] The assertion that the model 'is designed for deployment within relevant applications' rests on an untested i.i.d. assumption; no temporal hold-out, adversarial robustness checks, or evaluation on zero-day phishing campaigns absent from the public corpus are reported, which directly undermines the deployment suitability claim.
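The temporal hold-out requested in the second major comment could look roughly like the following sketch: train only on emails received before a cutoff date and test only on later ones, so that the test set contains campaigns the model could not have seen. The `Email` record, its field names, and the cutoff are assumptions made for illustration; the paper does not describe such a split.

```python
from datetime import datetime
from typing import List, NamedTuple, Sequence, Tuple


class Email(NamedTuple):
    received: datetime  # hypothetical timestamp field
    text: str
    label: int  # 1 = phishing, 0 = legitimate


def temporal_split(
    emails: Sequence[Email], cutoff: datetime
) -> Tuple[List[Email], List[Email]]:
    """Split by time: everything before the cutoff trains, the rest tests."""
    train = [e for e in emails if e.received < cutoff]
    test = [e for e in emails if e.received >= cutoff]
    return train, test


if __name__ == "__main__":
    corpus = [
        Email(datetime(2023, 1, 5), "quarterly report attached", 0),
        Email(datetime(2023, 6, 1), "verify your account", 1),
        Email(datetime(2024, 3, 1), "new invoice portal login", 1),
    ]
    train, test = temporal_split(corpus, cutoff=datetime(2024, 1, 1))
    print(len(train), "training emails;", len(test), "held-out later emails")
```

Unlike a random split, this keeps later emails out of training entirely, which is what makes the held-out score informative about novel campaigns.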
minor comments (2)
- The title advertises 'Robust' performance, yet no robustness experiments (e.g., against prompt injection, adversarial text perturbations, or distribution shift) appear in the provided text.
- [Abstract] The abstract states the dataset is 'the largest available public dataset' without a citation or explicit name/size comparison to prior corpora (e.g., Enron, SpamAssassin, or phishing-specific collections).
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on the abstract. We address each major comment below and indicate the changes we will make to the manuscript.
Point-by-point responses
- Referee: [Abstract] The central claim of an F1 score of 0.99 is presented without any description of model architecture, feature set, training/validation split, class balance, or cross-validation procedure, rendering the numerical result impossible to assess or reproduce.
  Authors: We agree that the abstract, due to space constraints, omits these details and that this limits immediate assessment of the reported result. The full manuscript describes the model architecture, feature engineering, the public dataset with its class balance, the train/validation/test splits, and the cross-validation procedure in Sections 3 and 4. To improve accessibility, we will revise the abstract to include a concise statement of the dataset, split strategy, and validation approach.
  Revision: yes
- Referee: [Abstract] The assertion that the model 'is designed for deployment within relevant applications' rests on an untested i.i.d. assumption; no temporal hold-out, adversarial robustness checks, or evaluation on zero-day phishing campaigns absent from the public corpus are reported, which directly undermines the deployment suitability claim.
  Authors: The referee is correct that the manuscript reports only a standard random split on the public corpus and does not include temporal hold-out, adversarial, or zero-day evaluations. The web platform demonstrates real-time inference on the trained model but does not constitute robustness testing under distribution shift. We will revise the abstract to qualify the deployment language, stating that the platform illustrates potential real-time use while noting that additional robustness evaluations would be required for production deployment.
  Revision: yes
Circularity Check
No circularity in derivation chain
Full rationale
The paper presents a standard supervised ML pipeline for binary classification on a public phishing dataset, reporting an F1 score of 0.99 on held-out test data together with an XAI component and a web deployment interface. No equations, fitted parameters, or uniqueness theorems are invoked; performance is measured by conventional train/test split metrics rather than any self-referential construction. The central claim (high accuracy on the chosen corpus) is therefore an empirical measurement, not a quantity that reduces to its own inputs by definition or by a self-citation chain. External generalization risk exists but is outside the scope of circularity analysis.
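The theorem ledger further down this page summarizes the pipeline as an SVM with TF-IDF preprocessing. As an illustration of the TF-IDF step only, the sketch below computes the textbook weighting (term frequency times the log of inverse document frequency) with the standard library. This is an assumed variant: libraries such as scikit-learn apply smoothed and normalized versions, and the paper's exact settings are not stated here. The example emails are invented.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[str]) -> List[Dict[str, float]]:
    """Textbook TF-IDF: (count / doc length) * log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


if __name__ == "__main__":
    emails = [
        "verify your account now",
        "meeting notes for your team",
        "verify your password",
    ]
    for vector in tfidf(emails):
        print({term: round(weight, 3) for term, weight in vector.items()})
```

Terms that occur in every email (here "your") receive weight zero, while terms concentrated in a few emails are weighted up, which is why the representation suits a linear classifier such as an SVM.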
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  SVM with TF-IDF preprocessing on merged Enron/Ling/CEAS/... dataset achieving F1=0.99; LIME for interpretability; Flask web deployment
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  No mention of recognition cost, golden ratio, or distinction-based emergence
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.