
arxiv: 2405.11619 · v2 · submitted 2024-05-19 · 💻 cs.LG · cs.AI

Novel Interpretable and Robust Web-based AI Platform for Phishing Email Detection

Pith reviewed 2026-05-24 00:41 UTC · model grok-4.3

keywords: phishing email detection, machine learning, explainable AI, web-based application, email classification, cybersecurity

The pith

A machine learning model classifies phishing emails with an F1 score of 0.99 on the largest public dataset and runs inside a web application with built-in explainable AI.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a machine learning classifier that labels incoming emails as phishing or legitimate. Training occurs on the largest available public dataset of such emails, producing an F1 score of 0.99. The system adds explainable AI components so users can see the reasons behind each classification and packages everything as a deployable real-time web application. A sympathetic reader would care because phishing causes direct financial losses and security breaches, and an accessible, high-accuracy tool could let ordinary users check suspicious messages before they act on them.

Core claim

The central claim is that a high-performance machine learning model trained on the largest public phishing dataset reaches an F1 score of 0.99, integrates explainable AI to improve user trust, and is realized as a practical web-based platform ready for real-world deployment.

What carries the argument

The machine learning email classifier together with explainable AI modules, packaged as a web application for real-time use.

If this is right

  • The web application allows users to classify emails in real time without needing local software installation.
  • Explainable AI outputs increase user trust by showing which email features drive each phishing label.
  • High accuracy on the largest public dataset positions the tool as a practical addition to existing anti-phishing defenses.
  • The approach moves research from proprietary data settings to openly reproducible, deployable systems.
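The second consequence above, explanations that show which email features drive a label, can be illustrated with a minimal sketch. The text above does not say which XAI technique the paper uses, so this example substitutes the simplest case: additive per-token attribution for a linear bag-of-words classifier. The token weights, the bias, and the function name `explain` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical per-token weights of a linear phishing classifier.
# Illustrative values only; they are not taken from the paper.
WEIGHTS = {"verify": 1.2, "account": 0.8, "urgent": 1.5, "meeting": -1.0, "thanks": -0.7}
BIAS = -0.5


def explain(email_text):
    """Return (phishing probability, token contributions ranked by magnitude)."""
    tokens = email_text.lower().split()
    contributions = {}
    for tok in tokens:
        if tok in WEIGHTS:
            # In a linear model each token adds its weight to the score,
            # so the weight itself is the token's contribution.
            contributions[tok] = contributions.get(tok, 0.0) + WEIGHTS[tok]
    score = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-score))  # logistic link
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked


prob, reasons = explain("urgent verify your account")
print(f"phishing probability: {prob:.2f}")
for token, weight in reasons:
    print(f"  {token}: {weight:+.1f}")
```

For a non-linear model the contributions are not readable from the weights directly, which is where local surrogate methods come in; the sketch only shows the kind of output a user would see.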

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the model generalizes beyond the training distribution, email providers could embed similar classifiers directly into client software for automatic filtering.
  • The same dataset-plus-XAI pattern could be tested on related tasks such as detection of phishing websites or SMS messages.
  • Long-term monitoring of live email streams would reveal how quickly new phishing tactics degrade the reported F1 score.

Load-bearing premise

Performance measured on the chosen public dataset will translate to acceptable accuracy on the distribution of emails users actually receive in the wild, including novel phishing campaigns not present in the training data.

What would settle it

Running the deployed model on a fresh collection of phishing emails gathered after the public dataset was assembled and checking whether the F1 score remains near 0.99.
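A minimal sketch of that check, assuming the fresh emails have been hand-labelled and run through the model. The label lists below are made up for illustration and say nothing about the model's real behaviour; only the 0.99 figure comes from the paper. The helper `f1_score` is defined here and is not a library call.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


REPORTED_F1 = 0.99  # figure stated in the paper's abstract

# Made-up labels for a fresh sample (1 = phishing, 0 = legitimate).
fresh_true = [1, 1, 1, 1, 0, 0, 0, 0]
fresh_pred = [1, 1, 0, 0, 0, 0, 0, 1]

fresh_f1 = f1_score(fresh_true, fresh_pred)
print(f"fresh-sample F1: {fresh_f1:.2f}, gap to reported: {REPORTED_F1 - fresh_f1:.2f}")
```

With these invented labels the score is 4/7, roughly 0.57. A real test would need a sample large enough that the gap is not explained by sampling noise.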

Original abstract

Phishing emails continue to pose a significant threat, causing financial losses and security breaches. This study addresses limitations in existing research, such as reliance on proprietary datasets and lack of real-world application, by proposing a high-performance machine learning model for email classification. Utilizing a comprehensive dataset, the largest publicly available, the model achieves an F1 score of 0.99 and is designed for deployment within relevant applications. Additionally, Explainable AI (XAI) is integrated to enhance user trust. This research offers a practical and highly accurate solution, contributing to the fight against phishing by empowering users with a real-time web-based application for phishing email detection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a machine learning model for phishing email detection trained on the largest available public dataset, reports an F1 score of 0.99, integrates Explainable AI (XAI) techniques, and delivers a web-based platform intended for real-time deployment and user-facing application.

Significance. A verified high-accuracy, interpretable phishing detector with an accompanying deployable web application would be a useful practical contribution, particularly if accompanied by reproducible code or explicit evaluation protocols that future work could build upon.

major comments (2)
  1. [Abstract] The central claim of an F1 score of 0.99 is presented without any description of model architecture, feature set, training/validation split, class balance, or cross-validation procedure, rendering the numerical result impossible to assess or reproduce.
  2. [Abstract] The assertion that the model 'is designed for deployment within relevant applications' rests on an untested i.i.d. assumption; no temporal hold-out, adversarial robustness checks, or evaluation on zero-day phishing campaigns absent from the public corpus are reported, which directly undermines the deployment suitability claim.
minor comments (2)
  1. The title advertises 'Robust' performance, yet no robustness experiments (e.g., against prompt injection, adversarial text perturbations, or distribution shift) appear in the provided text.
  2. [Abstract] The abstract states the dataset is 'the largest available public dataset' without a citation or explicit name/size comparison to prior corpora (e.g., Enron, SpamAssassin, or phishing-specific collections).
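To make concrete what major comment 1 asks the authors to report, here is a minimal sketch of a stratified k-fold assignment that also prints the class balance of each fold. It is not the paper's protocol, which the text above does not describe; the corpus size, the 20% phishing share, and the function name `stratified_folds` are assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each index to one of k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Made-up corpus: 20 phishing (1) and 80 legitimate (0) emails.
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=5)

for i, fold in enumerate(folds):
    phishing_share = sum(labels[j] for j in fold) / len(fold)
    print(f"fold {i}: size={len(fold)}, phishing share={phishing_share:.2f}")
```

Reporting the fold sizes and per-fold class shares alongside the F1 score is the kind of detail that would let a reader judge whether 0.99 reflects the classifier or the split.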

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on the abstract. We address each major comment below and indicate the changes we will make to the manuscript.

Point-by-point responses
  1. Referee: [Abstract] The central claim of an F1 score of 0.99 is presented without any description of model architecture, feature set, training/validation split, class balance, or cross-validation procedure, rendering the numerical result impossible to assess or reproduce.

    Authors: We agree that the abstract, due to space constraints, omits these details and that this limits immediate assessment of the reported result. The full manuscript describes the model architecture, feature engineering, the public dataset with its class balance, the train/validation/test splits, and the cross-validation procedure in Sections 3 and 4. To improve accessibility, we will revise the abstract to include a concise statement of the dataset, split strategy, and validation approach.

    Revision: yes

  2. Referee: [Abstract] The assertion that the model 'is designed for deployment within relevant applications' rests on an untested i.i.d. assumption; no temporal hold-out, adversarial robustness checks, or evaluation on zero-day phishing campaigns absent from the public corpus are reported, which directly undermines the deployment suitability claim.

    Authors: The referee is correct that the manuscript reports only a standard random split on the public corpus and does not include temporal hold-out, adversarial, or zero-day evaluations. The web platform demonstrates real-time inference on the trained model but does not constitute robustness testing under distribution shift. We will revise the abstract to qualify the deployment language, stating that the platform illustrates potential real-time use while noting that additional robustness evaluations would be required for production deployment.

    Revision: yes
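The difference between the random split the authors report and the temporal hold-out the referee asks for can be sketched as follows. The email records, dates, cutoff, and function names are invented for illustration; the point is only that a temporal split guarantees every test email is newer than every training email, while a random split does not.

```python
import random
from datetime import date

# Made-up records: (received date, label), with 1 = phishing and 0 = legitimate.
emails = [
    (date(2022, 1, 10), 0), (date(2022, 3, 5), 1),
    (date(2022, 6, 20), 0), (date(2022, 9, 1), 1),
    (date(2023, 2, 14), 0), (date(2023, 5, 30), 1),
    (date(2023, 8, 8), 1), (date(2023, 11, 23), 0),
]


def random_split(records, test_fraction, seed=0):
    """Shuffle, then hold out a fraction: old and new emails mix in both sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, cutoff):
    """Train on everything before the cutoff, test on everything from it onward."""
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


train_r, test_r = random_split(emails, test_fraction=0.25)
train_t, test_t = temporal_split(emails, cutoff=date(2023, 1, 1))

latest_train = max(d for d, _ in train_t)
earliest_test = min(d for d, _ in test_t)
print(f"temporal split: train ends {latest_train}, test starts {earliest_test}")
```

A score computed on `test_t` estimates performance on campaigns that postdate the training data, which is the situation a deployed filter faces.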

Circularity Check

0 steps flagged

No circularity in derivation chain

Full rationale

The paper presents a standard supervised ML pipeline for binary classification on a public phishing dataset, reporting an F1 score of 0.99 on held-out test data together with an XAI component and a web deployment interface. No equations, fitted parameters, or uniqueness theorems are invoked; performance is measured by conventional train/test split metrics rather than any self-referential construction. The central claim (high accuracy on the chosen corpus) is therefore an empirical measurement, not a quantity that reduces to its own inputs by definition or by a self-citation chain. External generalization risk exists but is outside the scope of circularity analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical axioms, free parameters, or invented entities are stated in the abstract; the work rests on the unstated assumption that standard supervised learning on the chosen corpus produces a deployable detector.


