PHISHREV: A Hybrid Machine Learning and Post-Hoc Non-monotonic Reasoning Framework for Context-Aware Phishing Website Classification

Amlan Chakrabarti; Kumar Sankar Ray; Mainak Sen

arxiv: 2604.25512 · v1 · submitted 2026-04-28 · 💻 cs.AI

Pith reviewed 2026-05-07 16:16 UTC · model grok-4.3

classification 💻 cs.AI

keywords phishing detection · machine learning · answer set programming · non-monotonic reasoning · hybrid framework · context-aware classification · decision refinement

The pith

A hybrid framework uses answer set programming to revise machine learning predictions for more consistent phishing website detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes integrating standard machine learning classifiers with a post-hoc layer of non-monotonic reasoning based on answer set programming. This layer applies expert-encoded rules to revise classifier outputs when they conflict with known phishing contexts. A sympathetic reader would care because machine learning models alone often lack contextual understanding and require expensive retraining when threats evolve. The approach changes a modest fraction of decisions while supporting rapid insertion of new rules.

Core claim

The PHISHREV framework combines machine learning classifiers with non-monotonic reasoning via Answer Set Programming to perform context-aware decision refinement. The post-hoc reasoning layer incorporates expert knowledge to revise classifier predictions through formal belief revisions. Experimental results indicate that the reasoning module modifies 5.08% of classifier outputs, leading to improved decision consistency, and that new domain knowledge can be incorporated into the reasoning layer in O(n) time without retraining the model.

What carries the argument

The post-hoc non-monotonic reasoning layer using Answer Set Programming (ASP) that performs formal belief revision on the outputs of the machine learning classifier.
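
To make the mechanism concrete, here is a minimal sketch of a post-hoc revision step in plain Python. It is a stand-in for the idea, not the paper's ASP encoding: the `Rule` type, the feature names, the priorities, and both example rules are hypothetical. The classifier's label is kept by default, and the highest-priority rule that fires may override it. Adding one more fact (`on_allowlist`) withdraws an earlier conclusion, which is the non-monotonic behaviour the layer relies on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, bool]


@dataclass
class Rule:
    name: str
    condition: Callable[[Features], bool]  # when the rule fires
    conclusion: str                        # "phishing" or "legitimate"
    priority: int                          # a higher priority defeats a lower one


def revise(ml_label: str, features: Features, rules: List[Rule]) -> str:
    """Keep the classifier's label by default; the strongest firing rule may override it."""
    fired = [r for r in rules if r.condition(features)]
    if not fired:
        return ml_label  # no contextual evidence: the belief is left unchanged
    return max(fired, key=lambda r: r.priority).conclusion


# Hypothetical expert rules, invented for illustration only.
RULES = [
    Rule("ip_host_with_login_form",
         lambda f: f.get("ip_as_host", False) and f.get("has_login_form", False),
         "phishing", priority=1),
    Rule("allowlisted_domain",
         lambda f: f.get("on_allowlist", False),
         "legitimate", priority=2),
]

suspicious = {"ip_as_host": True, "has_login_form": True}
print(revise("legitimate", suspicious, RULES))                            # rule 1 overrides the ML label
print(revise("legitimate", {**suspicious, "on_allowlist": True}, RULES))  # the extra fact retracts that override
print(revise("phishing", {}, RULES))                                      # nothing fires: ML label stands
```

A real ASP program would express the same pattern with default negation rather than explicit priorities; the Python version only shows the shape of the decision.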

If this is right

The reasoning module changes 5.08% of the machine learning classifier outputs.
Decision consistency improves after the reasoning layer is applied.
New domain knowledge integrates into the system in linear time without any need to retrain the underlying classifier.
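
The 5.08% figure is a modification rate: the share of classifier outputs whose label differs after the reasoning layer runs. A minimal sketch of that calculation follows; the function name and the toy labels are assumptions, and the paper's actual sample counts are not given in the abstract.

```python
from typing import Sequence


def modification_rate(before: Sequence[str], after: Sequence[str]) -> float:
    """Fraction of predictions whose label changed between the two passes."""
    if len(before) != len(after) or not before:
        raise ValueError("need two non-empty label lists of equal length")
    changed = sum(1 for b, a in zip(before, after) if b != a)
    return changed / len(before)


# Toy data (hypothetical): 20 classifier outputs, one flipped by the rules.
ml_labels = ["legitimate"] * 12 + ["phishing"] * 8
revised = list(ml_labels)
revised[3] = "phishing"

rate = modification_rate(ml_labels, revised)
print(f"{rate:.2%}")
```

Note that this number says how often the layer intervenes, not whether the interventions were correct.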

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This design could let defenders update phishing detectors quickly when new attack patterns appear by editing rules instead of gathering fresh training data.
The same hybrid pattern might apply to other security tasks such as fraud detection where explicit domain rules can correct statistical errors.
Explicit rule-based revisions could make the overall system more auditable than a pure black-box model.

Load-bearing premise

The expert knowledge encoded in the ASP rules accurately reflects real-world phishing contexts and revising the machine learning predictions based on this reasoning improves the actual correctness of classifications.

What would settle it

A comparison of classification accuracy on a labeled test set of phishing and legitimate websites before and after the reasoning layer is applied, to determine whether the revisions raise or lower the true positive and false positive rates.
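
The comparison described above can be sketched in a few lines. The helper name `tpr_fpr` and all labels below are invented for illustration (1 = phishing, 0 = legitimate); they are not results from the paper.

```python
from typing import Sequence, Tuple


def tpr_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy labelled set (hypothetical).
y_true  = [1, 1, 1, 1, 0, 0, 0, 0]
ml_pred = [1, 1, 0, 0, 0, 0, 1, 0]  # classifier alone
revised = [1, 1, 1, 0, 0, 0, 0, 0]  # after the reasoning layer

print("before:", tpr_fpr(y_true, ml_pred))
print("after: ", tpr_fpr(y_true, revised))
```

In this toy case the revisions help on both rates; on real data the same comparison could just as easily show the opposite, which is why it is the deciding test.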

Figures

Figures reproduced from arXiv: 2604.25512 by Amlan Chakrabarti, Kumar Sankar Ray, Mainak Sen.

**Figure 1.** Proposed hybrid phishing detection framework integrating machine …

Original abstract

Phishing detection systems are predominantly rely on statistical machine learning models, which often lack contextual reasoning and are vulnerable to adversarial manipulation. In this work, we propose a hybrid framework that integrates machine learning classifiers with non-monotonic reasoning using Answer Set Programming (ASP) to enable context-aware decision refinement. The proposed post-hoc reasoning layer incorporates expert knowledge to revise classifier predictions through formal belief revisions. Experimental results indicate that the reasoning module modifies 5.08\% of classifier outputs, leading to improved decision consistency. A key advantage is that new domain knowledge can be incorporated into the reasoning layer in $\mathcal{O}(n)$ time, eliminating the need for model retraining.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adds post-hoc ASP reasoning to an ML phishing classifier for context and easy updates, but the results only claim consistency gains without showing accuracy improvements.

The main point is a hybrid setup where a machine learning classifier gets its phishing predictions revised afterward by Answer Set Programming rules that encode expert context. New knowledge slots in linearly without retraining the model, which is the practical hook for fast-changing threats like phishing sites.

The integration of non-monotonic reasoning this way is the clearest new element, and it does address a genuine limitation of pure statistical detectors that ignore broader context or require full retrains for rule changes. The architecture description is straightforward and shows how belief revision can flip some outputs based on the rules. That part is solid as far as it goes.

The evaluation is the soft spot. The abstract states the reasoning layer changes 5.08% of outputs for better decision consistency, yet it supplies no accuracy, precision, recall, or F1 numbers before versus after, no dataset details, no baseline comparisons, and no statistical checks. Without those, it is impossible to tell whether the revisions actually catch more real phishing sites or just make the outputs match the chosen rules more closely. The stress-test concern holds: rule consistency can improve even if the rules themselves are incomplete or misaligned with ground truth. If the full paper has tables with labeled data results showing net gains, that would fix the gap; otherwise the central claim stays unproven.

This is for people working on hybrid symbolic and statistical systems in security applications. A reader already interested in ASP for refining ML outputs might extract the update-time idea or the revision mechanism. It is not ready for broad citation yet. I would send it to peer review so referees can check the full experiments and implementation details, but it would need revisions to include proper before-and-after metrics and dataset information.

Referee Report

2 major / 2 minor

Summary. The paper proposes PHISHREV, a hybrid framework integrating machine learning classifiers with post-hoc non-monotonic reasoning via Answer Set Programming (ASP) for context-aware phishing website classification. A reasoning layer encodes expert knowledge to perform formal belief revisions on ML predictions. The central empirical claim is that this module modifies 5.08% of classifier outputs and yields improved decision consistency; a secondary claim is that new domain knowledge can be incorporated into the ASP layer in O(n) time without retraining the underlying ML model.

Significance. If the revisions can be shown to improve accuracy against ground truth (rather than merely increasing consistency with a fixed rule set), the approach would address a practical limitation of pure ML phishing detectors: the inability to rapidly incorporate new contextual knowledge without retraining. The post-hoc design and claimed linear update cost are potentially valuable for security applications where threat models evolve quickly.

major comments (2)

[Abstract] The claim that the reasoning module 'modifies 5.08% of classifier outputs, leading to improved decision consistency' is presented without any description of the experimental setup, datasets, baseline classifiers, statistical tests, or the precise definition and measurement of 'decision consistency'. This information is required to evaluate the central empirical result.
[Abstract] No before/after accuracy, precision, recall, or F1 scores on labeled data are reported, nor any comparison demonstrating that ASP revisions increase correctness relative to ground truth rather than simply aligning outputs with the non-monotonic rules. Without these metrics the claim that the hybrid system improves phishing classification cannot be assessed.

minor comments (2)

[Abstract] The abstract contains a grammatical error: 'are predominantly rely on' should read 'predominantly rely on'.
[Abstract] The O(n) incorporation claim is stated without defining n or describing the update mechanism in the ASP layer.
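
One possible reading of the O(n) claim, offered as an assumption because the abstract does not define n, is that n counts the newly added rules and that each rule is appended to the knowledge base once. The sketch below illustrates only that reading. The `RuleBase` class and the ASP rule texts are hypothetical, and the cost of re-grounding and re-solving the program after an update is not modelled.

```python
from typing import List


class RuleBase:
    """Holds ASP rules as text; the trained classifier is a separate, untouched object."""

    def __init__(self) -> None:
        self.rules: List[str] = []

    def add_rules(self, new_rules: List[str]) -> int:
        """Append each new rule once, so the work is linear in len(new_rules)."""
        for rule in new_rules:
            self.rules.append(rule)  # amortised constant time per rule
        return len(new_rules)

    def program(self) -> str:
        """Concatenate the rules into one program text for a solver."""
        return "\n".join(self.rules)


kb = RuleBase()
kb.add_rules(["suspicious(X) :- ip_host(X), login_form(X), not allowlisted(X)."])

# Later update: two new rules, no retraining step anywhere.
added = kb.add_rules([
    "allowlisted(X) :- internal_domain(X).",
    "suspicious(X) :- punycode_host(X), not allowlisted(X).",
])
print(added, len(kb.rules))
```

Whether this is the update mechanism the authors have in mind is exactly what the referee is asking them to state.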

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point by point below. We agree that the abstract requires expansion for clarity and will revise it to include the requested details while preserving its conciseness. We also clarify the scope of our empirical claims and will incorporate additional metrics as noted.

Referee: [Abstract] The claim that the reasoning module 'modifies 5.08% of classifier outputs, leading to improved decision consistency' is presented without any description of the experimental setup, datasets, baseline classifiers, statistical tests, or the precise definition and measurement of 'decision consistency'. This information is required to evaluate the central empirical result.

Authors: We agree that the abstract omits these details due to length constraints. The full manuscript (Section 4) specifies the experimental setup, including the phishing datasets used, baseline classifiers (e.g., standard ML models for website classification), the definition of decision consistency (as the rate of alignment between revised predictions and the ASP-encoded expert rules), and statistical validation of the 5.08% modification rate. We will revise the abstract to include a brief summary of the datasets, baselines, consistency definition, and mention of the experimental protocol to allow immediate evaluation of the central result.

revision: yes

Referee: [Abstract] No before/after accuracy, precision, recall, or F1 scores on labeled data are reported, nor any comparison demonstrating that ASP revisions increase correctness relative to ground truth rather than simply aligning outputs with the non-monotonic rules. Without these metrics the claim that the hybrid system improves phishing classification cannot be assessed.

Authors: The manuscript's primary empirical claim concerns improved decision consistency with the non-monotonic rules rather than increased correctness against ground-truth labels. The ASP layer performs belief revision to enforce expert knowledge, which may produce outputs that differ from the original ground truth depending on rule quality; we therefore focused evaluation on consistency gains and the O(n) update property. No before/after accuracy metrics appear in the current version because they were outside the stated scope. We acknowledge that reporting these metrics would strengthen the presentation and will add before/after accuracy, precision, recall, and F1 comparisons on the labeled data in the revised manuscript, along with explicit discussion of how the revisions relate to ground truth.

revision: yes
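
The gap between the two metrics in this exchange can be shown with a small sketch. It takes the rebuttal's definition of decision consistency (agreement between predictions and the rule verdicts) and applies it to toy data; the single rule, the feature name, and every label are hypothetical. Consistency rises to 1.0 after revision by construction, while accuracy against ground truth need not move.

```python
from typing import Dict, List, Optional

Features = Dict[str, bool]


def rule_verdict(f: Features) -> Optional[int]:
    """Hypothetical rule: a raw-IP host is phishing (1); otherwise the rule is silent."""
    return 1 if f["ip_as_host"] else None


def revise(pred: int, f: Features) -> int:
    verdict = rule_verdict(f)
    return pred if verdict is None else verdict


def consistency(preds: List[int], feats: List[Features]) -> float:
    """Share of rule-covered samples where the prediction matches the rule."""
    covered = [(p, rule_verdict(f)) for p, f in zip(preds, feats)
               if rule_verdict(f) is not None]
    return sum(p == v for p, v in covered) / len(covered) if covered else 1.0


def accuracy(preds: List[int], y_true: List[int]) -> float:
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


feats = [{"ip_as_host": True}, {"ip_as_host": True},
         {"ip_as_host": False}, {"ip_as_host": False}]
y_true = [1, 0, 0, 1]   # second sample: a legitimate page served from a raw IP
ml_pred = [0, 0, 0, 1]
revised = [revise(p, f) for p, f in zip(ml_pred, feats)]

print("consistency:", consistency(ml_pred, feats), "->", consistency(revised, feats))
print("accuracy:   ", accuracy(ml_pred, y_true), "->", accuracy(revised, y_true))
```

Here the rule fixes one error and introduces another, so consistency improves while accuracy stays flat, which is the scenario the referee's second comment is guarding against.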

Circularity Check

0 steps flagged

No circularity: empirical claims and architectural properties do not reduce to inputs by construction

The paper describes a hybrid ML+ASP framework whose central results are experimental observations (reasoning module modifies 5.08% of outputs, yielding improved consistency) and a stated complexity advantage (O(n) incorporation of new knowledge without retraining). These are presented as measured outcomes and design properties rather than first-principles derivations or predictions. No equations, fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations appear in the provided text. The derivation chain is therefore self-contained against external benchmarks; the skeptic concern about accuracy vs. consistency is a correctness issue, not a circularity reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach assumes the suitability of ASP for this domain without providing independent evidence beyond the experimental modification rate; no new entities postulated.

axioms (1)

domain assumption: Answer Set Programming can effectively model non-monotonic reasoning and belief revision for incorporating expert knowledge into ML predictions.
This underpins the post-hoc reasoning layer described in the abstract.

pith-pipeline@v0.9.0 · 5417 in / 1390 out tokens · 70889 ms · 2026-05-07T16:16:04.169747+00:00 · methodology

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1] F. P. E. Putra, A. Zulfikri, G. Arifin, R. M. Ilhamsyah et al., “Analysis of phishing attack trends, impacts and prevention methods: literature study,” Brilliance: Research of Artificial Intelligence, vol. 4, no. 1, pp. 413–421, 2024.

[2] D. T. Mosa, M. Y. Shams, A. A. Abohany, E.-S. M. El-kenawy, and M. Thabet, “Machine learning techniques for detecting phishing url attacks,” Computers, Materials and Continua, vol. 75, no. 1, pp. 1271–1290, 2023.

[3] U. Zara, K. Ayub, H. U. Khan, A. Daud, T. Alsahfi, and S. Gulzar, “Phishing website detection using deep learning models,” IEEE Access, 2024.

[4] Y. Li, M. Cheng, C.-J. Hsieh, and T. C. Lee, “A review of adversarial attack and defense for classification methods,” The American Statistician, vol. 76, no. 4, pp. 329–345, 2022.

[5] S. Marchal, J. François, R. State, and T. Engel, “Phishstorm: Detecting phishing with streaming analytics,” IEEE Transactions on Network and Service Management, vol. 11, no. 4, pp. 458–471, 2014.

[6] H. Ghalechyan, E. Israyelyan, A. Arakelyan, G. Hovhannisyan, and A. Davtyan, “Phishing url detection with neural networks: an empirical study,” Scientific Reports, vol. 14, no. 1, p. 25134, 2024.

[7] G. Brewka, J. Dix, K. Konolige et al., Nonmonotonic reasoning: an overview. CSLI publications Stanford, 1997, vol. 73.

[8] A. Das, S. Baki, A. El Aassal, R. Verma, and A. Dunbar, “Sok: a comprehensive reexamination of phishing research from the security perspective,” IEEE Communications Surveys & Tutorials, vol. 22, no. 1, pp. 671–708, 2019.

[9] A. Odeh, I. Keshta, and E. Abdelfattah, “Machine learning techniques for detection of website phishing: A review for promises and challenges,” in 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC). IEEE, 2021, pp. 0813–0818.

[10] N. Q. Do, A. Selamat, O. Krejcar, E. Herrera-Viedma, and H. Fujita, “Deep learning for phishing detection: Taxonomy, current challenges and future directions,” IEEE Access, vol. 10, pp. 36429–36463, 2022.

[11] M. Gebser, R. Kaminski, B. Kaufmann, and T. Schaub, “Multi-shot asp solving with clingo,” Theory and Practice of Logic Programming, vol. 19, no. 1, pp. 27–82, 2019.

[12] A. Hannousse and S. Yahiouche, “Web page phishing detection,” 2021. [Online]. Available: https://doi.org/10.17632/c2gw7fy2j4.3

[13] T. Ban, T. Takahashi, S. Ndichu, and D. Inoue, “Breaking alert fatigue: Ai-assisted siem framework for effective incident response,” Applied Sciences, vol. 13, no. 11, p. 6610, 2023.

[14] A. Hannousse and S. Yahiouche, “Towards benchmark datasets for machine learning based website phishing detection: An experimental study,” Engineering Applications of Artificial Intelligence, vol. 104, p. 104347, 2021.

[15] K. Adane, B. Beyene, and M. Abebe, “Single and hybrid-ensemble learning-based phishing website detection: Examining impacts of varied nature datasets and informative feature selection technique,” Digital Threats: Research and Practice, vol. 4, no. 3, pp. 1–27, 2023.

[16] S. S. Shafin, “An explainable feature selection framework for web phishing detection with machine learning,” Data Science and Management, vol. 8, no. 2, pp. 127–136, 2025.
