An AI Security Agent for Banking: Multi-Vector Fraud and AML Detection Across Retail and Corporate Accounts

Joseph Walusimbi; Joshua Benjamin Ssentongo

arXiv: 2606.17555 · v1 · pith:QWVH7EX7new · submitted 2026-06-16 · cs.CR · cs.AI · cs.CE · cs.ET

Pith reviewed 2026-06-27 00:34 UTC · model grok-4.3

classification cs.CR · cs.AI · cs.CE · cs.ET

keywords AI security agent · fraud detection · AML detection · banking security · LSTM model · graph networks · synthetic transaction data · multi-vector threats

The pith

A fusion of LSTM sequence models, statistical monitors, and graph networks on transaction and session streams detects both signature-based fraud and behavioral financial crimes in banking.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes an AI security agent that processes two parallel streams of banking events to catch fraud and anti-money-laundering (AML) violations that static rules miss. The transaction stream covers card fraud, ACH/wire fraud, and AML categories; the session stream covers account takeover, session hijacking, SIM-swap, and insider abuse. Each stream uses an LSTM to learn per-account behavior over time, a statistical component to flag unusual velocity, and a graph module to spot network patterns such as fan-in or pass-through that indicate layering or mule accounts. On a synthetic log of 237,669 transactions and 113,508 sessions, the combined model reaches F1 scores of 0.787 (transaction stream) and 0.867 (session stream), outperforming rule-based and LSTM-only baselines. This approach matters because sophisticated attacks are engineered to look like normal activity at the single-event level.
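
The graph features named above (fan-in, fan-out, pass-through ratio) can be illustrated with a minimal Python sketch. The paper's exact definitions are not reproduced on this page, so the function name `graph_features`, the `(sender, receiver, amount)` tuple layout, and the pass-through formula (outflow divided by inflow, capped at 1) are assumptions made for illustration only.

```python
from collections import defaultdict


def graph_features(transfers):
    """Compute per-account fan-in, fan-out, and pass-through ratio.

    transfers: iterable of (sender, receiver, amount) tuples (hypothetical layout).
    """
    senders_of = defaultdict(set)    # account -> distinct accounts that paid it
    receivers_of = defaultdict(set)  # account -> distinct accounts it paid
    inflow = defaultdict(float)      # account -> total amount received
    outflow = defaultdict(float)     # account -> total amount sent

    for sender, receiver, amount in transfers:
        receivers_of[sender].add(receiver)
        senders_of[receiver].add(sender)
        outflow[sender] += amount
        inflow[receiver] += amount

    features = {}
    for account in set(inflow) | set(outflow):
        money_in = inflow[account]
        money_out = outflow[account]
        # Assumed definition: share of received funds that is forwarded on.
        pass_through = min(money_out / money_in, 1.0) if money_in > 0 else 0.0
        features[account] = {
            "fan_in": len(senders_of[account]),
            "fan_out": len(receivers_of[account]),
            "pass_through": pass_through,
        }
    return features
```

Under these assumptions, an account that receives funds from three distinct senders and forwards 95% of the total to a single counterparty gets a fan-in of 3, a fan-out of 1, and a pass-through ratio of 0.95, which is the kind of profile a mule or layering account might show.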

Core claim

The central claim is that a three-component fusion architecture operating on transaction and session streams can detect both traditional signature-based fraud and behavioral financial crimes such as business email compromise and money laundering layering, which static rules cannot reliably identify because they appear indistinguishable from legitimate activity.

What carries the argument

The argument rests on the three-component fusion architecture, which combines an LSTM sequence model for behavioral history, a statistical velocity and threshold monitor, and a graph or network module for relationship patterns such as fan-in and fan-out.
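
To make the idea of fusing three component outputs concrete, here is a minimal Python sketch of score-level fusion. The fusion rule actually used in the paper is not stated on this page, so the weighted average, the weights, the threshold, and the names `ComponentScores` and `fuse` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ComponentScores:
    """Hypothetical per-event anomaly scores, each assumed to lie in [0, 1]."""
    lstm: float      # sequence model over the account's history
    velocity: float  # statistical velocity/threshold monitor
    graph: float     # graph/network relationship module


def fuse(scores, weights=(0.5, 0.25, 0.25), threshold=0.5):
    """Return (fused_score, is_alert) using an assumed weighted average."""
    w_lstm, w_velocity, w_graph = weights
    total = w_lstm + w_velocity + w_graph
    fused = (
        w_lstm * scores.lstm
        + w_velocity * scores.velocity
        + w_graph * scores.graph
    ) / total
    return fused, fused >= threshold
```

With these illustrative weights, an event scored 0.8, 0.6, and 0.4 by the three components fuses to 0.65 and raises an alert, while scores of 0.2, 0.1, and 0.9 fuse to 0.35 and do not. A learned combiner or a voting rule would be equally plausible alternatives.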

If this is right

The proposed model achieves an overall F1 of 0.787 on the transaction stream compared to 0.562 for rule-based and 0.655 for LSTM-only baselines.
It achieves an F1 of 0.867 on the session stream versus 0.733 for the rule-based baseline and 0.713 for the LSTM-only baseline.
A customer-facing chatbot provides 96.6% identity verification accuracy and detects 86.8% of mass-reset attacks.
An analyst case-summary assistant reaches 99.3% action-recommendation F1.
Critical-tier automated responses have latency under 0.43 ms at the 95th percentile.
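
As a quick arithmetic check on the F1 figures listed above, the following Python snippet computes the proposed model's gain over each baseline. The dictionary keys and the function name `f1_gains` are labels chosen for this illustration; the numbers are the reported ones.

```python
# Reported overall F1 scores per stream and model.
reported_f1 = {
    "transaction": {"proposed": 0.787, "rule_based": 0.562, "lstm_only": 0.655},
    "session": {"proposed": 0.867, "rule_based": 0.733, "lstm_only": 0.713},
}


def f1_gains(f1_by_model):
    """Return the proposed model's F1 gain over every other model."""
    proposed = f1_by_model["proposed"]
    return {
        name: round(proposed - score, 3)
        for name, score in f1_by_model.items()
        if name != "proposed"
    }


for stream, scores in reported_f1.items():
    print(stream, f1_gains(scores))
```

The gains come out to 0.225 (rule-based) and 0.132 (LSTM-only) on the transaction stream, and 0.134 (rule-based) and 0.154 (LSTM-only) on the session stream.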

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The architecture could be extended to incorporate additional data sources such as device fingerprints or external watchlists to further improve detection of mule networks.
Deployment in live systems might reduce the volume of false positives that analysts must review by catching patterns across multiple accounts.
Testing the same fusion approach on real rather than synthetic banking logs would clarify how well the performance holds when attack distributions differ from the simulation.
The dual-stream design suggests a general pattern for security agents that must handle both point events and longer behavioral sequences.

Load-bearing premise

The synthetic event log accurately simulates the characteristics of real-world fraud, AML, and legitimate banking activity, including the indistinguishability of sophisticated attacks.

What would settle it

Evaluating the model on a real-world anonymized banking transaction dataset and checking whether the F1 scores remain higher than the rule-based and LSTM baselines.

Figures

Figures reproduced from arXiv: 2606.17555 by Joseph Walusimbi, Joshua Benjamin Ssentongo.

Original abstract

Banks simultaneously face signature-based fraud (card-not-present attacks, account takeover, ATM cloning) and behavioural financial crime (structuring, layering, mule networks, business email compromise) -- two threat families with fundamentally different detection requirements. Static rule engines that reliably catch brute-force and high-velocity events are structurally blind to business-email-compromise (BEC) payment redirection, session hijacking, and money-laundering layering, which are engineered to appear indistinguishable from legitimate activity at the individual transaction or session level. This paper presents an AI security agent for retail and corporate banking that addresses this gap through a three-component fusion architecture operating on two parallel event streams: a transaction stream (card fraud, ACH/wire fraud, AML categories) and a session stream (account takeover, session hijacking, SIM-swap, insider abuse). Each stream combines an LSTM sequence model capturing per-account behavioural history, a statistical velocity/threshold monitor, and a graph/network module capturing account-counterparty relationship patterns (fan-in, fan-out, pass-through ratio) for money-laundering detection. Experiments on a synthetic event log of 237,669 transactions and 113,508 sessions across 13 threat categories and 3,470 simulated accounts demonstrate overall F1 of 0.787 (transaction stream) and 0.867 (session stream) for the proposed model, versus 0.562/0.733 for a rule-based baseline and 0.655/0.713 for an LSTM-only baseline. The agent includes a customer-facing transaction-verification chatbot (96.6% identity verification accuracy, 86.8% mass-reset attack detection) and an analyst case-summary assistant (99.3% action-recommendation F1), with Critical-tier automated response latency under 0.43 ms at the 95th percentile.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper fuses LSTM, stats thresholds, and graph metrics on synthetic banking logs to beat basic baselines on fraud and AML, but the simulation details are missing so the gains are hard to trust.

The paper puts together an LSTM for per-account sequences, a statistical velocity monitor, and graph metrics like fan-in and pass-through ratios, running them on separate transaction and session streams. It also adds a customer chatbot and an analyst assistant. On their synthetic log of 237k transactions and 113k sessions across 13 threat types, the full model hits F1 of 0.787 and 0.867, ahead of the rule-based and LSTM-only numbers they report.

It does a decent job framing the split between obvious high-velocity attacks and the ones meant to blend in, like BEC or layering. The three-component setup is a straightforward way to cover both, and the chatbot accuracy numbers give it a practical touch that some applied papers skip.

The soft spot is the data. Everything rests on how the synthetic generator created the attacks and made them look like normal activity. No description of the generation rules, parameter ranges, or any check against real bank logs appears in the abstract or the stress-test note. If the simulated attacks are easier to separate than real ones, the reported deltas are artifacts. No error bars or significance tests either, and the baselines stay simple.

This is for teams building fraud systems in finance who want an example of component fusion rather than new theory. A reader working on deployed detection might pick up the architecture as a template.

Send it to peer review. The problem is concrete and the approach is clear enough that referees can push on the validation gaps.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a three-component AI security agent for banking that fuses LSTM sequence models, statistical velocity/threshold monitors, and graph/network modules (fan-in, fan-out, pass-through ratio) operating on parallel transaction and session streams to detect both signature-based fraud and behavioral crimes such as BEC and layering. Experiments on a synthetic log of 237,669 transactions and 113,508 sessions across 13 threat categories and 3,470 accounts report overall F1 scores of 0.787 (transaction) and 0.867 (session) for the proposed model versus 0.562/0.733 (rule-based) and 0.655/0.713 (LSTM-only), plus a customer chatbot (96.6% verification accuracy) and analyst assistant (99.3% action-recommendation F1) with sub-millisecond latency.

Significance. If the synthetic data faithfully reproduces the statistical indistinguishability of sophisticated attacks from legitimate activity, the multi-vector fusion approach could represent a practical step forward in addressing limitations of static rules for AML and session-based threats. The explicit numerical results, latency figures, and dual-stream design provide concrete, falsifiable claims that strengthen the contribution relative to purely conceptual work.

major comments (2)

[Experiments section (synthetic event log)] No description is supplied of the data-generation procedure, the parameter ranges used to enforce indistinguishability of BEC redirection and layering from legitimate flows, or any statistical comparison of the synthetic distributions against real banking logs. This is load-bearing for the central claim because the reported F1 gains (0.225 over the rule-based baseline on the transaction stream and 0.154 over the LSTM-only baseline on the session stream) rest entirely on the assumption that the simulation produces attacks that are separable only at the level the three-component model can exploit.
[Results and evaluation] The manuscript reports point F1 values without error bars, confidence intervals, or statistical significance tests comparing the fusion model to the two baselines; the precise fusion rule (weighting, voting, or learned combination of LSTM, velocity, and graph outputs) is also omitted. These omissions directly affect assessment of whether the performance advantage is robust or an artifact of the particular synthetic realization.

minor comments (2)

[Abstract] The abstract states that the agent 'combines' the three modules but does not indicate whether this occurs at the feature, score, or decision level; a short clarifying sentence would improve readability without altering the technical content.
[Results tables/figures] Table or figure captions for the performance results should explicitly note that all metrics derive from the synthetic log rather than real transaction data.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and constructive comments. We address each major comment below and commit to revising the manuscript to improve clarity and completeness.

Referee: Experiments section (synthetic event log): No description is supplied of the data-generation procedure, the parameter ranges used to enforce indistinguishability of BEC redirection and layering from legitimate flows, or any statistical comparison of the synthetic distributions against real banking logs. This is load-bearing for the central claim because the reported F1 gains (0.225 over the rule-based baseline on the transaction stream and 0.154 over the LSTM-only baseline on the session stream) rest entirely on the assumption that the simulation produces attacks that are separable only at the level the three-component model can exploit.

Authors: We agree that a detailed account of the synthetic data generation process is essential for validating the central claims. In the revised version, we will add a comprehensive description of the data-generation procedure in the Experiments section, including the specific parameter ranges employed to simulate BEC redirection and layering attacks such that they are statistically indistinguishable from legitimate flows at the transaction level. We will also include any statistical comparisons performed between the synthetic distributions and available real banking log characteristics, noting the limitations inherent to synthetic data.

revision: yes

Referee: Results and evaluation: The manuscript reports point F1 values without error bars, confidence intervals, or statistical significance tests comparing the fusion model to the two baselines; the precise fusion rule (weighting, voting, or learned combination of LSTM, velocity, and graph outputs) is also omitted. These omissions directly affect assessment of whether the performance advantage is robust or an artifact of the particular synthetic realization.

Authors: We acknowledge these omissions in the original manuscript. We will revise the Results and evaluation section to include a detailed description of the precise fusion rule used to combine the LSTM, statistical, and graph module outputs. We will also add error bars, confidence intervals, and statistical significance tests comparing the models, using bootstrap methods or multiple runs as appropriate to demonstrate robustness.

revision: yes
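
The bootstrap approach mentioned in this response could look roughly like the Python sketch below, which resamples labeled events with replacement and reports a percentile confidence interval for F1. It is a generic percentile bootstrap, not the authors' procedure; the function names, the binary 0/1 label encoding, and the default of 1,000 resamples are assumptions.

```python
import random


def f1_score(y_true, y_pred):
    """Binary F1 from 0/1 labels; returns 0.0 when there are no positives at all."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for F1 (assumed procedure)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample event indices with replacement, keeping label/prediction pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(f1_score([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

Computing such an interval separately for the fusion model and each baseline on the same test events would show whether the reported F1 gaps exceed resampling noise. Because events from the same account are correlated, resampling whole accounts rather than individual events may be the more defensible choice.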

Circularity Check

0 steps flagged

No circularity: empirical F1 scores are direct measurements on synthetic data with no self-referential derivations

The paper describes a three-component architecture (LSTM + statistical monitor + graph module) and reports F1 scores (0.787/0.867) obtained by running the model on a fixed synthetic event log of 237,669 transactions and 113,508 sessions. No equations, parameter-fitting steps, or self-citations are presented that would make the reported metrics equivalent to the inputs by construction. The results are computed outputs on externally generated data rather than renamed fits or self-defined quantities. The synthetic-data realism assumption is a modeling limitation but does not create circularity in the derivation chain.

Axiom & Free-Parameter Ledger

3 free parameters · 1 axiom · 0 invented entities

The architecture relies on standard components but the integration method and the validity of the synthetic benchmark are key unstated elements.

free parameters (3)

LSTM model hyperparameters
Sequence length, hidden units, and training parameters for behavioral modeling are not specified but required for the model.
Statistical velocity and threshold values
Parameters for the velocity/threshold monitor tuned to detect high-velocity events.
Graph pattern thresholds (fan-in, fan-out, pass-through ratio)
Cutoffs for identifying money-laundering relationship patterns.
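
To show how the velocity and threshold parameters in this ledger enter a detector, here is a minimal Python sketch of a sliding-window velocity monitor. The class name `VelocityMonitor`, the 60-second window, and the limit of 5 events are hypothetical values, not settings taken from the paper, and the sketch assumes timestamps arrive in non-decreasing order per account.

```python
from collections import defaultdict, deque


class VelocityMonitor:
    """Flag accounts whose event count within a time window exceeds a limit."""

    def __init__(self, window_seconds=60.0, max_events=5):
        self.window_seconds = window_seconds  # free parameter: window length
        self.max_events = max_events          # free parameter: count threshold
        self._recent = defaultdict(deque)     # account -> timestamps in window

    def observe(self, account_id, timestamp):
        """Record one event; return True if the account is over the limit."""
        window = self._recent[account_id]
        window.append(timestamp)
        # Drop events that fell out of the window (assumes ordered timestamps).
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_events
```

Both constructor arguments are exactly the kind of tunable values the ledger counts as free parameters: tightening either one trades more alerts on bursts for more false positives on busy legitimate accounts.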

axioms (1)

Domain assumption: The synthetic data generation process produces realistic distributions of legitimate and fraudulent events that match real banking environments.
All performance claims depend on this; the paper uses it to benchmark the model.

pith-pipeline@v0.9.1-grok · 5878 in / 1644 out tokens · 74686 ms · 2026-06-27T00:34:50.597903+00:00 · methodology

Reference graph

Works this paper leans on

14 extracted references · 9 canonical work pages

[1]

Long short-term memory

S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997. doi: 10.1162/neco.1997.9.8.1735

work page doi:10.1162/neco.1997.9.8.1735 1997
[2]

Isolation forest

F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in Proc. 8th IEEE Int. Conf. Data Mining (ICDM), Pisa, Italy, 2008, pp. 413–422. doi: 10.1109/ICDM.2008.17

work page doi:10.1109/icdm.2008.17 2008
[3]

A survey of network anomaly detection techniques

M. Ahmed, A. N. Mahmood, and J. Hu, “A survey of network anomaly detection techniques,” J. Netw. Comput. Appl., vol. 60, pp. 19–31, 2016. doi: 10.1016/j.jnca.2015.11.016

work page doi:10.1016/j.jnca.2015.11.016 2016
[4]

Anomaly detection: a survey

V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: a survey,” ACM Comput. Surv., vol. 41, no. 3, p. 15, 2009. doi: 10.1145/1541880.1541882

work page doi:10.1145/1541880.1541882 2009
[5]

A financial fraud detection model based on LSTM deep learning technique

Y. Alghofaili, A. Albattah, and M. A. Rassam, “A financial fraud detection model based on LSTM deep learning technique,” J. Appl. Secur. Res., vol. 15, no. 4, pp. 498–516, 2020. doi: 10.1080/19361610.2020.1815491

work page doi:10.1080/19361610.2020.1815491 2020
[6]

Feature engineering strategies for credit card fraud detection

A. C. Bahnsen, D. Aouada, A. Stojanovic, and B. Ottersten, “Feature engineering strategies for credit card fraud detection,” Expert Syst. Appl., vol. 51, pp. 134–142, 2016. doi: 10.1016/j.eswa.2015.12.030

work page doi:10.1016/j.eswa.2015.12.030 2016
[7]

Anti-money laundering in Bitcoin: experimenting with graph convolutional networks for financial forensics

M. Weber, G. Domeniconi, J. Chen, D. K. I. Weidele, C. Bellei, T. Robinson, and C. E. Leiserson, “Anti-money laundering in Bitcoin: experimenting with graph convolutional networks for financial forensics,” in Proc. KDD 2019 Workshop FinancialCrime, 2019

2019
[8]

Semi-supervised classification with graph convolutional networks

T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Proc. 5th Int. Conf. Learning Representations (ICLR), Toulon, France, 2017

2017
[9]

Finding money launderers using heterogeneous graph neural networks

W. W. Lo, G. Opedal, and T. Verdonck, “Finding money launderers using heterogeneous graph neural networks,” Intelligent Systems with Applications, vol. 25, 2025. doi: 10.1016/j.iswa.2025.200479

work page doi:10.1016/j.iswa.2025.200479 2025
[10]

Financial fraud detection using graph neural networks: a systematic review

S. Motie and B. Raahemi, “Financial fraud detection using graph neural networks: a systematic review,” Expert Syst. Appl., vol. 240, p. 122156. doi: 10.1016/j.eswa.2023.122156

work page doi:10.1016/j.eswa.2023.122156 2023
[12]

Internet Crime Report 2023

FBI Internet Crime Complaint Center (IC3), “Internet Crime Report 2023,” Federal Bureau of Investigation, Washington, DC, 2024. [Online]. Available: https://www.ic3.gov/Media/PDF/AnnualReport/2023_IC3Report.pdf

2023
[13]

What is business email compromise (BEC)?

Palo Alto Networks Unit 42, “What is business email compromise (BEC)?” 2024. [Online]. Available: https://www.paloaltonetworks.com/cyberpedia/what-is-business-email-compromise-bec-tactics-and-prevention

2024
[14]

Network intrusion datasets: a survey, limitations, and recommendations

P. Goldschmidt and D. Chudá, “Network intrusion datasets: a survey, limitations, and recommendations,” Computers & Security, vol. 156, p. 104510, 2025. doi: 10.1016/j.cose.2025.104510

work page doi:10.1016/j.cose.2025.104510 2025
