pith. machine review for the scientific record.

arxiv: 2604.12241 · v1 · submitted 2026-04-14 · 💻 cs.DC

Recognition: unknown

BlazingAML: High-Throughput Anti-Money Laundering (AML) via Multi-Stage Graph Mining

Pith reviewed 2026-05-10 15:58 UTC · model grok-4.3

classification 💻 cs.DC
keywords: anti-money laundering, graph mining, multi-stage patterns, domain-specific compiler, high-performance computing, financial fraud detection, GPU acceleration, scalable analytics

The pith

BlazingAML detects money laundering patterns with the same accuracy as prior methods but runs 210x faster on CPUs and 333x faster on GPUs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Money laundering schemes in financial networks often involve fuzzy structural and temporal variations that are hard to capture without either missing cases or incurring high computational costs. BlazingAML introduces a multi-stage framework that breaks complex schemes into logical stages linked by standard graph operations, letting analysts express many pattern variants through a small set of unified primitives. A domain-specific compiler then converts these high-level descriptions directly into optimized parallel code for CPU and GPU hardware, removing the need for manual low-level programming. On IBM AML datasets the system matches the F1 scores of existing approaches while delivering the reported speedups and better scaling to larger inputs.

Core claim

The paper claims that a multi-stage graph abstraction for money laundering patterns, paired with a compiler that maps high-level descriptions to efficient CPU and GPU code, enables accurate detection without exhaustive pattern enumeration and achieves substantial throughput gains over state-of-the-art methods.

What carries the argument

The multi-stage framework that decomposes laundering schemes into logical stages connected by graph operations, together with the domain-specific compiler that applies optimizations and generates parallel code.
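To make the stage decomposition concrete, here is a minimal Python sketch of a scatter-gather search split into two stages joined by a set intersection. It is not the paper's pattern language or its compiler output; the function names, the `(sender, receiver, timestamp, amount)` record layout, and the `window` and `min_intermediates` parameters are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical transaction record: (sender, receiver, timestamp, amount).
# Every name here is illustrative and does not come from the paper.


def build_adjacency(transactions):
    """Index transfers by sender and by receiver."""
    out_edges = defaultdict(list)  # sender -> [(receiver, timestamp)]
    in_edges = defaultdict(list)   # receiver -> [(sender, timestamp)]
    for src, dst, ts, _amount in transactions:
        out_edges[src].append((dst, ts))
        in_edges[dst].append((src, ts))
    return out_edges, in_edges


def scatter_gather(transactions, source, sink, window, min_intermediates=2):
    """Return the intermediates of a scatter-gather from source to sink.

    Stage 1 (scatter): accounts that receive money from `source`.
    Stage 2 (gather): accounts that send money to `sink`.
    The stages are joined by intersecting their account sets.
    Structural fuzziness: any number of intermediates >= min_intermediates.
    Temporal fuzziness: a gather transfer only has to follow some scatter
    transfer of the same intermediate within `window` time units.
    """
    out_edges, in_edges = build_adjacency(transactions)

    # Stage 1: timestamps of transfers source -> intermediate.
    scatter_times = defaultdict(list)
    for mid, ts in out_edges.get(source, []):
        scatter_times[mid].append(ts)

    # Stage 2: timestamps of transfers intermediate -> sink.
    gather_times = defaultdict(list)
    for mid, ts in in_edges.get(sink, []):
        gather_times[mid].append(ts)

    # Join: accounts present in both stages, excluding the endpoints.
    candidates = (scatter_times.keys() & gather_times.keys()) - {source, sink}

    matched = set()
    for mid in candidates:
        if any(
            t0 < t1 <= t0 + window
            for t0 in scatter_times[mid]
            for t1 in gather_times[mid]
        ):
            matched.add(mid)
    return matched if len(matched) >= min_intermediates else set()


if __name__ == "__main__":
    txs = [
        ("A", "B", 1, 100), ("A", "C", 2, 100), ("A", "D", 3, 100),
        ("B", "Z", 5, 95), ("C", "Z", 6, 95), ("D", "Z", 50, 95),
    ]
    print(sorted(scatter_gather(txs, "A", "Z", window=10)))
```

The point of the sketch is that one rule covers every count of intermediates and every admissible ordering inside the window, so no variant has to be listed by hand. A production system would also need amount constraints and a parallel execution strategy, which this sketch leaves out.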

If this is right

  • Financial analysts can specify detection rules at a high level without enumerating variants or writing parallel code.
  • AML systems can handle significantly larger transaction volumes in the same time window.
  • Detection pipelines become easier to maintain and update when new pattern families appear.
  • The same abstraction and compiler approach could reduce engineering effort for other graph-based monitoring tasks.
  • Real-time or near-real-time screening becomes feasible on commodity hardware clusters.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The multi-stage idea may transfer to other domains that require detecting variable graph patterns, such as supply-chain fraud or network intrusion.
  • Compiler-generated code could be further extended to heterogeneous systems that include FPGAs or specialized accelerators.
  • If the framework generalizes, regulators might adopt standardized high-level pattern languages for compliance reporting.
  • Scalability gains suggest the method could support continent-scale transaction graphs if data partitioning is added.

Load-bearing premise

Complex laundering patterns that vary in structure and timing can be expressed completely through a fixed set of multi-stage graph primitives without losing detection accuracy or requiring the compiler to produce incorrect code.

What would settle it

A concrete laundering scheme that cannot be represented in the multi-stage framework or that produces measurably lower F1 scores when run through the generated code compared with a manually optimized baseline.
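The comparison described above comes down to computing F1 for two detectors against the same labels. The following sketch treats detections as sets of flagged transaction ids, which is a simplifying assumption: the paper reports F1 from a trained classifier, and the function name and set-based interface here are hypothetical.

```python
def f1_score(predicted, actual):
    """F1 (harmonic mean of precision and recall) over sets of flagged ids."""
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    if true_pos == 0:
        # No correct detections: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(actual)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    labelled = {2, 3, 4}
    generated_code_flags = {1, 2, 3}   # made-up detections for illustration
    hand_tuned_flags = {2, 3, 4}       # made-up detections for illustration
    print(f1_score(generated_code_flags, labelled))
    print(f1_score(hand_tuned_flags, labelled))
```

A gap between the two scores on a scheme the framework claims to cover would be the kind of evidence this section asks for.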

Figures

Figures reproduced from arXiv: 2604.12241 by Arjun Laxman, Haojie Ye, Krisztian Flautner, Nishil Talati, Yichao Yuan.

Figure 1: Overview and contributions of BlazingAML.
… due to the fuzzy nature of laundering patterns. Money laundering schemes have structural and temporal fuzziness [4]: structural fuzziness involves varying numbers of intermediate accounts with the same topology, while temporal fuzziness allows flexible timing ordering between transactions. These characteristics create two challenges for scalable AML systems: expre…
Figure 2: Representative graph patterns illustrating layering
Figure 3: Fuzziness illustrated in the scatter-gather money
Figure 4: Example of (a) scatter-gather and (b) 4-cycle pat
Figure 5: Compiler-generated pseudo-code for scatter-gather and 4-cycle pattern mining.
Figure 6: BlazingAML Scatter-Gather pattern mining end-to-end throughput normalized to GFP [4].
[Plot text captured with this caption: datasets LI-Small, HI-Small, LI-Medium, HI-Medium, LI-Large, HI-Large, GM; y-axis "Cycle Perf. vs. GFP 64-thread (x)"; bar labels 1.0, 10.9, 22.8, 42.1, 68.4, 86, 113, 127, 122, 122, 156; legend GFP (64 threads), BlazingAML (1, 2, 4, 8, 16, 32 threads), BlazingAML (6…]
Figure 7: BlazingAML Cycle pattern mining end-to-end throughput normalized to GFP [4].
[Plot text captured with this caption: datasets LI-Small, HI-Small, LI-Medium, HI-Medium, LI-Large, HI-Large, GM; y-axis "Fan Perf. vs. GFP 64-thread (x)"; bar labels 1.0, 0.6, 1.2, 2.3, 4.4, 7.5, 10.9, 11.4, 8.2, 6.4, 26.7; legend GFP (64 threads), BlazingAML (1, 2, 4, 8, 16, 32, 64 threads)…]
Figure 8: BlazingAML Fan-in and Fan-out pattern mining combined end-to-end throughput normalized to GFP [4].
[Plot text captured with this caption: datasets LI-Small, HI-Small, LI-Medium, HI-Medium, LI-Large, HI-Large; y-axis "Normalized Stack Perf."; bar labels 1.0, 1.8, 3.6, 7.1, 12.9, 22.9, 25.6, 25.8, 24.2, 33.5; legend BlazingAML (1, 2, 4, 8, 16, 32, 64 threads), BlazingA…]
Figure 9: BlazingAML Stack pattern mining end-to-end throughput normalized to single-thread CPU.
… BlazingAML less than 8 threads. BlazingAML demonstrates consistent improvement up to 32 threads (11.4×) before experiencing performance degradation at higher thread counts (8.2× at 128 threads). For GPU, the data structure of neighborhood search is further optimized in CUDA to achieve a better speedup for basic pattern…
Figure 10: Scalability study of BlazingAML Scatter-Gather pattern mining throughput normalized to GFP [4] on Trovares [30] 10K – 100M edge dataset.
The pattern mining results demonstrate both strong parallel scalability and effective load distribution in the generated code across diverse graph mining workloads. The results confirm that BlazingAML’s domain-specific compiler produces significantly more efficient code …
Figure 11: F1 score of BlazingAML when using different shapes as mining features. The F1 score increases as more features (the number of participating shapes for each transactional edge) are included and then used for training and inference via the XGBoost library. BlazingAML preserves the output quality of the GFP library while providing a faster and more flexible mining framework
Figure 12: Performance study of BlazingAML compared with FraudGT. BlazingAML processes 4.9× more edges per second on average.
… the ML framework difference (feature extension + XGB in [4] and Transformer-based model in [19])
Original abstract

Money laundering detection faces challenges due to excessive false positives and inadequate adaptation to sophisticated multi-stage schemes that exploit modern financial networks. Graph analytics and AI are promising tools, but they struggle with the fuzziness of laundering patterns, which exhibit structural and temporal variations. Conventional data mining techniques require the detailed enumeration of pattern variants, which not only complicates the analyst's task to specify them, but also leads to large run-time overheads and difficulty training accurate AI models. The paper presents BlazingAML, a scalable AML system design that introduces:

  1. A novel multi-stage framework for expressing fuzzy money laundering patterns
  2. A domain-specific compiler that transforms high-level pattern descriptions into high-performance code for CPU and GPU back-ends

The multi-stage abstraction decomposes complex laundering schemes into logical stages connected by graph operations, enabling diverse patterns to be expressed using unified primitives while capturing structural and temporal fuzziness. The compiler applies sophisticated optimizations, eliminating manual parallel programming requirements for financial analysts. Evaluation on IBM AML datasets shows BlazingAML achieves the same F1 score as state-of-the-art approaches while delivering 210x and 333x higher speedup on CPU and GPU respectively, with superior scalability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents BlazingAML, a scalable AML detection system that introduces a multi-stage framework for expressing fuzzy money laundering patterns (capturing structural and temporal variations via unified graph primitives) and a domain-specific compiler that generates optimized parallel code for CPU and GPU backends without manual tuning. The central empirical claim is that evaluation on IBM AML datasets yields the same F1 score as state-of-the-art approaches while delivering 210x CPU and 333x GPU speedups with superior scalability.

Significance. If the performance and accuracy claims hold under rigorous evaluation, the work would offer a meaningful practical advance for high-throughput graph analytics in financial networks, addressing the tension between pattern fuzziness and computational cost. The multi-stage abstraction and compiler design could reduce the burden of variant enumeration and parallel programming for analysts.

major comments (2)
  1. [Abstract and Evaluation] Abstract and Evaluation section: The headline result (identical F1 score to SOTA plus 210x/333x speedups) is load-bearing, yet the provided description supplies no dataset statistics (e.g., number of transactions, nodes, edges, or laundering instances), no list of baselines with their parallelization/hardware details, no error bars or statistical significance tests, and no per-pattern accuracy breakdown. This leaves open whether the multi-stage primitives exactly reproduce prior detection power on all fuzzy variants or whether baselines were sequential reference implementations.
  2. [Multi-stage framework] Multi-stage framework description: The claim that the abstraction 'enables diverse patterns to be expressed using unified primitives while capturing structural and temporal fuzziness' without loss of accuracy or need for detailed variant enumeration is central, but no concrete example is given of a specific laundering scheme, its decomposition into stages, the graph operations used, and the resulting compiler output. Without this, it is difficult to assess generality or correctness of the compiler transformations.
minor comments (1)
  1. [Abstract] The abstract states 'superior scalability' without defining the metric (e.g., strong vs. weak scaling, maximum graph size tested) or providing scaling plots/tables.
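For reference, the two scaling notions named in the minor comment are conventionally defined as below. The symbols are assumptions introduced here for illustration, with T(p, N) standing for wall-clock time on p workers for a graph of size N; none of this notation comes from the paper.

```latex
% T(p, N): wall-clock time with p workers on a problem of size N (assumed notation)
S_{\text{strong}}(p) = \frac{T(1, N)}{T(p, N)}, \qquad
E_{\text{strong}}(p) = \frac{S_{\text{strong}}(p)}{p}, \qquad
E_{\text{weak}}(p) = \frac{T(1, N)}{T(p,\, pN)}
```

Strong scaling holds the input fixed while adding workers, whereas weak scaling grows the input in proportion to the worker count, so a claim of "superior scalability" reads differently depending on which of the two was measured.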

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to incorporate the requested clarifications and additions.

Point-by-point responses
  1. Referee: [Abstract and Evaluation] Abstract and Evaluation section: The headline result (identical F1 score to SOTA plus 210x/333x speedups) is load-bearing, yet the provided description supplies no dataset statistics (e.g., number of transactions, nodes, edges, or laundering instances), no list of baselines with their parallelization/hardware details, no error bars or statistical significance tests, and no per-pattern accuracy breakdown. This leaves open whether the multi-stage primitives exactly reproduce prior detection power on all fuzzy variants or whether baselines were sequential reference implementations.

    Authors: We agree that the original evaluation section lacked these supporting details, which weakens the presentation of the central claims. In the revised manuscript we have added full IBM AML dataset statistics (transactions, nodes, edges, and laundering instance counts), a complete list of baselines with their parallelization strategies and hardware configurations, error bars from repeated runs, and statistical significance tests (paired t-tests). We also include a per-pattern accuracy breakdown confirming that the multi-stage primitives reproduce prior detection power on fuzzy variants with no loss of accuracy. Baselines were not limited to sequential references; comparisons use appropriately parallelized SOTA implementations on equivalent hardware, and the reported speedups reflect these fair conditions. revision: yes

  2. Referee: [Multi-stage framework] Multi-stage framework description: The claim that the abstraction 'enables diverse patterns to be expressed using unified primitives while capturing structural and temporal fuzziness' without loss of accuracy or need for detailed variant enumeration is central, but no concrete example is given of a specific laundering scheme, its decomposition into stages, the graph operations used, and the resulting compiler output. Without this, it is difficult to assess generality or correctness of the compiler transformations.

    Authors: We acknowledge that the lack of a concrete example hinders assessment of the framework's generality and the compiler's transformations. The revised manuscript now contains a detailed example of a representative multi-stage laundering scheme (a temporal layering pattern exhibiting both structural and temporal fuzziness). It shows the decomposition into stages, the specific unified graph primitives used, how fuzziness is captured without enumerating variants, and excerpts of the resulting compiler-generated optimized code for CPU and GPU backends. This addition directly demonstrates the abstraction's correctness and practical utility. revision: yes

Circularity Check

0 steps flagged

No circularity: system design and empirical evaluation with no derivations or self-referential predictions

full rationale

The paper presents a multi-stage framework and domain-specific compiler for AML pattern detection, followed by empirical performance measurements on IBM datasets. No equations, fitted parameters, or predictive derivations appear in the abstract or described structure. Claims of equivalent F1 scores and speedups are direct experimental results rather than outputs derived from the inputs by construction. Self-citations, if present, are not load-bearing for any uniqueness theorem or ansatz. The work is self-contained as an engineering contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

Abstract-only review supplies no explicit free parameters, detailed axioms, or independent evidence for new entities; the framework and compiler are described at high level only.

axioms (1)
  • [domain assumption] Financial transactions can be modeled as graphs where nodes represent entities and edges represent transfers
    Implicit foundation for the graph mining and multi-stage operations.
invented entities (2)
  • Multi-stage framework (no independent evidence)
    purpose: Decompose complex laundering schemes into logical stages connected by graph operations to capture fuzziness
    Core novel abstraction claimed in the abstract.
  • Domain-specific compiler (no independent evidence)
    purpose: Transform high-level pattern descriptions into optimized CPU and GPU code
    Second core contribution claimed in the abstract.
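The graph-modeling assumption in the ledger can be illustrated with a small data structure. The sketch below stores transfers as a directed temporal multigraph with per-sender edges sorted by timestamp, so that a time-window neighbor lookup is a binary search. The class name, the `(sender, receiver, timestamp)` schema, and the sorted-list layout are assumptions for illustration and say nothing about how BlazingAML stores its graphs.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class TransactionGraph:
    """Directed temporal multigraph: nodes are accounts, edges are transfers."""

    def __init__(self, transactions):
        # transactions: iterable of (sender, receiver, timestamp) tuples
        # (an assumed schema, not taken from the paper).
        by_sender = defaultdict(list)
        for src, dst, ts in transactions:
            by_sender[src].append((ts, dst))

        # Keep two parallel lists per sender, ordered by timestamp, so that
        # window queries can binary-search the timestamp list.
        self._times = {}
        self._receivers = {}
        for src, edges in by_sender.items():
            edges.sort(key=lambda edge: edge[0])
            self._times[src] = [ts for ts, _dst in edges]
            self._receivers[src] = [dst for _ts, dst in edges]

    def out_neighbors(self, account, t_start, t_end):
        """Receivers of transfers from `account` with t_start <= ts <= t_end.

        Parallel edges are kept, so a receiver can appear more than once.
        """
        times = self._times.get(account, [])
        lo = bisect_left(times, t_start)
        hi = bisect_right(times, t_end)
        return self._receivers.get(account, [])[lo:hi]


if __name__ == "__main__":
    graph = TransactionGraph([
        ("A", "B", 5), ("A", "C", 1), ("A", "B", 9), ("B", "C", 3),
    ])
    print(graph.out_neighbors("A", 1, 5))
```

Keeping parallel edges matters for this domain because repeated transfers between the same two accounts are distinct events with their own timestamps.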

pith-pipeline@v0.9.0 · 5530 in / 1260 out tokens · 86020 ms · 2026-05-10T15:58:18.276669+00:00 · methodology

Reference graph

Works this paper leans on

36 extracted references · 9 canonical work pages

  [1] Erik Altman, Jovan Blanuša, Luc Von Niederhäusern, Béni Egressy, Andreea Anghel, and Kubilay Atasu. 2024. Realistic synthetic financial transactions for anti-money laundering models. Advances in Neural Information Processing Systems 36 (2024)
  [2] BBC. 2018. Commonwealth Bank offers to pay record fine in laundering case. Accessed: 2025-08-13
  [3] Jiang Bian, Abdullah Al Arafat, Haoyi Xiong, Jing Li, Li Li, Hongyang Chen, Jun Wang, Dejing Dou, and Zhishan Guo. 2022. Machine learning in real-time Internet of Things (IoT) systems: A survey. IEEE Internet of Things Journal 9, 11 (2022), 8364–8386
  [4] Jovan Blanuša, Maximo Cravero Baraja, Andreea Anghel, Luc von Niederhäusern, Erik Altman, Haris Pozidis, and Kubilay Atasu. 2024. Graph Feature Preprocessor: Real-time Extraction of Subgraph-based Features from Transaction Graphs. arXiv preprint arXiv:2402.08593 (2024)
  [5] Mário Cardoso, Pedro Saleiro, and Pedro Bizarro. 2022. LaundroGraph: Self-supervised graph representation learning for anti-money laundering. In Proceedings of the Third ACM International Conference on AI in Finance. 130–138
  [6] Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 785–794
  [7] Dawei Cheng, Yujia Ye, Sheng Xiang, Zhenwei Ma, Ying Zhang, and Changjun Jiang. 2023. Anti-money laundering by group-aware deep graph learning. IEEE Transactions on Knowledge and Data Engineering 35, 12 (2023), 12444–12457
  [8] Bruno Deprez, Toon Vanderschueren, Bart Baesens, Tim Verdonck, and Wouter Verbeke. 2025. Network Analytics for Anti-Money Laundering – A Systematic Literature Review and Experimental Evaluation. arXiv:2405.19383 [cs.SI] https://arxiv.org/abs/2405.19383
  [9] Ahmad Naser Eddin, Jacopo Bono, David Aparício, David Polido, João Tiago Ascensão, Pedro Bizarro, and Pedro Ribeiro. 2022. Anti-Money Laundering Alert Optimization Using Machine Learning with Graphs. arXiv:2112.07508 [cs.LG] https://arxiv.org/abs/2112.07508
  [10] Béni Egressy, Luc Von Niederhäusern, Jovan Blanuša, Erik Altman, Roger Wattenhofer, and Kubilay Atasu. 2024. Provably powerful graph neural networks for directed multigraphs. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 11838–11846
  [11] Europol. 2017. From Suspicion to Action – Converting Financial Intelligence into Greater Operational Impact. https://www.europol.europa.eu/publications-documents/suspicion-to-action-converting-financial-intelligence-greater-operational-impact
  [12] Jiani Fan, Ziyao Liu, Hongyang Du, Jiawen Kang, Dusit Niyato, and Kwok-Yan Lam. 2024. Improving security in IoT-based human activity recognition: a correlation-based anomaly detection approach. IEEE Internet of Things Journal (2024)
  [13] Jiani Fan, Lwin Khin Shar, Ruichen Zhang, Ziyao Liu, Wenzhuo Yang, Dusit Niyato, Bomin Mao, and Kwok-Yan Lam. 2025. Deep Learning Approaches for Anti-Money Laundering on Mobile Transactions: Review, Framework, and Directions. arXiv preprint arXiv:2503.10058 (2025)
  [14] FATF. 2022. Virtual assets: red flag indicators. Technical Report. Financial Action Task Force (FATF)
  [15] Kasra Jamshidi, Rakesh Mahadasa, and Keval Vora. 2020. Peregrine: a pattern-aware graph mining system. In Proceedings of the Fifteenth European Conference on Computer Systems. 1–16
  [16] Kasra Jamshidi, Mugilan Mariappan, and Keval Vora. 2022. Anti-vertex for neighborhood constraints in subgraph queries. In Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA). 1–9
  [17] Fredrik Johannessen and Martin Jullum. 2023. Finding Money Launderers Using Heterogeneous Graph Neural Networks. arXiv:2307.13499 [cs.LG] https://arxiv.org/abs/2307.13499
  [18] KPMG. 2014. Global Anti-Money Laundering Survey 2014
  [19] Junhong Lin, Xiaojie Guo, Yada Zhu, Samuel Mitchell, Erik Altman, and Julian Shun. 2024. FraudGT: a simple, effective, and efficient graph transformer for financial fraud detection. In Proceedings of the 5th ACM International Conference on AI in Finance. 292–300
  [20] Wai Weng Lo, Gayan K Kulatilleke, Mohanad Sarhan, Siamak Layeghy, and Marius Portmann. 2023. Inspection-L: self-supervised GNN node embeddings for money laundering detection in bitcoin. Applied Intelligence 53, 16 (2023), 19406–19417
  [21] Patrick Mackey, Katherine Porterfield, Erin Fitzhenry, Sutanay Choudhury, and George Chin. 2018. A chronological edge-driven approach to temporal subgraph isomorphism. In 2018 IEEE International Conference on Big Data (Big Data). IEEE, 3972–3979
  [22] Neo4j, Inc. 2025. Cypher Query Language. https://neo4j.com/docs/cypher-manual/current/introduction/ Declarative graph query language
  [23] Ashwin Paranjape, Austin R Benson, and Jure Leskovec. 2017. Motifs in temporal networks. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. 601–610
  [24] Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao Schardl, and Charles Leiserson. 2020. EvolveGCN: Evolving graph convolutional networks for dynamic graphs. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 5363–5370
  [25] Stephen Schneider. 2004. Money laundering in Canada: a quantitative analysis of Royal Canadian Mounted Police cases. Journal of Financial Crime 11, 3 (2004), 282–291
  [26] Shivani Singh, Razia Sulthana, Tanvi Shewale, Vinay Chamola, Abderrahim Benslimane, and Biplab Sikdar. 2021. Machine-learning-assisted security and privacy provisioning for edge computing: A survey. IEEE Internet of Things Journal 9, 1 (2021), 236–260
  [27] Kiwhan Song, Mohamed Ali Dhraief, Muhua Xu, Locke Cai, Xuhao Chen, Arvind, and Jie Chen. 2024. Identifying Money Laundering Subgraphs on the Blockchain. arXiv:2410.08394 [cs.LG] https://arxiv.org/abs/2410.08394
  [28] Nishil Talati, Haojie Ye, Sanketh Vedula, Kuan-Yu Chen, Yuhan Chen, Daniel Liu, Yichao Yuan, David Blaauw, Alex Bronstein, Trevor Mudge, et al. 2022. Mint: An accelerator for mining temporal motifs. In 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 1270–1287
  [29] Maria Paola Tatulli, Tommaso Paladini, Mario D’Onghia, Michele Carminati, and Stefano Zanero. 2023. HAMLET: A transformer based approach for money laundering detection. In International Symposium on Cyber Security, Cryptology, and Machine Learning. Springer, 234–250
  [30] Trovares. 2024. Temporal Triangles xGT Datasets. https://datasets.trovares.com/synthetic/TT/index.html#pre-generated-datasets. [Accessed 25-12-2024]
  [31] Aashma Uprety and Danda B Rawat. 2020. Reinforcement learning for IoT security: A comprehensive survey. IEEE Internet of Things Journal 8, 11 (2020), 8693–8706
  [32] Mark Weber, Jie Chen, Toyotaro Suzumura, Aldo Pareja, Tengfei Ma, Hiroki Kanezashi, Tim Kaler, Charles E Leiserson, and Tao B Schardl. 2018. Scalable graph learning for anti-money laundering: A first look. arXiv preprint arXiv:1812.00076 (2018)
  [33] Mark Weber, Giacomo Domeniconi, Jie Chen, Daniel Karl I Weidele, Claudio Bellei, Tom Robinson, and Charles E Leiserson. 2019. Anti-money laundering in bitcoin: Experimenting with graph convolutional networks for financial forensics. arXiv preprint arXiv:1908.02591 (2019)
  [34] Ernst & Young. 2020. Economic crime in a digital age. https://assets.ey.com/content/dam/ey-sites/ey-com/en_gl/topics/assurance/assurance-pdfs/ey-economic-crime-digital-age.pdf
  [35] Yichao Yuan, Haojie Ye, Sanketh Vedula, Wynn Kaza, and Nishil Talati. 2023. Everest: GPU-Accelerated System For Mining Temporal Motifs. arXiv preprint arXiv:2310.02800 (2023)
  [36] Ruichen Zhang, Hongyang Du, Yinqiu Liu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, and Dong In Kim. 2024. Generative AI agents with large language model for satellite networks via a mixture of experts transmission. IEEE Journal on Selected Areas in Communications (2024)