SilIF: Silhouette-Augmented Isolation Forest for Unsupervised Transaction Fraud Detection

Venkatakrishnan Gopalakrishnan

arxiv: 2605.26135 · v1 · pith:373TBZRKnew · submitted 2026-05-21 · 💻 cs.LG

Pith reviewed 2026-06-30 16:57 UTC · model grok-4.3

classification 💻 cs.LG

keywords isolation forest · silhouette score · fraud detection · anomaly detection · unsupervised learning · transaction data · path length vectors

The pith

SilIF adds a silhouette score from clustered path length vectors to Isolation Forest and raises AUC-PR by 0.008 on real fraud data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SilIF to improve Isolation Forest for spotting fraudulent transactions when labels are unavailable. It extracts each transaction's path lengths across the forest trees as a vector, clusters these vectors into groups, and computes a silhouette score that shows how well the point belongs to its group versus others. This silhouette value is blended with the original Isolation Forest score through a single parameter alpha. Tests on a benchmark of roughly 590,000 real transactions show a modest but consistent gain, while the same step produces no gain on a synthetic dataset.

Core claim

SilIF extracts a vector of per-tree path lengths for each transaction, clusters these fingerprints into structural groups, and computes a silhouette score measuring fit to the assigned group versus the nearest alternative. The silhouette signal is combined with the base Isolation Forest score via a hyperparameter alpha. On the IEEE-CIS Fraud Detection benchmark of approximately 590K transactions, alpha set to 1.0 yields an average AUC-PR gain of 0.0080 over plain Isolation Forest across five seeds, with SilIF ahead on every seed. The paper also reports no gain on a synthetic credit-card dataset and describes conditions that separate the two outcomes.
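The steps in this claim can be sketched with scikit-learn. This is a rough illustration under stated assumptions, not the author's implementation: the function names, the use of `KMeans`, the cluster count, the rescaling of the silhouette into a "misfit" term, and the additive form of the blend are all guesses made for the example. Per-tree path lengths are rebuilt from each fitted tree because `IsolationForest` exposes no public method for them.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest
from sklearn.metrics import silhouette_samples


def avg_path_length(n):
    """c(n): average path length of an unsuccessful binary-search-tree lookup."""
    n = np.asarray(n, dtype=float)
    out = np.zeros_like(n)
    out[n == 2] = 1.0
    big = n > 2
    out[big] = 2.0 * (np.log(n[big] - 1.0) + np.euler_gamma) - 2.0 * (n[big] - 1.0) / n[big]
    return out


def path_length_vectors(forest, X):
    """Return an (n_samples, n_trees) matrix with one adjusted path length per tree."""
    columns = []
    for tree, feats in zip(forest.estimators_, forest.estimators_features_):
        X_sub = X[:, feats]
        leaves = tree.apply(X_sub)
        # Edges traversed = nodes on the root-to-leaf path minus one.
        depth = np.asarray(tree.decision_path(X_sub).sum(axis=1)).ravel() - 1.0
        # Add c(leaf size) to account for the subtree that was never grown.
        columns.append(depth + avg_path_length(tree.tree_.n_node_samples[leaves]))
    return np.column_stack(columns)


def silif_scores(X, alpha=1.0, n_clusters=4, n_estimators=100, seed=0):
    """Hypothetical SilIF-style score: base IF score plus alpha times a silhouette misfit."""
    forest = IsolationForest(n_estimators=n_estimators, random_state=seed).fit(X)
    base = -forest.score_samples(X)  # higher = more anomalous
    fingerprints = path_length_vectors(forest, X)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(fingerprints)
    sil = silhouette_samples(fingerprints, labels)  # values in [-1, 1]
    misfit = (1.0 - sil) / 2.0  # ASSUMPTION: poor cluster fit is treated as more suspicious
    return base + alpha * misfit, base, sil
```

`silhouette_samples` computes pairwise distances, so its cost grows quadratically with the number of points; on a dataset of roughly 590K transactions some form of subsampling or approximation would be needed, and this sketch does not attempt one.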

What carries the argument

A silhouette score computed on clustered per-tree path length vectors, which supplies a structural-fit signal that is combined with the base isolation score.
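For reference, the two standard quantities involved can be written out as follows. The Isolation Forest score and the silhouette are the textbook definitions (Liu et al. and Rousseeuw); the symbols T (number of trees), ψ (subsample size), and d (distance between fingerprints) are notation chosen here, and the exact form of the alpha blend is not given on this page, so it is left out.

```latex
% Per-tree path lengths of point x in a forest of T trees (the "fingerprint")
\mathbf{p}(x) = \bigl(h_1(x), \dots, h_T(x)\bigr)

% Standard Isolation Forest score with subsample size \psi; H(k) is the k-th harmonic number
s_{\mathrm{IF}}(x) = 2^{-\frac{\frac{1}{T}\sum_{t=1}^{T} h_t(x)}{c(\psi)}}, \qquad
c(\psi) = 2H(\psi - 1) - \frac{2(\psi - 1)}{\psi}

% Silhouette of point i assigned to cluster C_I, with distances d taken between fingerprints
a(i) = \frac{1}{|C_I| - 1} \sum_{j \in C_I,\, j \neq i} d(i, j), \qquad
b(i) = \min_{J \neq I} \frac{1}{|C_J|} \sum_{j \in C_J} d(i, j), \qquad
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}
```

The base score depends on the fingerprint only through its mean, whereas the silhouette depends on the whole vector. That is the room in which an additional signal could exist, although it does not by itself show that one does.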

If this is right

SilIF raises AUC-PR by 0.0080 on the IEEE-CIS benchmark and wins on all five seeds tested.
The gain is statistically supported by a paired t-test p-value of 0.046.
No improvement occurs on the synthetic Sparkov credit-card dataset.
A single parameter alpha lets users control how much the silhouette signal contributes.
The method stays as scalable and easy to deploy as standard Isolation Forest.
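The paired comparison across seeds can be reproduced in a few lines once per-seed AUC-PR values are available. The sketch below uses `scipy.stats.ttest_rel`; the function name is hypothetical and the numbers are placeholders invented for illustration, not the paper's per-seed results, which are not listed on this page.

```python
import numpy as np
from scipy import stats


def paired_seed_test(baseline, augmented):
    """Paired t-test on per-seed metric values (same seeds, same order)."""
    baseline = np.asarray(baseline, dtype=float)
    augmented = np.asarray(augmented, dtype=float)
    diff = augmented - baseline
    t_stat, p_value = stats.ttest_rel(augmented, baseline)
    return {
        "mean_gain": float(diff.mean()),
        "wins": int((diff > 0).sum()),
        "t": float(t_stat),
        "p": float(p_value),
    }


# Placeholder numbers for illustration only; NOT the paper's per-seed results.
baseline_ap = [0.100, 0.104, 0.098, 0.102, 0.101]
augmented_ap = [0.105, 0.118, 0.100, 0.113, 0.104]
print(paired_seed_test(baseline_ap, augmented_ap))
```

With five seeds the test has only four degrees of freedom, so the p-value moves noticeably when a single seed changes.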

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Real transaction data may contain more varied structural clusters than synthetic data, which is why the silhouette layer adds value only in the former case.
Users can decide whether to apply SilIF by measuring its effect on a held-out portion of their own data.
The same silhouette extraction from path-length vectors could be tested on other tree-based anomaly detectors.
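Measuring the effect on held-out data could look like the sketch below, assuming a labeled validation slice exists and that the base score and a silhouette-derived term have already been computed for it. The function name, the additive blend, and the alpha grid are assumptions for illustration; average precision from scikit-learn is used as a common estimator of AUC-PR, which may differ from how the paper computes it.

```python
import numpy as np
from sklearn.metrics import average_precision_score


def alpha_sweep(y_val, base, misfit, alphas=(0.0, 0.25, 0.5, 1.0, 2.0)):
    """Average precision of base + alpha * misfit for each alpha in the grid."""
    y_val = np.asarray(y_val)
    base = np.asarray(base, dtype=float)
    misfit = np.asarray(misfit, dtype=float)
    return {
        float(a): float(average_precision_score(y_val, base + a * misfit))
        for a in alphas
    }
```

The entry for alpha 0.0 is plain Isolation Forest, so the rest of the grid reads directly as gain or loss against that baseline. If no alpha beats it on the validation slice, that is a reason to skip the extra layer for that dataset.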

Load-bearing premise

The per-tree path length vectors contain additional structural information that produces a silhouette signal meaningfully independent of the original isolation score.

What would settle it

If the silhouette scores correlate strongly with the base isolation scores or if the average AUC-PR gain disappears on several additional large real fraud datasets, the added layer would not be contributing independent value.
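The first half of this check is cheap to run. A minimal sketch, assuming the base scores and silhouette scores are available as two aligned arrays; the function name is hypothetical, and reporting both Pearson and Spearman coefficients is a choice made here rather than something the paper specifies.

```python
import numpy as np
from scipy import stats


def redundancy_report(base_scores, silhouette_scores):
    """Linear and rank correlation between the two per-point signals."""
    base_scores = np.asarray(base_scores, dtype=float)
    silhouette_scores = np.asarray(silhouette_scores, dtype=float)
    pearson_r, _ = stats.pearsonr(base_scores, silhouette_scores)
    spearman_rho, _ = stats.spearmanr(base_scores, silhouette_scores)
    return {"pearson_r": float(pearson_r), "spearman_rho": float(spearman_rho)}
```

A rank correlation near plus or minus one would mean the silhouette term mostly reorders points the same way the base score does, in which case blending it in cannot change the ranking much. A low correlation is necessary for independent value but not sufficient, since an uncorrelated signal can still be noise.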

Figures

Figures reproduced from arXiv: 2605.26135 by Venkatakrishnan Gopalakrishnan.

read the original abstract

Unsupervised anomaly detection is widely used in transaction fraud detection where labels are scarce. Isolation Forest (IF) is among the most popular classical methods due to its scalability and ease of deployment. We propose SilIF, an augmentation of Isolation Forest that adds a silhouette-based scoring layer computed in a representation space induced by the trees of the forest. For each point, we extract a vector of per-tree path lengths, cluster these "fingerprints" into structural groups, and compute a silhouette score that measures how well the point fits its assigned group versus the nearest alternative. The silhouette signal is combined with the base IF score via a single hyperparameter alpha. On the IEEE-CIS Fraud Detection benchmark (~590K transactions, 3.5% fraud), SilIF with alpha=1.0 improves over plain Isolation Forest by +0.0080 AUC-PR on average across five seeds, with SilIF winning on all five seeds (paired t-test p=0.046). We also report results on a synthetic credit-card dataset (Sparkov) where the silhouette augmentation does not improve over plain IF, and we characterize the conditions that distinguish the two outcomes. The paper presents SilIF as a tunable, easy-to-deploy enhancement to Isolation Forest with honest reporting of when it helps and when it does not. Code at https://github.com/venkat15vk/silif-anomaly-detection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SilIF reports a small, consistent +0.008 AUC-PR lift on one fraud dataset by clustering Isolation Forest path-length vectors and adding a silhouette term, but shows no lift on a second dataset and leaves the independence of that term untested.

The core result is a modest, seed-consistent improvement from the silhouette layer on IEEE-CIS, with an honest negative on Sparkov. The authors release code and flag when the trick fails, which is useful for practitioners.

What is actually new is the specific step of turning per-tree path lengths into fingerprints, clustering them, and feeding the silhouette score into a linear combination controlled by alpha. This is a routine but legitimate extension of Isolation Forest rather than a new theoretical framework. The paper does well by documenting the negative result and by making the implementation available.

The main soft spot is that nothing in the reported experiments shows the silhouette signal carries information orthogonal to the original isolation score. Path lengths are already the raw material for the base score, so the clusters and silhouette could largely restate the same signal. No correlation, residual plot, or ablation is mentioned that would rule this out. The five-seed paired test reaches p=0.046, but the effect size remains tiny and the independence assumption is still doing the heavy lifting.

This work is for applied researchers who already run Isolation Forest on transaction data and want one more knob to tune. Readers chasing large methodological advances will find little here. It is coherent and transparent enough to deserve referee time; the experiments are reproducible and the negative result is reported plainly.

I would send it to review rather than desk-reject, mainly to let referees check the clustering details and ask for an explicit test of whether the added term is redundant.

Referee Report

2 major / 1 minor

Summary. The paper proposes SilIF, an augmentation to Isolation Forest that adds a silhouette-based scoring layer from clustering per-tree path length vectors. The silhouette signal is linearly combined with the base IF score using hyperparameter alpha. On the IEEE-CIS Fraud Detection benchmark, SilIF with alpha=1.0 achieves an average +0.0080 improvement in AUC-PR over plain IF across five seeds, winning on all seeds (paired t-test p=0.046). No improvement is observed on the Sparkov dataset.

Significance. If the silhouette augmentation provides genuinely independent structural information, this represents a lightweight, deployable enhancement to a standard method in unsupervised fraud detection. The release of code and the honest characterization of when the method helps versus when it does not are positive features.

major comments (2)

[Experimental evaluation on IEEE-CIS] The central claim that the silhouette layer augments the isolation score requires that the silhouette score carries information orthogonal to the base IF anomaly score. No correlation analysis between the two scores, nor an ablation study comparing SilIF to IF with a random or null silhouette component, is reported. This leaves open the possibility that the observed gain is due to the linear combination mechanics rather than added structural signal.
[Statistical analysis] The paired t-test with p=0.046 is reported for five seeds. With such a small number of replicates and a modest effect size (+0.008 AUC-PR), the result is sensitive to seed choice; the manuscript should report the individual seed AUC-PR values or bootstrap confidence intervals to substantiate robustness.
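One way to run the null-component ablation requested in the first comment is to shuffle the silhouette-derived term across points, which keeps its empirical distribution but breaks its link to individual transactions. The sketch below is an assumption about how such a test could be set up, not the paper's protocol: the function name, the additive blend, and the use of average precision as the AUC-PR estimator are all illustrative.

```python
import numpy as np
from sklearn.metrics import average_precision_score


def null_silhouette_ablation(y, base, misfit, alpha=1.0, n_perm=200, seed=0):
    """Compare the real blend against blends with a permuted (null) silhouette term."""
    y = np.asarray(y)
    base = np.asarray(base, dtype=float)
    misfit = np.asarray(misfit, dtype=float)
    rng = np.random.default_rng(seed)

    real = average_precision_score(y, base + alpha * misfit)
    null = np.array([
        average_precision_score(y, base + alpha * rng.permutation(misfit))
        for _ in range(n_perm)
    ])
    # Share of null runs that match or beat the real blend (with +1 smoothing).
    p_emp = (1 + int((null >= real).sum())) / (1 + n_perm)
    return {"real": float(real), "null": null, "p_emp": float(p_emp)}
```

If the real blend sits well above the null distribution, the gain is tied to which point receives which silhouette value rather than to the mechanics of adding a second term.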

minor comments (1)

The abstract promises to 'characterize the conditions that distinguish the two outcomes'; the full text should make this characterization detailed enough for readers to predict whether SilIF applies to their own data.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the evidence needed to support the contribution of the silhouette layer. We address each major comment below.

Referee: [Experimental evaluation on IEEE-CIS] The central claim that the silhouette layer augments the isolation score requires that the silhouette score carries information orthogonal to the base IF anomaly score. No correlation analysis between the two scores, nor an ablation study comparing SilIF to IF with a random or null silhouette component, is reported. This leaves open the possibility that the observed gain is due to the linear combination mechanics rather than added structural signal.

Authors: We agree that explicit evidence of orthogonality would strengthen the central claim. In the revised manuscript we will add (i) a Pearson correlation analysis between the base Isolation Forest anomaly scores and the silhouette scores on the IEEE-CIS data and (ii) an ablation in which the silhouette component is replaced by random values sampled from the empirical distribution of silhouette scores. The absence of improvement on the Sparkov dataset already indicates that the gain is not an artifact of the linear combination alone, but we will make this argument explicit with the new experiments. revision: yes
Referee: [Statistical analysis] The paired t-test with p=0.046 is reported for five seeds. With such a small number of replicates and a modest effect size (+0.008 AUC-PR), the result is sensitive to seed choice; the manuscript should report the individual seed AUC-PR values or bootstrap confidence intervals to substantiate robustness.

Authors: We acknowledge that five replicates are modest and that reporting only the mean and p-value limits assessment of robustness. In the revision we will add a table of per-seed AUC-PR values for both methods and will report bootstrap confidence intervals (with 10,000 resamples) for the mean difference in AUC-PR. revision: yes
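A percentile bootstrap for the mean difference could be sketched as follows. The rebuttal does not say whether resampling is over seeds or over test transactions; this example resamples the per-seed differences, which is an assumption, and the function name is hypothetical.

```python
import numpy as np


def bootstrap_mean_diff_ci(diffs, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of paired differences."""
    diffs = np.asarray(diffs, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row is one resample of the differences, drawn with replacement.
    idx = rng.integers(0, diffs.size, size=(n_resamples, diffs.size))
    means = diffs[idx].mean(axis=1)
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(means, [tail, 1.0 - tail])
    return float(diffs.mean()), float(lo), float(hi)
```

With only five differences the bootstrap distribution has few distinct values, so the interval is coarse. Resampling test transactions within each seed would give a finer picture at higher compute cost.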

Circularity Check

0 steps flagged

No circularity: purely empirical method with no derivation reducing to inputs

The paper introduces SilIF as an algorithmic augmentation to Isolation Forest: per-tree path lengths are extracted, clustered, and used to compute a silhouette score that is linearly combined with the base IF score via a single tunable hyperparameter alpha. All claims are empirical (AUC-PR gains on IEEE-CIS, no gain on Sparkov) with explicit reporting of tuning and negative results. No equations, predictions, or first-principles results are presented that reduce by construction to fitted parameters or self-citations. The central result is a benchmark comparison, not a derivation.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The method depends on one explicit hyperparameter and one structural assumption about the informativeness of path-length vectors; no new entities are postulated.

free parameters (1)

alpha
Weight blending the silhouette score with the base Isolation Forest score; set to 1.0 for the reported positive result.

axioms (1)

Domain assumption: Path-length vectors from the forest trees form clusters whose silhouette scores capture anomaly-relevant structure not already encoded in the isolation depth.
Invoked when the silhouette layer is added and when the authors interpret why the method succeeds on one dataset but not the other.

