Detecting Localized Density Anomalies in Multivariate Data via Coin-Flip Statistics
Pith reviewed 2026-05-22 22:37 UTC · model grok-4.3
The pith
EagleEye detects local density anomalies by testing whether k-nearest-neighbor membership sequences follow a binomial distribution.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EagleEye assigns each point an anomaly score by encoding its ordered k-nearest-neighbour list as a binary membership sequence and testing whether the cumulative number of successes in this sequence is consistent with a binomial null model. In the presence of a genuine local anomaly, neighbours will preferentially belong to one of the two datasets, yielding an excess of successes relative to the binomial null model. These local, pointwise detections are consolidated into interpretable anomaly sets through a deterministic refinement procedure that can also estimate the irreducible background and local density anomaly purity.
What carries the argument
Binary membership sequence of ordered k-nearest neighbors tested against a binomial null model for excess successes.
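A minimal Python sketch of this construction follows. It is one plausible reading of the description, not the authors' implementation: the function name `knn_membership_scores`, the use of `cKDTree`, and the choice of score (the largest negative log upper-tail p-value over neighbourhood sizes 1..k) are all assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import binom


def knn_membership_scores(test, reference, k=50):
    """Hypothetical helper: one anomaly score per test point.

    Assumption: the score is the largest -log P(S_m >= s_m) over m = 1..k,
    where s_m counts test-set members among the m nearest neighbours.
    The paper's exact statistic may differ.
    """
    pooled = np.vstack([test, reference])
    is_test = np.r_[np.ones(len(test), bool), np.zeros(len(reference), bool)]
    p_global = is_test.mean()  # global fraction of test points

    # Ask for k + 1 neighbours: with distinct points, the first hit is the
    # query point itself, so it is dropped.
    _, idx = cKDTree(pooled).query(test, k=k + 1)
    seq = is_test[idx[:, 1:]]         # (n_test, k) binary membership sequences
    cum = np.cumsum(seq, axis=1)      # cumulative successes after m neighbours
    sizes = np.arange(1, k + 1)       # m = 1..k, broadcast across rows

    # logsf(s - 1) = log P(S >= s) under Binomial(m, p_global)
    log_p = binom.logsf(cum - 1, sizes, p_global)
    return (-log_p).max(axis=1)
```

Points sitting in a region where the test set is over-dense should collect long runs of successes and therefore large scores; swapping the roles of the two sets would target under-densities in the same way.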
If this is right
- It can identify new physics signals at particle colliders even with systematic background differences.
- It reveals localized spatiotemporal changes in climate temperature patterns.
- It provides both pointwise anomaly scores and consolidated anomaly sets with purity estimates.
- It locates the known localized over- and under-densities in an artificial data example.
Where Pith is reading between the lines
- The method could extend to detecting anomalies in single datasets by comparing to a reference distribution.
- The refinement procedure might be useful for estimating anomaly sizes in other density-based methods.
- Binomial testing on neighbor sequences may generalize to other local statistics in high-dimensional data.
Load-bearing premise
In the absence of a local anomaly the sequence of kNN dataset memberships behaves exactly like independent coin flips with probability set by the global sample fractions.
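Written out, the premise reads roughly as below. The symbols are assumptions introduced here for illustration: $n_X$ and $n_Y$ are the two sample sizes, $b_j$ is 1 when the $j$-th nearest neighbour belongs to the set counted as a success, and $S_m$ is the cumulative count after $m$ neighbours.

```latex
b_j \overset{\text{i.i.d.}}{\sim} \mathrm{Bernoulli}(p),
\qquad p = \frac{n_X}{n_X + n_Y},
\qquad S_m = \sum_{j=1}^{m} b_j \sim \mathrm{Binomial}(m, p),
\quad m = 1, \dots, k,

\Pr(S_m \ge s) = \sum_{i=s}^{m} \binom{m}{i}\, p^{i} (1 - p)^{m - i}.
```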
What would settle it
Applying EagleEye to two samples drawn from identical distributions with no local differences should produce anomaly scores consistent with the binomial null model, with no excess detections after correction.
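A check of this kind could be sketched as follows. This is a simplified stand-in rather than EagleEye itself: it uses a single fixed neighbourhood size k, Gaussian toy data, and an uncorrected per-point threshold `alpha`, all of which are assumptions made for illustration, as is the function name `null_rejection_rate`.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import binom


def null_rejection_rate(n=5000, dim=3, k=20, alpha=0.05, seed=0):
    """Fraction of test points flagged when both samples share one distribution."""
    rng = np.random.default_rng(seed)
    test = rng.normal(size=(n, dim))
    reference = rng.normal(size=(n, dim))   # same distribution: no anomaly

    pooled = np.vstack([test, reference])
    is_test = np.r_[np.ones(n, bool), np.zeros(n, bool)]

    # k + 1 neighbours, dropping the query point itself
    _, idx = cKDTree(pooled).query(test, k=k + 1)
    successes = is_test[idx[:, 1:]].sum(axis=1)

    # Upper-tail binomial p-value P(S >= successes) with the global fraction
    pvals = binom.sf(successes - 1, k, is_test.mean())
    return float((pvals < alpha).mean())
```

If the null model is adequately calibrated, the returned fraction should not exceed `alpha` by much. Neighbourhoods of nearby points overlap, so the per-point tests are not independent and the fraction fluctuates more than a naive binomial error bar would suggest.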
Original abstract
Detecting localized differences between two samples is a central task in scientific data analysis, required for the identification of signal events, regime changes, or model mismatch. We introduce EagleEye, a method that pinpoints local over- and under-densities in multivariate feature spaces. EagleEye assigns each point an anomaly score by encoding its ordered k-nearest-neighbour list as a binary membership sequence and testing whether the cumulative number of successes in this sequence is consistent with a binomial (coin-flipping) null model. In the presence of a genuine local anomaly, neighbours will preferentially belong to one of the two datasets, yielding an excess of ``successes'' relative to the binomial null model. These local, pointwise detections are consolidated into interpretable anomaly sets through a deterministic refinement procedure that can also estimate the irreducible background and local density anomaly purity. We demonstrate EagleEye's efficacy in three scenarios. We first consider an artificial data example with known localized over- and under-densities. Second, we demonstrate how EagleEye may be used for new physics searches at particle collider experiments in the presence of systematic background modelling differences. Finally, we conduct a climate analysis study that reveals localized changes in spatiotemporal temperature-pattern recurrence.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces EagleEye, a method for detecting localized over- and under-densities between two multivariate samples. Each point receives an anomaly score by encoding its ordered k-nearest-neighbor list as a binary membership sequence and testing the cumulative successes against a binomial null model with success probability equal to the global dataset fraction; significant deviations flag anomalies, which are then consolidated via a deterministic refinement procedure that also estimates background and purity. The approach is demonstrated on synthetic data with known anomalies, particle-collider new-physics searches, and climate temperature-pattern analysis.
Significance. If the central construction is valid, EagleEye would supply an interpretable, computationally lightweight pointwise anomaly detector that directly yields statistical significance statements and consolidated anomaly sets, with potential utility in scientific domains requiring localized density comparisons. The binomial framing is simple and avoids explicit density estimation, which is a strength if the null is appropriately calibrated.
major comments (2)
- [Abstract and method description] Abstract and central construction of the anomaly score: the binomial null model treats the ordered kNN label sequence as i.i.d. Bernoulli trials with success probability equal to the global fraction. Under the global null, labels are sampled without replacement from a finite combined pool, so the exact distribution of the cumulative count is hypergeometric (or a distance-ordered variant thereof); relative to a plain hypergeometric, the binomial overstates null variability by the finite-population correction and so miscalibrates per-point p-values. This assumption is load-bearing for all downstream anomaly scores and refinement steps. The artificial-data demonstration does not expose the mismatch because the simulated densities are spatially uniform.
- [Demonstrations] Demonstrations (all three scenarios): no quantitative performance metrics, error bars, false-positive rates, or comparisons against baselines are reported. Without these, it is impossible to verify that the detected anomalies correspond to the claimed localized density differences rather than artifacts of the binomial approximation.
minor comments (1)
- [Abstract] The abstract uses inconsistent quotation marks around “successes”; standardize notation for the binary sequence and cumulative count throughout.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address the two major comments point by point below, indicating where revisions will be made.
- Referee: [Abstract and method description] Abstract and central construction of the anomaly score: the binomial null model treats the ordered kNN label sequence as i.i.d. Bernoulli trials with success probability equal to the global fraction. Under the global null, labels are sampled without replacement from a finite combined pool, so the exact distribution of the cumulative count is hypergeometric (or a distance-ordered variant thereof); relative to a plain hypergeometric, the binomial overstates null variability by the finite-population correction and so miscalibrates per-point p-values. This assumption is load-bearing for all downstream anomaly scores and refinement steps. The artificial-data demonstration does not expose the mismatch because the simulated densities are spatially uniform.
Authors: We agree that the exact null distribution is hypergeometric (or a distance-ordered variant) because labels are drawn without replacement. The binomial model is a computationally convenient approximation that becomes accurate when the pooled sample size N is much larger than k, which is the regime of the intended applications. The ordering by distance further modifies the exact distribution, but the binomial still provides a conservative ranking of anomalies. In the revised manuscript we will add an explicit discussion of this approximation and its validity conditions, together with a small-scale simulation comparing binomial and hypergeometric p-values.
Revision: partial
- Referee: [Demonstrations] Demonstrations (all three scenarios): no quantitative performance metrics, error bars, false-positive rates, or comparisons against baselines are reported. Without these, it is impossible to verify that the detected anomalies correspond to the claimed localized density differences rather than artifacts of the binomial approximation.
Authors: The three demonstrations are chosen to illustrate applicability across scientific domains rather than to serve as a benchmark study. We acknowledge that the absence of quantitative metrics, error bars, and baseline comparisons limits the ability to assess performance rigorously. In the revision we will add (i) a quantitative synthetic benchmark with known ground-truth anomalies, reporting precision-recall and false-positive rates with error bars over multiple realizations, and (ii) comparisons against standard baselines (local density ratio, LOF, and isolation forest) on the same data.
Revision: yes
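The metrics promised in this response could be tallied along the following lines. This is a generic sketch, not the authors' benchmark: the helper names `detection_metrics` and `summarize` are hypothetical, and the inputs are assumed to be boolean arrays marking flagged points and ground-truth anomalies.

```python
import numpy as np


def detection_metrics(flagged, truth):
    """Precision, recall and false-positive rate for one realization."""
    flagged = np.asarray(flagged, bool)
    truth = np.asarray(truth, bool)
    tp = np.sum(flagged & truth)
    fp = np.sum(flagged & ~truth)
    fn = np.sum(~flagged & truth)
    tn = np.sum(~flagged & ~truth)
    # Undefined ratios (empty denominators) are reported as NaN
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    fpr = fp / (fp + tn) if fp + tn else float("nan")
    return precision, recall, fpr


def summarize(runs):
    """Mean and sample standard deviation of each metric over realizations."""
    arr = np.array(runs, float)
    return np.nanmean(arr, axis=0), np.nanstd(arr, axis=0, ddof=1)
```

Calling `detection_metrics` once per synthetic realization and passing the collected tuples to `summarize` gives a mean and an error bar per metric; the same two helpers could be reused unchanged for any baseline run on the same data.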
Circularity Check
No circularity: binomial null is external reference, not self-derived
The paper constructs the per-point anomaly score by encoding the ordered kNN membership sequence as a binary string and comparing the cumulative successes to a binomial(k, p_global) null, where p_global is the overall dataset fraction. This null model is an external statistical benchmark drawn from standard Bernoulli-trial assumptions rather than any fitted parameter or self-referential equation extracted from the local neighborhood itself. No derivation step reduces the anomaly score to a quantity already determined by the input data labels or by a self-citation chain; the subsequent refinement into anomaly sets is a deterministic post-processing rule that does not feed back into the score definition. The method therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- k (number of nearest neighbors)
- significance threshold for binomial test
axioms (1)
- domain assumption: Under the null of no local anomaly, each neighbor's dataset membership is an independent Bernoulli trial with success probability equal to that dataset's global proportion of points.
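The independence axiom can be compared numerically with sampling without replacement from a finite pool. The sketch below assumes the plain hypergeometric case (k labels drawn from a pool of `n_total` points, `n_success` of which count as successes) and ignores any extra structure from distance ordering; the function names are hypothetical.

```python
from scipy.stats import binom, hypergeom


def tail_pvalues(successes, k, n_success, n_total):
    """Upper-tail P(S >= successes) under the binomial and hypergeometric nulls."""
    p = n_success / n_total
    p_binom = binom.sf(successes - 1, k, p)
    # scipy's hypergeom takes (population size, success states, draws)
    p_hyper = hypergeom.sf(successes - 1, n_total, n_success, k)
    return float(p_binom), float(p_hyper)


def variance_ratio(k, n_total):
    """Hypergeometric variance divided by binomial variance."""
    return (n_total - k) / (n_total - 1)
```

For k > 1 the ratio is below one, so in this plain case the binomial tail is the wider of the two and its p-values err on the conservative side. The gap shrinks as `n_total` grows relative to k.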
Forward citations
Cited by 1 Pith paper
- Unsupervised Domain Shift Detection with Interpretable Subspace Attribution
An unsupervised method detects domain shifts via localized density anomaly search in feature space, attributes the shift to a minimal subspace, and extracts balanced subsets from two unlabeled datasets.