pith. machine review for the scientific record.

arxiv: 2604.13512 · v1 · submitted 2026-04-15 · ⚛️ nucl-ex · physics.data-an

Recognition: unknown

Physics-driven Comparative Analysis of Various Statistical Distance Metrics and Normalizing Functions

Pith reviewed 2026-05-10 12:25 UTC · model grok-4.3

classification ⚛️ nucl-ex physics.data-an
keywords statistical distance metrics · normalizing functions · Kr-83 decay · HPGe spectrometer · parameter stability · probability distributions · Hellinger distance · Wasserstein distance

The pith

Kr-83 decay data shows a dimensionless parameter stays stable under sample and normalization changes for certain distance metrics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares several statistical distance metrics using electron and photon events from Kr-83 decays recorded by an HPGe spectrometer. It defines a dimensionless Parameter of Interest from the resulting probability distributions and checks whether that parameter holds steady when sample size, discretization length, or normalizing functions are altered. The analysis identifies which of the tested metrics preserve this stability and proposes required properties for any normalizing function used in such work. A reader would care because distribution comparisons underpin hypothesis tests, machine learning, and optimization across physics, so knowing which metrics remain reliable under realistic data variations helps avoid artifacts in conclusions.
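The PDF/PMF construction mentioned above can be pictured with a minimal sketch, assuming each event is reduced to a single scalar value and binned on a uniform grid. The function name `pmf_from_events`, its arguments, and the toy values are hypothetical and are not taken from the paper; `bin_width` stands in for the discretization length.

```python
from collections import Counter


def pmf_from_events(values, lo, hi, bin_width):
    """Bin scalar event values into a PMF on a uniform grid over [lo, hi).

    bin_width plays the role of the discretization length. Values outside
    [lo, hi) are dropped. All names are illustrative assumptions.
    """
    n_bins = int(round((hi - lo) / bin_width))
    counts = Counter()
    for v in values:
        if lo <= v < hi:
            # Guard against a float edge case placing v in bin n_bins.
            idx = min(int((v - lo) / bin_width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no events inside [lo, hi)")
    return [counts[i] / total for i in range(n_bins)]


# Toy values standing in for a per-event quantity (not real Kr-83 data).
toy = [0.1, 0.2, 0.25, 0.7, 0.9, 1.4, 1.6, 1.9]
pmf = pmf_from_events(toy, lo=0.0, hi=2.0, bin_width=0.5)
print(pmf)  # -> [0.375, 0.25, 0.125, 0.25]
```

Halving `bin_width` doubles the number of bins, which is the kind of variation a stable parameter would have to tolerate.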

Core claim

Using electron and photon events from the decay of Kr-83 collected with a high-purity germanium spectrometer under cryo-vacuum conditions, the analysis shows that the dimensionless Parameter of Interest remains stable under changes in sample size, discretization length, and the choice of normalizing function for specific distance metrics among Hellinger, Wasserstein, √JS, L∞, Kolmogorov-Smirnov, and Fisher-Rao. The paper also proposes a list of required properties for normalizing functions used in such comparisons.
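For orientation, four of the metrics named above act bin by bin on two PMFs defined over the same grid. The sketch below follows their standard textbook forms; the natural logarithm in the Jensen-Shannon term and the 2/π scaling of the Fisher-Rao value are assumptions made here, and the function names and toy PMFs are hypothetical.

```python
import math


def hellinger(p, q):
    # sqrt( 1/2 * sum (sqrt(p_i) - sqrt(q_i))^2 ), bounded in [0, 1]
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def sqrt_js(p, q):
    # Square root of the Jensen-Shannon divergence (natural log assumed).
    def term(a, b):
        # Convention: 0 * log(0) = 0
        return a * math.log(2.0 * a / (a + b)) if a > 0 else 0.0
    js = 0.5 * sum(term(a, b) + term(b, a) for a, b in zip(p, q))
    return math.sqrt(max(js, 0.0))


def l_inf(p, q):
    # Largest absolute bin-wise difference (Chebyshev distance).
    return max(abs(a - b) for a, b in zip(p, q))


def fisher_rao(p, q):
    # (2/pi) * arccos of the Bhattacharyya coefficient, scaled to [0, 1].
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return (2.0 / math.pi) * math.acos(min(1.0, max(-1.0, bc)))


# Toy PMFs that overlap in the middle bin only (illustrative values).
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
for name, fn in [("Hellinger", hellinger), ("sqrt(JS)", sqrt_js),
                 ("L_inf", l_inf), ("Fisher-Rao", fisher_rao)]:
    print(f"{name:<11}{fn(p, q):.4f}")
```

The four values differ for the same pair of PMFs, which is why a dimensionless parameter is needed before the metrics can be compared with one another.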

What carries the argument

The dimensionless Parameter of Interest (PoI) derived from the PDF/PMF of electron and photon events, whose stability is tested against variations in sample size, discretization length, and normalizing functions to rank the distance metrics.

If this is right

  • The PoI remains stable for particular metrics even when sample sizes change.
  • Stability holds across different discretization lengths for the same metrics.
  • Certain normalizing functions preserve PoI consistency better than others.
  • The proposed list of properties can guide selection or design of normalizing functions for reliable distribution comparisons.
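The discretization-length bullet above can be probed with a small sweep. This is a sketch under stated assumptions: two synthetic Gaussian samples stand in for the electron and photon populations, the Hellinger distance is used directly in place of the paper's PoI (whose construction is not given here), and the bin widths, range, and seed are arbitrary choices.

```python
import math
import random


def binned_pmf(values, lo, hi, width):
    # Uniform binning over [lo, hi); out-of-range values are dropped.
    n_bins = int(round((hi - lo) / width))
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def hellinger(p, q):
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


rng = random.Random(0)  # fixed seed so the sweep is repeatable
sample_a = [rng.gauss(0.0, 1.0) for _ in range(20000)]
sample_b = [rng.gauss(1.0, 1.0) for _ in range(20000)]

results = {}
for width in (0.1, 0.2, 0.5, 1.0):
    p = binned_pmf(sample_a, -6.0, 7.0, width)
    q = binned_pmf(sample_b, -6.0, 7.0, width)
    results[width] = hellinger(p, q)
    print(f"width={width:<4} hellinger={results[width]:.4f}")

spread = max(results.values()) - min(results.values())
print(f"range across widths: {spread:.4f}")
```

A small range across widths is what "stable under discretization length" would look like for this toy case; it says nothing about the Kr-83 data.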

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same stability test could be applied to data from other nuclear decays or detectors to check whether the same metrics rank highest.
  • In particle physics or imaging applications that rely on comparing binned distributions, preferring these stable metrics might reduce sensitivity to experimental details.
  • The listed properties for normalizing functions could be used to construct new functions optimized for a given detector's energy resolution.

Load-bearing premise

The single chosen PoI and the specific Kr-83 HPGe dataset are representative enough to support general conclusions about the relative merits of the distance metrics.

What would settle it

Repeating the exact PoI stability tests on a different isotope or on Monte Carlo simulated events and finding that the metrics previously reported as stable now produce varying PoI values would falsify the central claim.
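A Monte Carlo check of this kind is easiest to read when the true distance is known. The sketch below assumes two unit-width Gaussians, for which the Hellinger distance has a closed form, and compares binned estimates from simulated events against it. The function names, bin width, range, and seeds are hypothetical choices, not the paper's procedure.

```python
import math
import random


def hellinger_gauss(mu1, s1, mu2, s2):
    # Closed-form Hellinger distance between two 1D normal distributions.
    var_sum = s1 ** 2 + s2 ** 2
    bc = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(
        -((mu1 - mu2) ** 2) / (4.0 * var_sum))
    return math.sqrt(max(0.0, 1.0 - bc))


def simulate_hellinger(n, seed, width=0.25, lo=-6.0, hi=7.0):
    # Draw n events per population, bin them, and estimate the distance.
    rng = random.Random(seed)
    n_bins = int(round((hi - lo) / width))

    def pmf(mu):
        counts = [0] * n_bins
        kept = 0
        for _ in range(n):
            v = rng.gauss(mu, 1.0)
            if lo <= v < hi:
                counts[min(int((v - lo) / width), n_bins - 1)] += 1
                kept += 1
        return [c / kept for c in counts]

    p, q = pmf(0.0), pmf(1.0)
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


truth = hellinger_gauss(0.0, 1.0, 1.0, 1.0)
estimates = [simulate_hellinger(20000, seed) for seed in range(5)]
print(f"closed form: {truth:.4f}")
for seed, est in enumerate(estimates):
    print(f"seed {seed}: estimate={est:.4f} diff={est - truth:+.4f}")
```

If a metric reported as stable showed seed-to-seed or size-to-size drifts much larger than these differences on simulated events, that would be the kind of evidence described above.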

Figures

Figures reproduced from arXiv: 2604.13512 by Nafis Fuad (Center for Exploration of Energy and Matter, Indiana University, Bloomington, IN 47405, USA).

Figure 1
Figure 2. Examples of signals in HPGe detectors. Each signal consists of samples taken at ...
Figure 3. Spectrum of ...
Figure 4. Averaged waveforms from populations of two types of signal events considered in this analysis. The distinctiveness of ...
Figure 5. PMFs of the selected electron and photon events. It can be visually noticed the two distributions are disjoint, but not ...
Figure 6
Figure 7. Effect of normalization w.r.t. no normalization, calculated by ...
Figure 8. Standard deviations of the distances measured by all metrics under specific normalization functions. It can be observed ...
Figure 9. Discretization length (unitless) is varied to investigate the stability of ...
Figure 10. Effect of sample sizes used to generate the PDF/PMFs to the ...
Original abstract

Comparison of two probability density/mass functions (PDF/PMFs) is ubiquitous in various forms of scientific analysis, including machine learning, optimization problems, and hypothesis tests. A copious amount of distance metrics have already been proposed and are regularly being used in this regard. In this document, we report a data-driven systematic comparison among a few of such metrics. The metrics considered here are Hellinger distance, Wasserstein distances (1D), $\sqrt{JS}$ distance, $L_\infty$ norm, Kolmogorov-Smirnov distance, and Fisher-Rao metric. We perform this comparison using electron and photon events from a decaying $^{83}$Kr isotope, collected through an HPGe spectrometer operating under cryo-vacuum conditions. To accomplish this, first, a dimensionless Parameter of Interest (PoI) was established, then PDF/PMFs were generated from the data, and finally the stabilities of the PoI under various criteria, such as sample size, discretization length, and normalizing functions, were studied and the results were summarized. In this report, we also propose a list of properties that a normalizing function should have and utilize them in the comparison.
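The Kolmogorov-Smirnov and 1D Wasserstein distances in the abstract's list work on cumulative quantities rather than bin-by-bin differences. The sketch below is one possible reading, assuming PMFs on a shared uniform grid for KS and Wasserstein-1, and equal-size raw samples for the general-order Wasserstein distance. The two-sided absolute value in KS, the averaging over order statistics, and all function names are choices made here rather than details confirmed by the paper.

```python
from itertools import accumulate


def ks_distance(p, q):
    # Largest absolute gap between the two CDFs on the shared grid.
    return max(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def wasserstein1_grid(p, q, width=1.0):
    # In 1D, W1 equals the area between the two CDFs; on a uniform grid
    # that is width * sum |F_p - F_q|.
    return width * sum(abs(a - b)
                       for a, b in zip(accumulate(p), accumulate(q)))


def wasserstein_samples(x, y, order=1):
    # For equal-size samples the empirical quantile functions are the
    # sorted samples, so the optimal coupling pairs order statistics.
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    xs, ys = sorted(x), sorted(y)
    total = sum(abs(a - b) ** order for a, b in zip(xs, ys))
    return (total / len(xs)) ** (1.0 / order)


# Toy PMFs: q is p shifted by one bin (illustrative values).
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(ks_distance(p, q))                                # -> 0.5
print(wasserstein1_grid(p, q, width=1.0))               # -> 1.0
print(wasserstein_samples([0, 0], [1, 3], order=1))     # -> 2.0
print(wasserstein_samples([0, 0], [1, 3], order=2))     # sqrt(5)
```

A one-bin shift gives a Wasserstein-1 value of exactly one bin width, while KS saturates at the largest CDF gap. That difference in behavior is what makes a side-by-side stability comparison informative.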

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript conducts a data-driven comparison of statistical distance metrics (Hellinger distance, 1D Wasserstein distances, √JS distance, L∞ norm, Kolmogorov-Smirnov distance, and Fisher-Rao metric) applied to electron and photon events from Kr-83 decay measured with an HPGe spectrometer. A dimensionless Parameter of Interest (PoI) is defined from the data; PDFs/PMFs are constructed; and the stability of this PoI is examined under changes in sample size, discretization length, and choice of normalizing function. The authors also propose a list of desirable properties for normalizing functions and use them to evaluate the metrics.

Significance. If the reported stabilities hold and the metric ranking generalizes, the work supplies concrete guidance for choosing distance measures in nuclear-spectroscopy analyses where binning and normalization choices are routine. The explicit list of normalizing-function properties adds a modest conceptual contribution. The use of real experimental HPGe data rather than purely synthetic distributions is a positive feature.

major comments (2)
  1. [Results / stability analysis] The stability conclusions for the chosen distance metrics rest exclusively on the single Kr-83 HPGe electron/photon dataset. Because the underlying spectral shape may interact favorably with particular metrics, the ranking cannot yet be treated as general without either (i) tests on additional distributions (multi-modal, continuous non-spectral, or non-nuclear) or (ii) a theoretical argument linking metric properties to PoI invariance independent of the PDF.
  2. [Section on normalizing functions] The proposed list of properties that a normalizing function “should have” is presented without derivation from first principles or verification that the properties are necessary and sufficient outside the specific PoI and spectral shape used here. This makes it difficult to assess whether the properties are broadly prescriptive or post-hoc for the Kr-83 case.
minor comments (2)
  1. Quantitative measures of stability (e.g., standard deviation or range of the PoI across repeated subsamples) should be reported explicitly alongside any qualitative statements; error bars or tabulated values would allow readers to judge the practical significance of observed differences.
  2. The abstract and introduction would benefit from a concise, explicit definition or formula for the dimensionless PoI early in the text so that its construction and claimed independence from the distance metrics can be evaluated without searching later sections.
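The spread measure requested in minor comment 1 can be sketched as a repeated-subsample experiment. The example assumes synthetic Gaussian pools in place of the measured events and uses a binned Kolmogorov-Smirnov distance as a stand-in for the PoI. The names `ks_binned` and `subsample_spread`, the pool size, and the number of repeats are hypothetical.

```python
import random
import statistics
from itertools import accumulate


def binned_pmf(values, lo, hi, width):
    n_bins = int(round((hi - lo) / width))
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def ks_binned(x, y, lo=-6.0, hi=7.0, width=0.1):
    # Largest absolute CDF gap between the two binned samples.
    p = binned_pmf(x, lo, hi, width)
    q = binned_pmf(y, lo, hi, width)
    return max(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def subsample_spread(x, y, size, repeats, rng):
    # Mean and standard deviation of the distance over random subsamples
    # drawn without replacement from each pool.
    vals = [ks_binned(rng.sample(x, size), rng.sample(y, size))
            for _ in range(repeats)]
    return statistics.mean(vals), statistics.stdev(vals)


rng = random.Random(1)
pool_a = [rng.gauss(0.0, 1.0) for _ in range(20000)]
pool_b = [rng.gauss(1.0, 1.0) for _ in range(20000)]

summary = {}
for size in (500, 2000, 8000):
    summary[size] = subsample_spread(pool_a, pool_b, size, 30, rng)
    mean, std = summary[size]
    print(f"n={size:<5} mean={mean:.4f} std={std:.4f}")
```

Tabulating the mean and standard deviation per sample size in this way would let a reader judge whether a reported difference between metrics exceeds the subsampling noise.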

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and have revised the text to clarify the scope and limitations of the study.

Point-by-point responses
  1. Referee: The stability conclusions for the chosen distance metrics rest exclusively on the single Kr-83 HPGe electron/photon dataset. Because the underlying spectral shape may interact favorably with particular metrics, the ranking cannot yet be treated as general without either (i) tests on additional distributions (multi-modal, continuous non-spectral, or non-nuclear) or (ii) a theoretical argument linking metric properties to PoI invariance independent of the PDF.

    Authors: We agree that the stability analysis and metric comparisons are performed exclusively on the Kr-83 electron and photon spectra from the HPGe detector. The manuscript is framed as a data-driven case study using this experimental dataset, which is representative of nuclear-spectroscopy applications. In the revised version we have added an explicit statement in the discussion and conclusions sections noting that the observed PoI stabilities and relative metric performance are tied to the shape of these particular spectra. We clarify that broader claims of generality would require either additional empirical tests on other distributions or a theoretical derivation, neither of which is provided here. revision: yes

  2. Referee: The proposed list of properties that a normalizing function “should have” is presented without derivation from first principles or verification that the properties are necessary and sufficient outside the specific PoI and spectral shape used here. This makes it difficult to assess whether the properties are broadly prescriptive or post-hoc for the Kr-83 case.

    Authors: The listed properties were identified empirically by examining which normalization choices preserved the stability of the defined PoI across changes in sample size and binning for the Kr-83 data. No first-principles derivation or proof of necessity/sufficiency is given because the list is presented as a practical guide emerging from this specific analysis. In the revised manuscript we have rephrased the relevant section to describe the properties as empirically motivated suggestions for analyses of this type, rather than as generally prescriptive, and we have added a short caveat regarding their potential context dependence. revision: partial
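Since the paper's list of properties is not reproduced on this page, the sketch below checks a few generic properties that are commonly asked of a bounded transform applied to a distance: zero at zero, values in [0, 1], non-decreasing, and subadditive. These four checks are assumptions made here, not the authors' list. The candidate functions follow the legend of the paper's Fig. 1 as far as it can be read from the extracted text, and the grid is an arbitrary choice.

```python
import math

# Candidate normalizing functions on [0, inf); reconstructed from the
# extracted figure legend, so treat the exact forms as assumptions.
CANDIDATES = {
    "log(1+x)/(1+log(1+x))": lambda x: math.log1p(x) / (1.0 + math.log1p(x)),
    "x/(1+x)": lambda x: x / (1.0 + x),
    "1-exp(-x)": lambda x: 1.0 - math.exp(-x),
    "(2/pi)*arctan(x)": lambda x: (2.0 / math.pi) * math.atan(x),
}

# Zero plus a logarithmic grid from 1e-4 to 1e10.
GRID = [0.0] + [10.0 ** k for k in range(-4, 11)]


def check(n, grid=GRID, tol=1e-12):
    vals = [n(x) for x in grid]
    return {
        "zero_at_zero": vals[0] == 0.0,
        "bounded_0_1": all(0.0 <= v <= 1.0 for v in vals),
        "non_decreasing": all(b >= a for a, b in zip(vals, vals[1:])),
        # Subadditivity, n(a+b) <= n(a) + n(b), is one standard sufficient
        # ingredient for a transform to preserve the triangle inequality.
        "subadditive": all(n(a + b) <= n(a) + n(b) + tol
                           for a in grid for b in grid),
    }


report = {name: check(fn) for name, fn in CANDIDATES.items()}
for name, result in report.items():
    print(name, result)
```

A grid check like this can only falsify a property, not prove it. A first-principles treatment would still be needed to address the referee's concern about necessity and sufficiency.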

Circularity Check

0 steps flagged

No significant circularity; purely data-driven empirical comparison.

full rationale

The paper describes establishing a dimensionless PoI from Kr-83 HPGe data, generating PDFs/PMFs, and empirically testing PoI stability under changes in sample size, discretization length, and normalizing functions. No equations, derivations, fitted parameters, or self-citations are shown that would make any result equivalent to its inputs by construction. The comparison and proposed properties for normalizing functions are presented as data-driven observations rather than theoretical reductions or self-referential definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the PoI is described as dimensionless but its construction is not given.

pith-pipeline@v0.9.0 · 5522 in / 1157 out tokens · 75453 ms · 2026-05-10T12:25:30.209412+00:00 · methodology

Reference graph

Works this paper leans on

35 extracted references · 20 canonical work pages

  1. [1]

    Hellinger distance [1]: $H(p, q) = \sqrt{\sum \frac{1}{2} \left(\sqrt{p} - \sqrt{q}\right)^2}$ (1)

  2. [2]

    Wasserstein-1 distance [13, 16, 17]: $W_1(p, q) = \sum \left|F_p^{-1} - F_q^{-1}\right|$ (2), where $F_p^{-1}$ and $F_q^{-1}$ are the quantile functions of $p$ and $q$

  3. [3]

    Wasserstein-2 distance [14]: $W_2(p, q) = \sqrt{\sum \left(F_p^{-1} - F_q^{-1}\right)^2}$ (3), where $F_p^{-1}$ and $F_q^{-1}$ are the quantile functions of $p$ and $q$

  4. [4]

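    √Jensen-Shannon [5, 9]: $\sqrt{JS}(p, q) = \sqrt{\sum \frac{1}{2} \left(p \log\frac{2p}{p+q} + q \log\frac{2q}{p+q}\right)}$ (4)
    5. $L_\infty$ norm / Chebyshev distance: $L_\infty(p, q) = \lim_{r \to \infty} \left(\sum |p - q|^r\right)^{1/r} = \max(|p - q|)$ (5)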

  5. [5]

    Kolmogorov-Smirnov distance: $KS(p, q) = \max(F_p - F_q)$ (6), where $F_p$ and $F_q$ are the CDFs of $p$ and $q$

  6. [6]

    Fisher-Rao distance: A closed-form expression of the Fisher-Rao distance for a categorical distribution was used. A few other closed-form results can be found in [18]. $FR(p, q) = \frac{2}{\pi} \arccos\left(\sum \sqrt{pq}\right)$ (7), where $p(x)$ and $q(x)$ are either PDFs or PMFs in the usual sense: $p(x)$ and $p_m(x)$ are a PDF and a PMF, respectively, on the parameter space $\chi = \{x\}$ if they satisfy following prop...

  7. [7]

    True domains and ranges of the $n(x)$s are supersets of the domains and ranges defined in Def. 1, not equal; e.g. the true domain of $n_1(x)$ is $(-1, \infty)$ and of $n_2(x)$ is $(-\infty, \infty) \setminus \{-1\}$, not $[0, \infty)$. [Figure: $n(x)$ from 0.0 to 1.0 versus $x$ on a logarithmic axis from $10^{-4}$ to $10^{10}$; curves $\frac{\log(1+x)}{1+\log(1+x)}$, $\frac{x}{1+x}$, $1 - e^{-x}$, $\frac{2}{\pi}\arctan(x)$.] FIG. 1. $n(x)\ \forall x : x \in \{-5 \le \log(x) \le 10\}$. It can be observed that $n_{2-4}(x)$ sa...

  8. [8]

    Ricardo Aler, José M. Valls, and Henrik Boström. Study of Hellinger distance as a splitting metric for random forests in balanced and imbalanced classification datasets. Expert Systems with Applications, 149:113264, 2020. URL https://www.sciencedirect.com/science/article/abs/pii/S0957417420300890

  9. [9]

    Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. arXiv preprint, 2017. URL https://arxiv.org/pdf/1701.07875.pdf

  10. [10]

    Yong-Hyuk Kim and Byung-Ro Moon. Distance measures in genetic algorithms. In Kalyanmoy Deb, editor, Genetic and Evolutionary Computation – GECCO 2004, volume 3103 of Lecture Notes in Computer Science, pages 392–403, Berlin, Heidelberg, 2004. Springer. doi:10.1007/978-3-540-24855-2_43. URL https://link.springer.com/chapter/10.1007/978-3-540-24855-2_43

  11. [11]

    Jian Lin. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1):145–151, 1991

  12. [12]

    Dominik M. Endres and Johannes E. Schindelin. A new metric for probability distributions. IEEE Transactions on Information Theory, 49(7):1858–1860, 2003. doi:10.1109/TIT.2003.813506. URL https://research-repository.st-andrews.ac.uk/bitstream/handle/10023/1591/Endres2003-IEEETransInfTheory49-NewMetric.pdf

  13. [13]

    Thomas Kailath. The divergence and Bhattacharyya distance measures in signal selection. IEEE Transactions on Communication Technology, 15(1):52–60, Feb 1967. doi:10.1109/TCOM.1967.1089532

  14. [14]

    Shivakumar Jolad, Ahmed Roman, Mahesh C. Shastry, Mihir Gadgil, and Ayanendranath Basu. A new family of bounded divergence measures and application to signal detection. arXiv preprint, 2016. URL https://arxiv.org/pdf/1201.0418

  15. [15]

    Frank Nielsen. Revisiting Chernoff information with likelihood ratio exponential families. arXiv preprint arXiv:2207.03745, Aug 2022. URL https://arxiv.org/abs/2207.03745. arXiv:2207.03745v4

  16. [16]

    M. Casas, P. W. Lamberti, A. Plastino, and A. R. Plastino. Jensen-Shannon divergence, Fisher information, and Wootters' hypothesis. arXiv preprint arXiv:quant-ph/0407147v1, July 2004. URL https://arxiv.org/pdf/quant-ph/0407147

  17. [17]

    Herbert Edelsbrunner, Žiga Virk, and Hubert Wagner. Topological data analysis in information space. In 35th International Symposium on Computational Geometry (SoCG 2019), volume 129 of Leibniz International Proceedings in Informatics, pages 31:1–31:14, Dagstuhl, Germany, 2019. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.SoCG.2019...

  18. [18]

    Suvrit Sra. Metrics induced by Jensen-Shannon and related divergences on positive definite matrices. arXiv preprint arXiv:1911.02643, 2019. URL https://arxiv.org/abs/1911.02643

  19. [19]

    Frank Nielsen. On a generalization of the Jensen–Shannon divergence and the Jensen–Shannon centroid. Entropy, 22(2):221, 2020. doi:10.3390/e22020221. URL https://www.mdpi.com/1099-4300/22/2/221

  20. [20]

    Aaditya Ramdas, Nicolás García Trillos, and Marco Cuturi. On Wasserstein two sample testing and related families of nonparametric tests. arXiv preprint, 2015. URL https://arxiv.org/pdf/1509.02237.pdf

  21. [21]

    Chengliang Tang, Nathan Lenssen, Ying Wei, and Tian Zheng. Wasserstein distributional learning via majorization-minimization. In Proceedings of the 26th International Conference on Artificial Intelligence and Statistics, volume 206 of Proceedings of Machine Learning Research, pages 1417–1436, Valencia, Spain, 2023. PMLR. URL https://proceedings.mlr.press/...

  22. [22]

    J. Bruno and I. Weiss. Metric axioms: A structural study. arXiv preprint, 2014. URL https://arxiv.org/pdf/1311.0297.pdf

  23. [23]

    Lilian Weng. From GAN to WGAN. arXiv preprint arXiv:1904.08994, 2019. URL https://arxiv.org/abs/1904.08994

  24. [24]

    SciPy Developers. scipy.stats.wasserstein_distance, SciPy v1.11.1 manual, 2023. URL https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html

  25. [25]

    Henrique K. Miyamoto, Fábio C. C. Meneghetti, Julianna Pinele, and Sueli I. R. Costa. On closed-form expressions for the Fisher–Rao distance. arXiv preprint, 2024. URL https://arxiv.org/pdf/2304.14885.pdf

  26. [26]

    George Dragomir and Andrew Nicas. Metric transforms yielding Gromov hyperbolic spaces. arXiv preprint arXiv:1710.05078v2, July 2018. URL https://arxiv.org/pdf/1710.05078

  27. [27]

    Constantin P. Niculescu. Old and new on strongly subadditive/superadditive functions. arXiv preprint, 2025. URL https://arxiv.org/pdf/2501.13695v1. Dedicated to Professor Lars-Erik Persson, on the occasion of his 80th anniversary

  28. [28]

    Frank Edzards, Lukas Hauertmann, Iris Abt, Chris Gooch, Björn Lehnert, Xiang Liu, Susanne Mertens, David C. Radford, Oliver Schulz, and Michael Willers. Surface characterization of p-type point contact germanium detectors. arXiv preprint,

  29. [29]

    URL https://arxiv.org/pdf/2105.14487.pdf. Preprint submitted to Eur. Phys. J. C

  30. [30]

    P. N. Luke, F. S. Goulding, N. W. Madden, and R. H. Pehl. Low capacitance large volume shaped-field germanium detector. In IEEE Nuclear Science Symposium, Orlando, FL, USA, 1988. Lawrence Berkeley Laboratory report LBL-25694, CONF-881103-35

  31. [31]

    Majorana Collaboration and Abgrall et al. The Majorana Demonstrator readout electronics system. Journal of Instrumentation, 17(05):T05003, May 2022. doi:10.1088/1748-0221/17/05/T05003. URL https://dx.doi.org/10.1088/1748-0221/17/05/T05003

  32. [32]

    National Nuclear Data Center. NuDat 3.0: Decay radiation search results for 83Kr, 2025. URL https://www.nndc.bnl.gov/nudat3/decaysearchdirect.jsp?nuc=83Kr&unc=nds

  33. [33]

    Gruszko, J. and Arnquist et al. α-event characterization and rejection in point-contact HPGe detectors. Eur. Phys. J. C, 82(3):212, 2022. doi:10.1140/epjc/s10052-022-10161-y. URL https://link.springer.com/article/10.1140/epjc/s10052-022-10161-y

  34. [34]

    H. Tan, M. Momayezi, A. Fallu-Labruyere, Y. X. Chu, and W. K. Warburton. A fast digital filter algorithm for gamma-ray spectroscopy with double-exponential decaying scintillators. IEEE Transactions on Nuclear Science, 51(4):1541–1546, 2004. doi:10.1109/TNS.2004.832984. URL https://s3.us-west-1.amazonaws.com/download.xia.com/documents/publications/2004-ta...

  35. [35]

    Majorana Collaboration, S. I. Alvis, et al. Multisite event discrimination for the Majorana Demonstrator. Physical Review C, 99(6):065501, June 2019. doi:10.1103/PhysRevC.99.065501. Published 7 June 2019, received 17 January 2019.