Area-based epigraph and hypograph indices for functional outlier detection
Pith reviewed 2026-05-19 06:25 UTC · model grok-4.3
The pith
New area-based epigraph and hypograph indices detect functional outliers by measuring deviation areas.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that the Area-Based Epigraph Index (ABEI) and Area-Based Hypograph Index (ABHI) quantify the area between functions and thereby detect both magnitude and shape outliers. Computing these indices on the original curves and on their first and second derivatives recasts functional outlier detection as a multivariate problem, to which conventional techniques can be applied.
What carries the argument
The Area-Based Epigraph Index (ABEI) and Area-Based Hypograph Index (ABHI), which compute the integrated area between a given curve and every other curve in the sample.
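A minimal NumPy sketch of the two indices, assuming curves observed on a common grid and using the unnormalized sums quoted elsewhere on this page (ABEI adds up the areas where sample curves lie above the given curve, ABHI the areas where they lie below). The function names and the trapezoidal quadrature are illustrative choices, not the authors' implementation.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal rule along the last axis of y on the grid t."""
    return np.sum((y[..., 1:] + y[..., :-1]) * np.diff(t) / 2.0, axis=-1)

def area_indices(curves, t):
    """Return (abei, abhi), each of shape (n,), for curves of shape (n, m)."""
    # diff[j, i, :] = x_i(t) - x_j(t) for every ordered pair of curves
    diff = curves[None, :, :] - curves[:, None, :]
    # ABEI(x_j): total area where the other curves lie above x_j
    abei = trapezoid(np.clip(diff, 0.0, None), t).sum(axis=1)
    # ABHI(x_j): total area where the other curves lie below x_j
    abhi = trapezoid(np.clip(-diff, 0.0, None), t).sum(axis=1)
    return abei, abhi

# Three constant curves at heights 0, 1 and 5 on [0, 1]
t = np.linspace(0.0, 1.0, 101)
curves = np.array([np.zeros_like(t), np.ones_like(t), 5.0 * np.ones_like(t)])
abei, abhi = area_indices(curves, t)
print(abei)  # approximately [6, 4, 0]
print(abhi)  # approximately [0, 1, 9]
```

In this toy sample the curve at height 5 gets a large ABHI because the index grows with the size of the gap, which is the property the page credits for sensitivity to magnitude outliers.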
If this is right
- Outlier detection in functional data now accounts for magnitude deviations in addition to shape anomalies.
- The EHyOut procedure proves stable and competitive or superior in extensive simulation studies under varied contamination.
- Applications to Spanish weather data and United Nations population data demonstrate the method's ability to identify meaningful outliers.
Where Pith is reading between the lines
- One could explore whether using only first derivatives or adding third derivatives alters detection power in specific applications.
- The framework might integrate with other functional data tools such as functional principal components for preprocessing.
- Performance could vary with the choice of multivariate outlier method, suggesting comparisons across several detectors.
Load-bearing premise
Quantifying the area between curves in ABEI and ABHI simultaneously captures magnitude and shape deviations, and combining these with derivative information allows multivariate outlier detection to identify functional outliers reliably.
What would settle it
Simulation results in which curves deviate substantially in magnitude but EHyOut does not flag them as outliers more effectively than shape-based alternatives.
Original abstract
Detecting outliers in Functional Data Analysis is challenging because curves can stray from the majority in many different ways. The Modified Epigraph Index (MEI) and Modified Hypograph Index (MHI) rank functions by the fraction of the domain on which one curve lies above or below another. While effective for spotting shape anomalies, their construction limits their ability to flag magnitude outliers. This paper introduces two new metrics, the Area-Based Epigraph Index (ABEI) and Area-Based Hypograph Index (ABHI) that quantify the area between curves, enabling simultaneous sensitivity to both magnitude and shape deviations. Building on these indices, we present EHyOut, a robust procedure that recasts functional outlier detection as a multivariate problem: for every curve, and for its first and second derivatives, we compute ABEI and ABHI and then apply multivariate outlier-detection techniques to the resulting feature vectors. Extensive simulations show that EHyOut remains stable across a wide range of contamination settings and often outperforms established benchmark methods. Moreover, applications to Spanish weather data and United Nations world population data further illustrate the practical utility and meaningfulness of this methodology.
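To make the contrast in the abstract concrete, the two families of indices can be written side by side. The area-based forms follow the unnormalized sums quoted elsewhere on this page; the MEI and MHI forms are the usual domain-fraction definitions from the epigraph/hypograph literature. The symbols n (sample size), I (domain), and λ (Lebesgue measure) are notation assumed here, and the paper may normalize differently.

```latex
\begin{align*}
\mathrm{MEI}_n(x)  &= \frac{1}{n}\sum_{i=1}^{n}
  \frac{\lambda\bigl(\{t \in I : x_i(t) \ge x(t)\}\bigr)}{\lambda(I)}, &
\mathrm{MHI}_n(x)  &= \frac{1}{n}\sum_{i=1}^{n}
  \frac{\lambda\bigl(\{t \in I : x_i(t) \le x(t)\}\bigr)}{\lambda(I)}, \\[4pt]
\mathrm{ABEI}_n(x) &= \sum_{i=1}^{n}\int_I \bigl(x_i(t) - x(t)\bigr)_+ \,dt, &
\mathrm{ABHI}_n(x) &= \sum_{i=1}^{n}\int_I \bigl(x(t) - x_i(t)\bigr)_+ \,dt.
\end{align*}
```

If a curve x lies above every sample curve on all of I, then MHI_n(x) = 1 no matter how large the gap is, whereas ABHI_n(x) equals the summed integrals of x(t) - x_i(t) and grows with the gap. That is the sense in which the area-based indices respond to magnitude as well as shape.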
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Area-Based Epigraph Index (ABEI) and Area-Based Hypograph Index (ABHI) to address limitations of the Modified Epigraph Index (MEI) and Modified Hypograph Index (MHI) in functional outlier detection. These new indices quantify the area between curves rather than the fraction of the domain, enabling sensitivity to both magnitude and shape deviations. The EHyOut procedure computes ABEI and ABHI on each curve and on its first and second derivatives, yielding six-dimensional feature vectors that are then passed to standard multivariate outlier detectors. Simulations across contamination settings are reported to show stability and frequent outperformance of benchmarks, with applications to Spanish weather data and UN world population data illustrating practical use.
Significance. If the results hold, the contribution lies in a direct, area-based extension of epigraph/hypograph ideas that simultaneously targets magnitude and shape outliers without requiring new parametric assumptions. The simulation design and real-data examples supply concrete evidence of utility; the method's recasting of the problem as a fixed-dimensional multivariate task is a clear practical strength.
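The six-dimensional construction described in the summary can be sketched as follows. This is an illustration under stated assumptions, not the EHyOut implementation: the function names are hypothetical, derivatives are taken by finite differences on curves assumed to be already smooth, and the detector is a simple median/MAD robust z-score rule swapped in for whichever multivariate outlier method the paper uses.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal rule along the last axis of y on the grid t."""
    return np.sum((y[..., 1:] + y[..., :-1]) * np.diff(t) / 2.0, axis=-1)

def area_indices(curves, t):
    """Unnormalized ABEI and ABHI for every row of curves (shape (n, m))."""
    diff = curves[None, :, :] - curves[:, None, :]  # diff[j, i] = x_i - x_j
    abei = trapezoid(np.clip(diff, 0.0, None), t).sum(axis=1)
    abhi = trapezoid(np.clip(-diff, 0.0, None), t).sum(axis=1)
    return abei, abhi

def six_dim_features(curves, t):
    """Stack ABEI/ABHI of the curves and of their first two derivatives: (n, 6)."""
    d1 = np.gradient(curves, t, axis=1)  # finite differences, no smoothing step
    d2 = np.gradient(d1, t, axis=1)
    columns = []
    for data in (curves, d1, d2):
        columns.extend(area_indices(data, t))
    return np.column_stack(columns)

def flag_outliers(features, cutoff=3.5):
    """Stand-in detector: flag rows whose robust z-score exceeds cutoff in any feature."""
    med = np.median(features, axis=0)
    mad = 1.4826 * np.median(np.abs(features - med), axis=0)
    mad = np.where(mad > 0, mad, 1.0)  # guard against zero spread
    return np.any(np.abs(features - med) / mad > cutoff, axis=1)

# Toy sample: 21 sine curves with varying amplitude plus one curve shifted up by 10
t = np.linspace(0.0, 1.0, 101)
amplitudes = 1.0 + np.linspace(-0.2, 0.2, 21)
regular = amplitudes[:, None] * np.sin(2 * np.pi * t)
shifted = np.sin(2 * np.pi * t) + 10.0
curves = np.vstack([regular, shifted])

features = six_dim_features(curves, t)
flags = flag_outliers(features)
print(features.shape)         # (22, 6)
print(np.flatnonzero(flags))  # indices of flagged curves
```

The shifted curve has the same derivatives as an ordinary member of the sample, so in this toy setting it is the level-based ABEI/ABHI pair that separates it. A robust Mahalanobis-type detector would use the joint structure of the six features rather than one column at a time.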
major comments (1)
- [Section describing EHyOut construction and derivative feature extraction] The central claim that the six-dimensional feature vector (ABEI/ABHI on the curve plus first and second derivatives) reliably flags both magnitude and shape outliers depends on the second-derivative indices. In discretely observed data these derivatives are obtained only after smoothing, yet the manuscript reports no sensitivity analysis to bandwidth choice or to additive noise levels. If the smoothing parameter is misspecified, the area indices on the second derivative can be dominated by estimation artifacts rather than genuine curvature deviations, undermining the claim that the procedure simultaneously captures magnitude and shape outliers.
minor comments (1)
- [Abstract] The abstract states that 'extensive simulations show stability across a wide range of contamination settings' but does not enumerate the specific contamination fractions, sample sizes, or performance metrics (e.g., true-positive rate, false-positive rate) used to support that statement.
Simulated Author's Rebuttal
We thank the referee for the positive summary and for identifying a key robustness issue in the EHyOut construction. We address the single major comment below and agree that additional analysis is warranted to support the claims about derivative-based features.
Point-by-point responses
Referee: The central claim that the six-dimensional feature vector (ABEI/ABHI on the curve plus first and second derivatives) reliably flags both magnitude and shape outliers depends on the second-derivative indices. In discretely observed data these derivatives are obtained only after smoothing, yet the manuscript reports no sensitivity analysis to bandwidth choice or to additive noise levels. If the smoothing parameter is misspecified, the area indices on the second derivative can be dominated by estimation artifacts rather than genuine curvature deviations, undermining the claim that the procedure simultaneously captures magnitude and shape outliers.
Authors: We agree that the reliability of the second-derivative indices in the six-dimensional feature vector hinges on the quality of the smoothing step, and that an explicit sensitivity study to bandwidth and noise level was omitted from the original manuscript. Our simulations used standard smoothing procedures and produced stable results across contamination settings, but this does not substitute for a targeted robustness check. In the revision we will add a new subsection (and corresponding figures) that varies the smoothing bandwidth over a grid of values and adds controlled noise levels to the observed curves. Preliminary checks already indicate that outlier-detection performance remains largely unchanged for bandwidths within a factor of two of the default choice and for moderate noise; these results will be reported to substantiate that the procedure is not driven by estimation artifacts. revision: yes
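A sensitivity check of the kind promised in the response could be organized along these lines. The sketch is hypothetical throughout: a boxcar moving average stands in for the smoother, its odd window length plays the role of the bandwidth, Gaussian noise is added at chosen levels, and only ABEI on the second derivative is tracked. None of the names or settings come from the paper.

```python
import numpy as np

def moving_average(curves, window):
    """Boxcar smoother with edge padding; window must be odd."""
    if window <= 1:
        return curves.copy()
    pad = window // 2
    padded = np.pad(curves, ((0, 0), (pad, pad)), mode="edge")
    kernel = np.ones(window) / window
    return np.array([np.convolve(row, kernel, mode="valid") for row in padded])

def second_derivative(curves, t):
    """Second derivative by applying finite differences twice."""
    return np.gradient(np.gradient(curves, t, axis=1), t, axis=1)

def abei(curves, t):
    """Unnormalized ABEI: summed areas where the other curves lie above each curve."""
    diff = np.clip(curves[None, :, :] - curves[:, None, :], 0.0, None)
    areas = np.sum((diff[..., 1:] + diff[..., :-1]) * np.diff(t) / 2.0, axis=-1)
    return areas.sum(axis=1)

def sensitivity_grid(clean, t, windows, noise_levels, seed=0):
    """Map (window, sigma) to the ABEI of the smoothed curves' second derivatives."""
    rng = np.random.default_rng(seed)
    results = {}
    for sigma in noise_levels:
        noisy = clean + rng.normal(0.0, sigma, size=clean.shape)
        for window in windows:
            d2 = second_derivative(moving_average(noisy, window), t)
            results[(window, sigma)] = abei(d2, t)
    return results

t = np.linspace(0.0, 1.0, 201)
amplitudes = 1.0 + np.linspace(-0.2, 0.2, 10)
clean = amplitudes[:, None] * np.sin(2 * np.pi * t)
grid = sensitivity_grid(clean, t, windows=[1, 5, 21], noise_levels=[0.05])
for (window, sigma), values in grid.items():
    print(window, sigma, values.mean())
```

In this toy setting the average second-derivative ABEI shrinks sharply as the window widens, because with little smoothing the index mostly measures differentiated noise. That is the artifact the referee warns about, and comparing detection results across such a grid is one way to show whether conclusions depend on the smoothing choice.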
Circularity Check
No circularity: new indices defined directly from areas; outlier procedure uses off-the-shelf methods
Full rationale
The paper's central construction defines ABEI and ABHI explicitly as area-based extensions of the existing MEI/MHI indices, then computes these six scalars (on the curve plus first and second derivatives) and feeds them into standard multivariate outlier detectors. No step reduces a claimed result to a fitted parameter or self-citation by construction; the performance claims rest on external simulations and real-data applications rather than tautological re-use of the same quantities. The derivation chain is therefore self-contained against the benchmarks it invokes.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Curves are sufficiently smooth to admit first and second derivatives that can be meaningfully computed and compared.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem.
  ABEI_n(x) = sum_i ∫_I (x_i(t) - x(t))_+ dt and ABHI_n(x) = sum_i ∫_I (x(t) - x_i(t))_+ dt; EHyOut applies COM to the resulting 6-D feature vectors from curves + first + second derivatives.
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem.
  The methodology recasts functional outlier detection as a multivariate problem using ABEI/ABHI on smoothed data and derivatives.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.