Two-stage Ensemble Clustering of Functional Data Using Random Projections

Anirvan Chakraborty; Shyamal K. De; Sourav Chakrabarty

arxiv: 2605.22110 · v1 · pith:N5DUVK2Znew · submitted 2026-05-21 · 📊 stat.ME

Pith reviewed 2026-05-22 04:09 UTC · model grok-4.3

classification 📊 stat.ME

keywords functional data · clustering · random projections · Gaussian processes · ensemble methods · MADD dissimilarity · two-stage algorithm

The pith

A two-stage method using Gaussian-process random projections clusters functional data and, in the paper's simulations and real-data examples, is more accurate than many state-of-the-art approaches.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a clustering technique for functional data that first projects each curve onto many independent Gaussian process realizations and clusters the resulting high-dimensional vectors using a dissimilarity measure called MADD (Mean Absolute Difference of Distances). This initial partition then informs the estimation of a covariance operator, which is used to generate data-driven random projections in a second stage for refined clustering. Finally, a normalized cost function selects the best clustering among the candidate solutions. The approach handles irregular and partially observed data, and the paper's simulations and real-data applications report that it outperforms many state-of-the-art methods across a range of functional data settings.
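As a rough illustration of the projection step described above, the sketch below projects discretized curves onto independent Gaussian process sample paths. The squared-exponential kernel, the common uniform grid, the Riemann-sum inner product, and the names `se_kernel` and `gp_projections` are all assumptions made for this sketch; the paper's own projection families and its treatment of irregular grids are not reproduced here.

```python
import numpy as np

def se_kernel(grid, length_scale=0.2):
    """Squared-exponential covariance on a 1-D grid (an assumed kernel choice)."""
    diff = grid[:, None] - grid[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_projections(curves, grid, n_proj=500, length_scale=0.2, seed=0):
    """Project each curve onto n_proj independent zero-mean GP sample paths.

    curves: (n, m) array of n curves observed on a common uniform grid of m points.
    Returns an (n, n_proj) array of approximate L2 inner products.
    """
    rng = np.random.default_rng(seed)
    cov = se_kernel(grid, length_scale)
    # Sample paths through the eigendecomposition; clipping guards against
    # tiny negative eigenvalues caused by round-off.
    vals, vecs = np.linalg.eigh(cov)
    vals = np.clip(vals, 0.0, None)
    noise = rng.standard_normal((grid.size, n_proj))
    paths = vecs @ (np.sqrt(vals)[:, None] * noise)  # one GP path per column
    dt = grid[1] - grid[0]
    return curves @ paths * dt  # Riemann-sum approximation of the integrals

grid = np.linspace(0.0, 1.0, 50)
curves = np.vstack([np.sin(2 * np.pi * grid), np.cos(2 * np.pi * grid), grid ** 2])
Z = gp_projections(curves, grid)
print(Z.shape)  # (3, 500)
```

Each row of `Z` is the high-dimensional representation of one curve; a dissimilarity such as MADD would then be computed between rows.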

Core claim

The central discovery is that an ensemble of random projections from Gaussian processes can capture differences between functional populations at a population level, and that refining these projections in a second stage using labels from the first stage leads to improved clustering performance across a range of settings.

What carries the argument

The two-stage clustering procedure that employs prespecified Gaussian random projections and the MADD dissimilarity for initial grouping, followed by covariance-driven projections for refinement.

If this is right

The method applies to irregular and partially observed functional data without special adjustments.
Extensive simulations demonstrate superior accuracy compared to many existing clustering techniques.
Real-data applications confirm the method's practical effectiveness.
Population-level analysis explains why the random projections distinguish distributional differences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the first stage provides reasonable starting labels, the second stage can significantly improve separation by focusing projections on data-specific structures.
This framework might generalize to clustering other high-dimensional objects by adapting the projection families.
Selecting the optimal clustering via the normalized cost function could be tested on larger datasets to assess scalability.

Load-bearing premise

The first stage clustering with fixed projection families produces initial labels accurate enough that the covariance estimated from them improves the separation achieved by the second stage projections.

What would settle it

A simulation study in which the first-stage clusters are forced to be random or incorrect, followed by checking whether the second stage still yields better final clustering than a single-stage approach.
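One hypothetical way to set up such a study is to corrupt known labels at a controlled rate before they reach the second stage, then score the final partition against the truth. The helpers below are a sketch under that assumption: `corrupt_labels` and `rand_index` are illustrative names, labels are assumed to be integers `0..n_clusters-1`, and the Stage II step itself is left out.

```python
import numpy as np
from itertools import combinations

def corrupt_labels(labels, rate, n_clusters, seed=0):
    """Reassign a fraction `rate` of labels to a different cluster at random."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    n_flip = int(round(rate * labels.size))
    idx = rng.choice(labels.size, size=n_flip, replace=False)
    # A nonzero offset modulo n_clusters guarantees every chosen label changes.
    offsets = rng.integers(1, n_clusters, size=n_flip)
    labels[idx] = (labels[idx] + offsets) % n_clusters
    return labels

def rand_index(a, b):
    """Share of point pairs on which two partitions agree (Rand index)."""
    a, b = np.asarray(a), np.asarray(b)
    pairs = list(combinations(range(a.size), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

true = np.repeat([0, 1], 10)
noisy = corrupt_labels(true, rate=0.2, n_clusters=2, seed=1)
print(int((noisy != true).sum()))  # 4 labels changed
print(rand_index(true, noisy))
```

Sweeping `rate` from 0 to 0.5 and feeding `noisy` into the second stage would show at what Stage I error level the refinement stops helping.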

Figures

Figures reproduced from arXiv: 2605.22110 by Anirvan Chakraborty, Shyamal K. De, Sourav Chakrabarty.

**Figure 2.** Plots of Berkeley and Medfly datasets showing the different clusters.

**Figure 3.** Plots of Meatspectrum and Flours datasets (manually fragmented).

Original abstract

We propose a computationally simple framework for clustering functional data based on Gaussian-process-generated random projections. In this approach, each curve is first projected onto a large collection of independent Gaussian process realizations. The resulting high-dimensional representations are clustered using the Mean Absolute Difference of Distances (MADD), a dissimilarity measure well suited for high-dimensional settings. A population-level analysis of this dissimilarity provides insight into how random projections help capture distributional differences between functional populations. We introduce a second stage of clustering to additionally leverage on data-driven projection directions. Thus, in Stage I, an initial clustering is obtained using a set of prespecified projection families. In Stage II, this partition is refined by constructing Gaussian random projections based on an estimated covariance operator that uses the first stage of cluster labels. Finally, a normalized cost function is used to select the optimal clustering among candidate solutions. The proposed clustering algorithm is broadly applicable to diverse functional data regimes including irregular and partially observed data. Through extensive simulations and real-data applications, we show that the proposed method achieves a high degree of accuracy and outperforms many of the state-of-the-art methods across a wide range of functional data settings.
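To make the dissimilarity named in the abstract concrete, here is a minimal sketch of MADD as it is usually defined in the high-dimension, low-sample-size literature: the dissimilarity between points i and j is the average, over every other point k, of the absolute difference between their scaled distances to k. The `1/sqrt(d)` scaling and the function name `madd_matrix` are assumptions; the paper may normalize differently.

```python
import numpy as np

def madd_matrix(Z):
    """Pairwise MADD between the rows of Z (n points in d dimensions, n >= 3)."""
    n, d = Z.shape
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    D = np.sqrt(np.maximum(d2, 0.0) / d)  # Euclidean distances scaled by sqrt(d)
    rho = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            others = np.ones(n, dtype=bool)
            others[[i, j]] = False  # average over third points k != i, j
            rho[i, j] = rho[j, i] = np.mean(np.abs(D[i, others] - D[j, others]))
    return rho

rng = np.random.default_rng(0)
# Two well-separated groups of four points in 20 dimensions.
Z = np.vstack([rng.normal(0.0, 1.0, (4, 20)), rng.normal(10.0, 1.0, (4, 20))])
rho = madd_matrix(Z)
print(rho[:4, :4].max() < rho[:4, 4:].min())  # True
```

The resulting matrix can be passed to any clustering routine that accepts precomputed dissimilarities; which routine the paper uses is not stated in the material above.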

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The two-stage method refines initial MADD clusters from fixed GP projections by estimating covariance from those labels for better directions, which is a workable practical idea but rests on the first stage being accurate enough.

This paper describes a two-stage clustering procedure for functional data. Stage I projects curves onto prespecified Gaussian process realizations and clusters the resulting high-dimensional vectors with the MADD dissimilarity. Stage II uses the resulting labels to estimate a covariance operator, builds data-driven projections from it, and refines the partition; a normalized cost function then picks the best candidate. The population-level analysis of MADD is included to show how the projections capture group differences.

This specific label-informed refinement step combined with the cost selection does not appear in the cited prior work, so that combination is the new element. It handles irregular and partially observed curves without extra alignment steps, which is a real plus for applied work. The approach is straightforward enough that someone could implement and test it on their own data.

The soft spot is the dependence on Stage I labels. If the prespecified projection families miss key structure in messy data, the covariance estimate will be off and Stage II may not improve separation or could make it worse. The stress-test note is on target here, and without explicit sensitivity checks or fallback options the robustness claim stays moderate. The outperformance over state-of-the-art methods is asserted from simulations and real-data examples, but the abstract leaves the exact baselines, run-to-run variability, and implementation details unclear, so those numbers need closer inspection.

This is aimed at statisticians and applied researchers who cluster longitudinal or functional observations in fields like biology or economics. A reader looking for a usable algorithm for sparse curves would get something concrete to try. It deserves peer review because the algorithmic framework is clear, the scope is broad, and the core idea is grounded enough to warrant referee time even if the experiments need tightening.
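The Stage II idea in the letter, estimating a covariance from Stage I labels and drawing projections from it, can be sketched as follows. This is an illustration under stated assumptions, not the authors' estimator: it uses a pooled within-cluster sample covariance on a common uniform grid, and `pooled_within_cov` and `data_driven_projections` are hypothetical names.

```python
import numpy as np

def pooled_within_cov(curves, labels):
    """Pooled within-cluster covariance of discretized curves (assumed estimator)."""
    curves = np.asarray(curves, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    resid = curves.copy()
    for g in groups:
        # Center each curve by the mean of its Stage I cluster.
        resid[labels == g] -= curves[labels == g].mean(axis=0)
    dof = max(curves.shape[0] - groups.size, 1)
    return resid.T @ resid / dof

def data_driven_projections(curves, labels, grid, n_proj=500, seed=0):
    """Project curves onto zero-mean Gaussian paths drawn from the estimated covariance."""
    rng = np.random.default_rng(seed)
    cov = pooled_within_cov(curves, labels)
    vals, vecs = np.linalg.eigh(cov)
    vals = np.clip(vals, 0.0, None)  # the estimate is low-rank; clip round-off
    noise = rng.standard_normal((grid.size, n_proj))
    paths = vecs @ (np.sqrt(vals)[:, None] * noise)
    dt = grid[1] - grid[0]
    return np.asarray(curves, dtype=float) @ paths * dt

grid = np.linspace(0.0, 1.0, 30)
rng = np.random.default_rng(0)
group_a = np.sin(2 * np.pi * grid) + 0.1 * rng.standard_normal((5, 30))
group_b = 3.0 + np.cos(2 * np.pi * grid) + 0.1 * rng.standard_normal((5, 30))
curves = np.vstack([group_a, group_b])
labels = np.repeat([0, 1], 5)  # stand-in for Stage I output
Z2 = data_driven_projections(curves, labels, grid)
print(Z2.shape)  # (10, 500)
```

Because the centering depends on `labels`, mislabelled curves leak between-cluster mean differences into the covariance estimate, which is the sensitivity the letter flags.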

Referee Report

2 major / 2 minor

Summary. The paper proposes a two-stage clustering framework for functional data. Stage I projects each curve onto realizations from prespecified Gaussian process families and applies the Mean Absolute Difference of Distances (MADD) dissimilarity for initial clustering. Stage II refines the partition by estimating a covariance operator from the Stage I labels to produce data-driven projections. A normalized cost function selects the final clustering. The method is presented as applicable to irregular and partially observed data, with claims of high accuracy and outperformance of state-of-the-art methods based on extensive simulations and real-data examples. A population-level analysis of MADD is used to justify the projection approach.

Significance. If the two-stage refinement reliably improves upon Stage I without excessive sensitivity to label errors, the framework would offer a practical, computationally simple tool for functional data clustering that handles missingness and high dimensionality better than many existing approaches. The explicit use of both fixed and adaptive random projections, together with the MADD analysis, provides a clear algorithmic contribution that could be adopted in applied settings where functional observations are incomplete.

major comments (2)

[Stage II] Section on Stage II and population-level analysis of MADD: The central accuracy claim rests on Stage II improving separation by using labels from Stage I to estimate the covariance operator. However, no sensitivity analysis, breakdown-point bounds, or simulations with controlled Stage I error rates are provided to show when this refinement yields gains versus degradation; this is load-bearing because the method is explicitly iterative and the population analysis assumes sufficiently accurate initial labels.
[Simulations and real-data applications] Simulations and real-data sections: The abstract and description assert outperformance across a wide range of settings, yet the visible material lacks explicit reporting of baseline implementations, number of Monte Carlo replications, error-bar statistics, or exact parameter choices for the competing methods; without these, the quantitative support for the superiority claim cannot be fully evaluated.

minor comments (2)

[Methods] The notation and definition of the normalized cost function used for final selection could be stated more explicitly, including its dependence on the projection dimension and number of clusters.
[Figures] Figure captions for the simulation results should include the exact functional forms and noise levels used in each scenario to improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the detailed and constructive feedback on our paper. The comments highlight important aspects that will enhance the presentation and validation of our two-stage clustering method. We respond to each major comment below and outline the planned revisions.

Referee: [Stage II] Section on Stage II and population-level analysis of MADD: The central accuracy claim rests on Stage II improving separation by using labels from Stage I to estimate the covariance operator. However, no sensitivity analysis, breakdown-point bounds, or simulations with controlled Stage I error rates are provided to show when this refinement yields gains versus degradation; this is load-bearing because the method is explicitly iterative and the population analysis assumes sufficiently accurate initial labels.

Authors: We agree that demonstrating the robustness of Stage II to potential errors in the initial clustering is crucial for the method's reliability. While the population-level analysis of MADD assumes sufficiently accurate labels to justify the projection approach, we will strengthen the manuscript by adding a sensitivity analysis. This will include simulations with controlled misclassification rates in Stage I labels and an examination of when the refinement step improves or degrades performance. We will also include a brief discussion of the breakdown behavior in the revised text. revision: yes
Referee: [Simulations and real-data applications] Simulations and real-data sections: The abstract and description assert outperformance across a wide range of settings, yet the visible material lacks explicit reporting of baseline implementations, number of Monte Carlo replications, error-bar statistics, or exact parameter choices for the competing methods; without these, the quantitative support for the superiority claim cannot be fully evaluated.

Authors: We acknowledge that these details are essential for reproducibility and evaluation. In the revised manuscript, we will explicitly report the number of Monte Carlo replications performed, include error bars or standard error statistics in the simulation results, provide descriptions or references for the baseline implementations, and specify the exact parameter choices and settings used for the competing methods. revision: yes
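The reporting promised in the second response amounts to summarizing a per-replication score, for example a Rand index, by its mean and standard error. A minimal sketch, where `summarize_replications` is a hypothetical helper and the three scores are made-up numbers used only to show the arithmetic:

```python
import numpy as np

def summarize_replications(scores):
    """Return the mean and standard error of scores across Monte Carlo replications."""
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean()
    # Sample standard deviation (ddof=1) divided by sqrt(number of replications).
    se = scores.std(ddof=1) / np.sqrt(scores.size)
    return mean, se

# Illustrative values only, not results from the paper.
mean, se = summarize_replications([0.8, 0.9, 1.0])
print(f"{mean:.3f} +/- {se:.3f}")  # 0.900 +/- 0.058
```

Reporting the number of replications alongside these two numbers lets a reader judge whether differences between methods exceed run-to-run variability.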

Circularity Check

0 steps flagged

No circularity: explicit two-stage algorithm with external empirical validation

The paper presents an algorithmic procedure rather than a mathematical derivation of a target quantity. Stage I applies prespecified GP projections and MADD dissimilarity to obtain initial labels; Stage II then estimates the covariance operator from those labels to generate data-driven projections. This dependence is an explicit iterative design choice, not a self-definitional loop or a fitted parameter renamed as a prediction. Population-level analysis of MADD is invoked to motivate why projections capture distributional differences, but the analysis is presented as interpretive support rather than a load-bearing uniqueness theorem. All performance claims rest on simulations and real-data applications that serve as external benchmarks, not on internal reduction to the method's own inputs. No self-citation chains, ansatz smuggling, or renaming of known results appear in the provided description.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard functional data analysis assumptions plus the suitability of Gaussian processes for generating useful random projections; no new free parameters or invented entities are introduced beyond the algorithmic choices.

axioms (2)

domain assumption: Gaussian process realizations provide sufficiently rich random directions to capture distributional differences between functional populations.
Invoked in the description of Stage I projections and the population-level analysis of MADD.
domain assumption: Initial cluster labels from prespecified projections are accurate enough to yield a useful covariance estimate for Stage II refinement.
Central to the justification of the two-stage procedure.

pith-pipeline@v0.9.0 · 5733 in / 1458 out tokens · 45522 ms · 2026-05-22T04:09:30.820970+00:00 · methodology

