Gaussian mixture models in Hilbert spaces via kernel methods

Antonio Álvarez-López; Daniel López-Montero; Marcos Matabuena

arXiv: 2605.05996 · v1 · submitted 2026-05-07 · stat.ML · cs.LG

Pith reviewed 2026-05-08 05:12 UTC · model grok-4.3

classification stat.ML · cs.LG

keywords Gaussian mixture models · kernel mean embeddings · Hilbert spaces · functional data · clustering · infinite-dimensional data · optimization · approximation theory

The pith

Gaussian mixture models can be defined for Hilbert-space-valued data through kernel mean embeddings, and the resulting mixtures form a dense class of approximations to probability measures on the space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces a Gaussian mixture model tailored for random objects taking values in Hilbert spaces, such as time series of functions or graph data. By relying on kernel mean embeddings, the components of the mixture can be represented without needing explicit densities, which are often problematic in infinite dimensions. The authors provide optimization procedures for estimating the model parameters from data and prove that these procedures are well-defined. They further show that such mixtures can approximate any probability measure on the Hilbert space arbitrarily well. This is relevant for clustering applications in fields where data naturally live in high- or infinite-dimensional spaces, like medical functional data or network structures.

Core claim

The central contribution is a kernel-based Gaussian mixture model for Hilbert-space valued data. The model uses kernel mean embeddings to define mixture components, allowing for efficient optimization of the parameters. Theoretical analysis confirms the algorithm's validity and demonstrates that the class of such models is dense in the space of all probability measures on the Hilbert space.
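To make the estimation idea concrete, here is a minimal finite-dimensional sketch, not the authors' algorithm. It assumes a Gaussian kernel with bandwidth `ell` and data in R^d (for instance, truncated basis coefficients of the Hilbert-space observations). Under those assumptions the squared MMD between a Gaussian mixture and an empirical sample has a closed form, so no density is ever evaluated. The function names `gauss_cross_term` and `mmd2_mixture_vs_sample` are hypothetical.

```python
import numpy as np

def gauss_cross_term(m1, S1, m2, S2, ell):
    """E[k(X, Y)] for independent X ~ N(m1, S1), Y ~ N(m2, S2) and the
    Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 ell^2)).
    A zero covariance matrix represents a point mass."""
    d = m1.shape[0]
    S = S1 + S2                       # covariance of X - Y
    diff = m1 - m2                    # mean of X - Y
    det = np.linalg.det(np.eye(d) + S / ell**2)
    quad = diff @ np.linalg.solve(S + ell**2 * np.eye(d), diff)
    return det ** -0.5 * np.exp(-0.5 * quad)

def mmd2_mixture_vs_sample(weights, means, covs, X, ell=1.0):
    """Squared MMD between sum_k weights[k] * N(means[k], covs[k])
    and the empirical measure of the rows of X."""
    K = len(weights)
    n, d = X.shape
    zero = np.zeros((d, d))
    # Model-model term: expectations between pairs of components.
    mm = sum(weights[j] * weights[k]
             * gauss_cross_term(means[j], covs[j], means[k], covs[k], ell)
             for j in range(K) for k in range(K))
    # Model-data term: each component against each data point.
    md = sum(weights[k]
             * np.mean([gauss_cross_term(means[k], covs[k], x, zero, ell) for x in X])
             for k in range(K))
    # Data-data term (biased V-statistic, diagonal included).
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    dd = np.mean(np.exp(-sq / (2 * ell**2)))
    return mm - 2 * md + dd
```

Minimizing this quantity over weights, means, and covariances with a gradient-based optimizer would be one way to fit the mixture. The paper's actual objective and parameterization may differ.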

What carries the argument

Kernel mean embeddings of Gaussian mixture components, which map the mixtures into a reproducing kernel Hilbert space to enable computation and approximation in the original infinite-dimensional space.
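For reference, the standard definitions behind this machinery can be written as follows. The notation is an assumption for illustration: a bounded measurable kernel k on the Hilbert space 𝒳 with reproducing kernel Hilbert space ℋ_k. The paper may use different symbols.

```latex
% Kernel mean embedding of a probability measure P and the induced distance.
\mu_P \;=\; \int_{\mathcal{X}} k(\cdot, x)\,\mathrm{d}P(x) \;\in\; \mathcal{H}_k,
\qquad
\mathrm{MMD}_k(P, Q) \;=\; \lVert \mu_P - \mu_Q \rVert_{\mathcal{H}_k},
\\[4pt]
% Expanded form, with X, X' ~ P and Y, Y' ~ Q all independent.
\mathrm{MMD}_k^2(P, Q)
  \;=\; \mathbb{E}\,k(X, X') \;-\; 2\,\mathbb{E}\,k(X, Y) \;+\; \mathbb{E}\,k(Y, Y').
```

The kernel is called characteristic when the map P ↦ μ_P is injective, which is exactly the condition under which MMD_k is a metric on probability measures rather than a pseudometric.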

If this is right

Mixtures defined this way can densely approximate arbitrary probability distributions on Hilbert spaces.
The optimization algorithms provide practical estimation for clustering infinite-dimensional observations.
The approach applies directly to L² functional data and to random graphs represented in Laplacian spaces.
Theoretical guarantees ensure the model remains well-defined even when dimensions are infinite.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

One could test whether this embedding approach outperforms standard dimensionality reduction techniques in clustering accuracy for functional datasets.
The density result suggests potential use in nonparametric density estimation tasks within Hilbert spaces.
Extensions to time-series dependencies or other kernel choices might broaden applicability to dynamic data without additional assumptions.

Load-bearing premise

That the kernel mean embeddings preserve enough structure from the Gaussian components in the Hilbert space to allow both practical optimization and dense approximation of measures.

What would settle it

Observing a specific probability measure on a Hilbert space, such as a non-Gaussian distribution with certain smoothness properties, that cannot be approximated closer than some positive distance by any finite Gaussian mixture under the kernel embedding.
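Stated with quantifiers, and assuming the distance in question is the MMD induced by the kernel, the density claim and the counterexample that would refute it read as below. The symbol 𝒢_K is hypothetical notation for mixtures with at most K Gaussian components on 𝒳.

```latex
% Density claim: every measure is approximated to any tolerance by some finite mixture.
\forall P \in \mathcal{P}(\mathcal{X}),\ \forall \varepsilon > 0,\
\exists K \in \mathbb{N},\ \exists Q \in \mathcal{G}_K :\quad
\mathrm{MMD}_k(P, Q) < \varepsilon .
\\[6pt]
% Refutation: one measure that stays a fixed distance from all finite mixtures.
\exists P \in \mathcal{P}(\mathcal{X}),\ \exists \varepsilon > 0 :\quad
\inf_{K \in \mathbb{N}}\ \inf_{Q \in \mathcal{G}_K} \mathrm{MMD}_k(P, Q) \;\ge\; \varepsilon .
```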

Figures

Figures reproduced from arXiv: 2605.05996 by Antonio Álvarez-López, Daniel López-Montero, Marcos Matabuena.

**Figure 1.** Representative structured data samples. Left: Hourly continuous glucose monitoring paths. Center: Signals on a fixed graph with varying node intensities. Right: Correlation matrices. Motivation. The motivation for this work is to perform clustering of dynamic, time-varying functional objects [9, 10] that take values in a possibly infinite-dimensional space X

**Figure 2.** Numerical sensitivity of the MMD objective to various hyperparameters and sample sizes.

**Figure 3.** Temporal glucose mixture in H1. Left: Empirical (top) and model-predicted (bottom) mean glucose surfaces. Top right: Global cluster weights π_k(t) (solid lines) with group-level posteriors for control γ̄_ctrl(t) (dotted) and treatment γ̄_treat(t) (dashed). Bottom right: Learned means. We parameterize π(t) = softmax z(t), where the logit path z: [0, T] → ℝ^K is the solution to a neural ODE [67], initialize…

**Figure 4.** Correlation-based temporal mixture in Sym(24). Left: MMD² training loss (top) and cluster weights π_k(t) (bottom). Right: Learned mean correlation matrices.

**Figure 5.** Individual similarity graph temporal mixture.

**Figure 6.** ℝ^d Gaussian mixture recovery. Left: MMD² training loss. Right: Empirical histogram of the samples overlaid with the true (blue) and learned (dashed coral) densities. Applicability. The measure-theoretic components of our framework, including Gaussian mixtures, Radon–Nikodym responsibilities, and MMD weak density arguments, extend naturally to separable Banach spaces. While orthogonal projections are unavail…

**Figure 7.** L²(0, 1; ℝ²) mixture with K = 5. (a) Raw trajectories (dimension 1) overlaid with true means. (b) True (dashed) versus predicted (solid) mean functions. (c) True versus predicted mixture weights. (d) MMD² training loss. mixture of Gaussian processes [12, 38, 71]. Experiment. We generate multivariate functional data in X = L²(0, 1; ℝ²) from K = 5 Gaussian components with weights π = (0.30, 0.25, 0.2…

**Figure 8.** L²([0, 1]²) mixture with K = 3. Left columns: True mean surfaces m_k(s, t) (top row) and predicted surfaces (bottom row), aligned by color. Right column: MMD² training loss (top) and true versus predicted mixture weights (bottom).

**Figure 9.** L²(SO(3)) mixture with K = 3. Left: Data projected on S² with true and learned component directions. Middle: MMD² training loss. Right: True versus predicted mixture weights. F.5 Graph signals. We next test the method on graph-structured data, where the Hilbert-space geometry is induced by the graph Laplacian. Let G = (V, E) be a finite weighted graph with Laplacian L = D − W. For α > 0, we equip ℝ^|V| …

**Figure 10.** Graph-signal mixture with K = 3. Left block: True (top) versus predicted (bottom) mean signals per component, plotted on the shared Erdős–Rényi graph using a common colormap. Right column: MMD² training loss (top) and true versus predicted weights (bottom).

**Figure 11.** Linear SDE system identification via MMD.

**Figure 12.** Representative QM9 molecules per component. Columns correspond to learned components

**Figure 13.** Most representative NTU skeleton sequences per component. Columns correspond to learned

Original abstract

Modern datasets across many disciplines increasingly consist of time-evolving, potentially infinite-dimensional random objects, such as dynamic functional data, which are naturally modeled in Hilbert spaces. In these settings, characterizing probability measures, for example, through densities, can be ill-defined or technically challenging. Motivated by clustering applications, we propose a Gaussian mixture framework for Hilbert-space-valued data based on kernel mean embeddings and develop efficient optimization algorithms for estimation. We establish theoretical guarantees showing that the proposed algorithm is well defined and that the model yields a dense class of approximations in infinite-dimensional spaces. We evaluate the framework through extensive experiments on diverse structures and data geometries, including $L^2$-functional data and random graphs in Laplacian spaces arising in modern medical applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Kernel GMMs via mean embeddings give a workable clustering route for functional and graph data in Hilbert spaces, but the density guarantee needs explicit conditions on the kernel and the Gaussian family that the abstract does not supply.

The paper sets up Gaussian mixture models for Hilbert-space objects by replacing direct densities with kernel mean embeddings. This produces an estimation procedure that stays well-defined even when the data are infinite-dimensional, such as L² curves or Laplacian representations of graphs. The authors supply optimization routines and run them on medical-style examples, which is the practical payoff for people who already work with functional or network data. That combination is the actual new piece; prior kernel GMM work stayed in Euclidean settings, and the extension here is framed for the infinite-dimensional case with a density claim attached. The experiments target the right data geometries, so the application side is pointed in a useful direction.

The soft spot is the theoretical core. The claim that the mixtures are dense in the embedded measure space requires both that the kernel is characteristic and that finite Gaussian mixtures can approximate arbitrary measures in the induced weak topology. The abstract states the result without showing these steps or ruling out counter-examples, and the stress-test concern about the missing bridge between those two properties still stands on the given text. No derivations, error bounds, or explicit assumptions appear in the summary, which leaves the guarantee at the level of assertion rather than demonstrated fact. Experiments are described as extensive but without numbers, baselines, or variability measures, so the practical improvement is not yet visible.

This is for readers already doing kernel methods or functional data analysis who need a clustering tool that respects the Hilbert structure. It is worth a serious referee because the target application area is active and the algorithmic framing is concrete; referees can ask for the missing conditions on the kernel and the approximation argument, plus proper quantitative results. I would not cite it yet, but I would send it out for review rather than desk-reject.

Referee Report

2 major / 1 minor

Summary. The paper proposes a Gaussian mixture model framework for Hilbert-space-valued data (e.g., functional data or graph Laplacians) that replaces direct density modeling with kernel mean embeddings of Gaussian components. It develops associated optimization algorithms for parameter estimation and asserts two main theoretical results: (i) the algorithm is well-defined, and (ii) the resulting model class is dense in the space of probability measures on the Hilbert space. The claims are supported by experiments on L²-functional data and random graphs arising in medical applications.
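When closed-form embeddings are not available, the same discrepancy can be estimated from two samples. The sketch below is an illustration under stated assumptions and is not taken from the paper: curves are observed on a uniform grid over [0, 1], the L² distance is approximated by a Riemann sum, and the kernel is Gaussian in that distance. The names `l2_sq_dists` and `mmd2_curves` and the toy data are hypothetical.

```python
import numpy as np

def l2_sq_dists(A, B, dt):
    """Pairwise squared L2([0, 1]) distances between curves stored as rows,
    approximated by a Riemann sum with grid spacing dt."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sum(diff ** 2, axis=-1) * dt

def mmd2_curves(A, B, dt, ell=1.0):
    """Biased (V-statistic) estimate of MMD^2 between two samples of curves,
    using k(f, g) = exp(-||f - g||_{L2}^2 / (2 ell^2))."""
    k = lambda U, V: np.exp(-l2_sq_dists(U, V, dt) / (2 * ell ** 2))
    return k(A, A).mean() - 2 * k(A, B).mean() + k(B, B).mean()

# Toy data: two groups of noisy curves on a shared grid (illustrative only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
dt = t[1] - t[0]
A = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((30, t.size))
B = np.cos(2 * np.pi * t) + 0.1 * rng.standard_normal((30, t.size))

same = mmd2_curves(A, A, dt)       # a sample compared with itself
different = mmd2_curves(A, B, dt)  # samples with different mean curves
```

The estimate vanishes when a sample is compared with itself and is clearly positive for the two groups with different mean functions, which is the behavior a clustering objective built on it relies on.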

Significance. If the density and well-definedness guarantees can be established rigorously, the work would supply a practical, density-free route to clustering and approximation of measures on infinite-dimensional spaces where classical densities are unavailable. The kernel-embedding approach naturally accommodates the cited data geometries and could extend existing GMM methodology beyond Euclidean settings. The experiments on medical graph data hint at downstream utility, but the absence of quantitative metrics, baselines, or error bars in the abstract leaves the practical significance difficult to gauge.

major comments (2)

[Abstract] Abstract: The central claim that 'the model yields a dense class of approximations in infinite-dimensional spaces' is load-bearing for the theoretical contribution. This requires that finite mixtures of kernel mean embeddings of Gaussians are dense in the image of the embedding map over all probability measures. The manuscript appears to rely on kernel universality without an explicit argument that the restricted Gaussian-component family is dense in the weak topology induced by the RKHS; the skeptic correctly flags that injectivity of the embedding (characteristic kernel) does not automatically imply the Gaussian restriction is dense. A concrete counter-example exclusion or additional regularity condition on the kernel and the Gaussian family is needed.
[Abstract] Abstract and theoretical sections: No derivation, proof sketch, or error analysis is supplied for either the well-definedness of the optimization algorithm or the density guarantee, despite the abstract asserting 'theoretical guarantees.' Without these, it is impossible to verify that the algorithm avoids degeneracy or that the approximation error can be controlled uniformly over the Hilbert space.
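For orientation, one standard route to a density statement of this kind is sketched below. It is not the paper's proof, and its assumptions are supplied here for illustration: 𝒳 is a separable Hilbert space, k is bounded and continuous on 𝒳 × 𝒳, and C is a fixed trace-class covariance operator.

```latex
% Step 1: discrete measures are weakly dense.
\textbf{Step 1.}\ Finitely supported measures $\sum_{i=1}^{n} w_i\,\delta_{x_i}$
are weakly dense in $\mathcal{P}(\mathcal{X})$.

% Step 2: shrinking Gaussians recover point masses.
\textbf{Step 2.}\ For each $x \in \mathcal{X}$,
$\mathcal{N}(x, \varepsilon C) \to \delta_x$ weakly as $\varepsilon \downarrow 0$, so
$\sum_i w_i\,\mathcal{N}(x_i, \varepsilon C) \to \sum_i w_i\,\delta_{x_i}$ weakly.

% Step 3: weak convergence implies MMD convergence for bounded continuous k.
\textbf{Step 3.}\ If $P_n \to P$ weakly, then $P_n \otimes P_n \to P \otimes P$
and $P_n \otimes P \to P \otimes P$ weakly, hence
\[
\mathrm{MMD}_k^2(P_n, P)
 = \iint k \,\mathrm{d}(P_n \otimes P_n)
 - 2 \iint k \,\mathrm{d}(P_n \otimes P)
 + \iint k \,\mathrm{d}(P \otimes P)
 \;\longrightarrow\; 0 .
\]
```

Combining the three steps gives density of finite Gaussian mixtures with respect to MMD_k under these assumptions. In this route the characteristic property is what makes MMD_k a metric; it is not what delivers density, which is why the two conditions need to be stated separately.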

minor comments (1)

[Abstract] Experiments are described only qualitatively ('extensive experiments on diverse structures'); quantitative results, error bars, baseline comparisons, and specific performance metrics should be added to allow assessment of practical performance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and additions to the theoretical arguments.

Referee: [Abstract] Abstract: The central claim that 'the model yields a dense class of approximations in infinite-dimensional spaces' is load-bearing for the theoretical contribution. This requires that finite mixtures of kernel mean embeddings of Gaussians are dense in the image of the embedding map over all probability measures. The manuscript appears to rely on kernel universality without an explicit argument that the restricted Gaussian-component family is dense in the weak topology induced by the RKHS; the skeptic correctly flags that injectivity of the embedding (characteristic kernel) does not automatically imply the Gaussian restriction is dense. A concrete counter-example exclusion or additional regularity condition on the kernel and the Gaussian family is needed.

Authors: We agree that an explicit argument is required to establish density of the Gaussian-component mixtures in the weak topology induced by the RKHS, beyond mere universality of the kernel. The manuscript invokes the characteristic property to guarantee injectivity of the embedding, but does not fully detail why the Gaussian restriction preserves density. In the revision we will add a proof sketch in the theoretical section (and appendix) showing that, under the stated assumptions on the kernel (universal/characteristic) and with the Gaussian family parameterized by means and covariances that are dense in the Hilbert space, finite mixtures of their embeddings are dense in the image of the embedding map. We will also include a brief discussion excluding counterexamples by appealing to the approximation power of Gaussians under the kernel metric. revision: yes
Referee: [Abstract] Abstract and theoretical sections: No derivation, proof sketch, or error analysis is supplied for either the well-definedness of the optimization algorithm or the density guarantee, despite the abstract asserting 'theoretical guarantees.' Without these, it is impossible to verify that the algorithm avoids degeneracy or that the approximation error can be controlled uniformly over the Hilbert space.

Authors: We acknowledge that the abstract asserts theoretical guarantees while the main text provides only high-level statements without full derivations or error bounds. In the revised manuscript we will expand the theoretical sections to include (i) a derivation establishing well-definedness of the optimization algorithm together with regularization conditions that prevent degeneracy, and (ii) a proof sketch plus error analysis for the density result that yields uniform approximation bounds over bounded sets in the Hilbert space. These additions will be placed in the main theoretical development and supported by an appendix containing the complete arguments. revision: yes
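A common way to keep mixture parameters inside their valid sets during gradient-based fitting is to optimize unconstrained variables and map them through a constraint-preserving transform. The sketch below is an assumption for illustration, not the regularization the rebuttal refers to: weights come from a softmax (as in the paper's π(t) = softmax z(t) parameterization), and each covariance is built as F Fᵀ + eps·I, with the floor `eps` and the name `constrained_params` being hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: positive entries that sum to one."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def constrained_params(logits, factors, eps=1e-3):
    """Map unconstrained variables to valid mixture parameters.

    logits  : array of shape (K,), unconstrained weight logits
    factors : list of K square arrays F_k, unconstrained
    eps     : hypothetical floor on the covariance eigenvalues
    """
    weights = softmax(np.asarray(logits, dtype=float))
    # F F^T is positive semidefinite; adding eps * I bounds eigenvalues below by eps.
    covs = [F @ F.T + eps * np.eye(F.shape[0]) for F in factors]
    return weights, covs
```

Whether the paper needs such a floor is not stated in the text available here. An MMD objective does not diverge when a covariance collapses, unlike a likelihood, so the floor is a design choice rather than a requirement.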

Circularity Check

0 steps flagged

No circularity: density and well-definedness claims are external theorems

The paper's central claims rest on establishing that the GMM kernel embedding algorithm is well-defined and that finite mixtures yield dense approximations in the RKHS. These are presented as theoretical results derived from properties of kernel mean embeddings and Gaussian mixtures, not as quantities defined in terms of themselves or as fitted parameters relabeled as predictions. No equations or self-citations are shown reducing the density guarantee to a tautology (e.g., no self-definitional embedding or ansatz smuggled via prior work by the same authors). The derivation chain therefore remains self-contained against external kernel universality results and does not collapse by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no specific free parameters, axioms, or invented entities can be extracted. The approach relies on standard kernel mean embeddings and Hilbert-space structure, but details of any kernel choices, regularization, or convergence assumptions are absent.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages

[1] J. O. Ramsay et al. Functional Data Analysis. Springer Series in Statistics. Springer New York, 2005.
[2] Frédéric Ferraty et al. Nonparametric Functional Data Analysis. Springer Series in Statistics. Springer New York, 2006.
[3] Chris Fraley et al. “Model-Based Clustering, Discriminant Analysis, and Density Estimation”. In: Journal of the American Statistical Association 97.458 (2002), pp. 611–631.
[4] A. P. Dempster et al. “Maximum Likelihood from Incomplete Data Via the EM Algorithm”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 39.1 (1977), pp. 1–22.
[5] Giuseppe Da Prato et al. Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications. Cambridge University Press, 2014.
[6] Haoyu Lu et al. Sequential Monte Carlo with Gaussian Mixture Approximation for Infinite-Dimensional Statistical Inverse Problems. 2026.
[7] Arthur Gretton et al. “A Kernel Two-Sample Test”. In: Journal of Machine Learning Research 13.25 (2012), pp. 723–773.
[8] Bharath K. Sriperumbudur et al. “Universality, Characteristic Kernels and RKHS Embedding of Measures”. In: Journal of Machine Learning Research 12.70 (2011), pp. 2389–2410.
[9] Paromita Dubey et al. “Functional Models for Time-Varying Random Objects”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 82.2 (2020), pp. 275–327.
[10] Paromita Dubey et al. “Modeling Time-Varying Random Objects and Dynamic Networks”. In: Journal of the American Statistical Association 117.540 (2022), pp. 2252–2267.
[11] R Paul Wadwa et al. “Trial of hybrid closed-loop control in young children with type 1 diabetes”. In: New England Journal of Medicine 388.11 (2023), pp. 991–1001.
[12] Antonio Álvarez-López et al. Continuous-Time Learning of Probability Distributions: A Case Study in a Digital Trial of Young Children with Type 1 Diabetes. 2026. arXiv:2603.24427.
[13] François-Xavier Briol et al. A Dictionary of Closed-Form Kernel Mean Embeddings. 2025.
[14] Xavier Bay et al. “Karhunen–Loève Decomposition of Gaussian Measures on Banach Spaces”. In: Probability and Mathematical Statistics 39.2 (2019), pp. 279–297.
[15] Charles Bouveyron et al. “Model-Based Clustering of Time Series in Group-Specific Functional Subspaces”. In: Advances in Data Analysis and Classification 5.4 (2011), pp. 281–300.
[16] Jeng-Min Chiou et al. “Functional Clustering and Identifying Substructures of Longitudinal Data”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 69.4 (2007), pp. 679–699.
[17] Gareth M James et al. “Clustering for Sparsely Sampled Functional Data”. In: Journal of the American Statistical Association 98.462 (2003), pp. 397–408.
[18] M. Giacofci et al. “Wavelet-Based Clustering for Mixed-Effects Functional Models in High Dimension”. In: Biometrics 69.1 (2013), pp. 31–40.
[19] Julien Jacques et al. “Funclust: A Curves Clustering Method Using Functional Random Variables Density Approximation”. In: Neurocomputing 112 (2013), pp. 164–171.
[20] Vladimir Bogachev. Gaussian Measures. Vol. 62. Mathematical Surveys and Monographs. American Mathematical Society, 1998.
[21] Aurore Delaigle et al. “Defining Probability Density for a Distribution of Random Functions”. In: The Annals of Statistics 38.2 (2010).
[22] María Luz López García et al. “K-Means Algorithms for Functional Data”. In: Neurocomputing 151 (2015), pp. 231–245.
[23] Laura Ferreira et al. “A Comparison of Hierarchical Methods for Clustering Functional Data”. In: Communications in Statistics - Simulation and Computation 38.9 (2009), pp. 1925–1949.
[24] Alex Smola et al. “A Hilbert Space Embedding for Distributions”. In: Algorithmic Learning Theory. Ed. by Marcus Hutter et al. Springer, 2007, pp. 13–31.
[25] Krikamol Muandet et al. “Kernel Mean Embedding of Distributions: A Review and Beyond”. In: Foundations and Trends® in Machine Learning 10.1–2 (2017), pp. 1–141.
[26] Alain Berlinet et al. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer US, 2004.
[27] Dino Sejdinovic et al. “Equivalence of Distance-Based and RKHS-based Statistics in Hypothesis Testing”. In: The Annals of Statistics 41.5 (2013).
[28] Yujia Li et al. “Generative Moment Matching Networks”. In: Proceedings of the 32nd International Conference on Machine Learning. PMLR, 2015, pp. 1718–1727.
[29] François-Xavier Briol et al. Statistical Inference for Generative Models with Maximum Mean Discrepancy. 2019.
[30] Badr-Eddine Chérief-Abdellatif et al. “Finite Sample Properties of Parametric MMD Estimation: Robustness to Misspecification and Dependence”. In: Bernoulli 28.1 (2022), pp. 181–213.
[31] Ilya Tolstikhin et al. “Minimax Estimation of Kernel Mean Embeddings”. In: Journal of Machine Learning Research 18.86 (2017), pp. 1–47.
[32] Guilherme França et al. Kernel K-Groups via Hartigan’s Method. arXiv.org. 2017.
[33] Marcos Matabuena et al. “Kernel Biclustering Algorithm in Hilbert Spaces”. In: Advances in Data Analysis and Classification (2025).
[34] Mathis Antonetti et al. “An Analysis of Distributional Reinforcement Learning with Gaussian Mixtures”. In: Transactions on Machine Learning Research (2025).
[35] George Wynne et al. “A Kernel Two-Sample Test for Functional Data”. In: Journal of Machine Learning Research 23.73 (2022), pp. 1–51.
[36] D. M. Titterington et al. Statistical Analysis of Finite Mixture Distributions. Wiley, 1985.
[37] Chengliang Tang et al. “Wasserstein Distributional Learning via Majorization-Minimization”. In: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics. PMLR, 2023, pp. 10703–10731.
[38] Volker Tresp. “Mixtures of Gaussian Processes”. In: Advances in Neural Information Processing Systems. Vol. 13. MIT Press, 2000.
[39] Mian Huang et al. “Estimating Mixture of Gaussian Processes by Kernel Smoothing”. In: Journal of Business & Economic Statistics 32.2 (2014), pp. 259–270.
[40] Ian C. McDowell et al. “Clustering Gene Expression Time Series Data Using an Infinite Gaussian Process Mixture Model”. In: PLOS Computational Biology 14.1 (2018), e1005896.
[41] Victor M. Panaretos et al. “Statistical Aspects of Wasserstein Distances”. In: Annual Review of Statistics and Its Application 6 (2019), pp. 405–431.
[42] Erich Schubert et al. “Fast and Eager k-Medoids Clustering: O(k) Runtime Improvement of the PAM, CLARA, and CLARANS Algorithms”. In: Information Systems 101 (2021), p. 101804.
[43] D. Sculley. “Web-Scale k-Means Clustering”. In: Proceedings of the 19th International Conference on World Wide Web. ACM, 2010, pp. 1177–1178.
[44] Fionn Murtagh et al. “Algorithms for Hierarchical Clustering: An Overview”. In: WIREs Data Mining and Knowledge Discovery 2.1 (2012), pp. 86–97.
[45] Joe H. Ward. “Hierarchical Grouping to Optimize an Objective Function”. In: Journal of the American Statistical Association 58.301 (1963), pp. 236–244.
[46] Tian Zhang et al. “BIRCH: An Efficient Data Clustering Method for Very Large Databases”. In: ACM SIGMOD Record 25.2 (1996), pp. 103–114.
[47] Martin Ester et al. “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise”. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining. KDD’96. AAAI Press, 1996, pp. 226–231.
[48] Mihael Ankerst et al. “OPTICS: Ordering Points to Identify the Clustering Structure”. In: Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data. ACM, 1999, pp. 49–60.
[49] Andrew Y. Ng et al. “On Spectral Clustering: Analysis and an Algorithm”. In: Proceedings of the 15th International Conference on Neural Information Processing Systems: Natural and Synthetic. NIPS’01. MIT Press, 2001, pp. 849–856.
[50] F. Pedregosa et al. “Scikit-Learn: Machine Learning in Python”. In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.
[51] Leo Breiman et al. Classification And Regression Trees. Routledge, 2017.
[52] Cristian Preda et al. “Categorical Functional Data Analysis. The cfda R Package”. In: Mathematics 9.23 (2021), p. 3074.
[53] Robert Thomas Olszewski. Generalized Feature Extraction for Structural Pattern Recognition in Time-Series Data. Carnegie Mellon University, 2001.
[54] Asim Kumar Debnath et al. “Structure-Activity Relationship of Mutagenic Aromatic and Heteroaromatic Nitro Compounds. Correlation with Molecular Orbital Energies and Hydrophobicity”. In: Journal of Medicinal Chemistry 34.2 (1991), pp. 786–797.
[55] Leonard Kaufman et al. Finding Groups in Data: An Introduction to Cluster Analysis. Wiley Series in Probability and Statistics. Wiley, 1990.
[56] G. N. Lance et al. “A General Theory of Classificatory Sorting Strategies: 1. Hierarchical Systems”. In: The Computer Journal 9.4 (1967), pp. 373–380.
[57] Ricardo J. G. B. Campello et al. “Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection”. In: ACM Transactions on Knowledge Discovery from Data 10.1 (2015), pp. 1–51.
[58] Teofilo F. Gonzalez. “Clustering to Minimize the Maximum Intercluster Distance”. In: Theoretical Computer Science 38 (1985), pp. 293–306.
[59] James C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Springer US, 1981.
[60] Brendan J. Frey et al. “Clustering by Passing Messages Between Data Points”. In: Science 315.5814 (2007), pp. 972–976.
[61] Yizong Cheng. “Mean Shift, Mode Seeking, and Clustering”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 17.8 (1995), pp. 790–799.
[62] Jan S Krouwer et al. “A Review of Standards and Statistics Used to Describe Blood Glucose Monitor Performance”. In: Journal of Diabetes Science and Technology 4.1 (2010), pp. 75–83.
[63] William Clarke et al. “Statistical Tools to Analyze Continuous Glucose Monitor Data”. In: Diabetes Technology & Therapeutics 11 (2009), S–45.
[64] Kaustubh Supekar et al. “Network Analysis of Intrinsic Functional Brain Connectivity in Alzheimer’s Disease”. In: PLOS Computational Biology 4.6 (2008), e1000100.
[65] Tao Wu et al. “Parkinson’s Disease-Related Spatial Covariance Pattern Identified with Resting-State Functional MRI”. In: Journal of Cerebral Blood Flow & Metabolism 35.11 (2015), pp. 1764–1770.
[66] Martin Tauschmann et al. “Closed-Loop Insulin Delivery in Suboptimally Controlled Type 1 Diabetes: A Multicentre, 12-Week Randomised Trial”. In: The Lancet 392.10155 (2018), pp. 1321–1329.
[67] Ricky T. Q. Chen et al. Neural Ordinary Differential Equations. 2019.
[68] Jonathan R. Bradley. “An Approach to Incorporate Subsampling Into a Generic Bayesian Hierarchical Model”. In: Journal of Computational and Graphical Statistics 30.4 (2021), pp. 889–905.

work page 2021
[69]

Scikit-Fda: APythonPackage for Functional Data Analysis

Carlos Ramos-Carreño et al. “Scikit-Fda: APythonPackage for Functional Data Analysis”. In: Journal of Statistical Software109.2 (2024)

work page 2024
[70]

Bishop.Pattern Recognition and Machine Learning

Christopher M. Bishop.Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, 2006

work page 2006
[71]

Adaptive Computation and Machine Learning

Carl Edward Rasmussen et al.Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, 2008

work page 2008
[72]

Amir Shahroudy et al.NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis. 2016. A Background This section establishes the notation and foundational properties of Gaussian measures and kernel methods used throughout this work. Appendix E discusses the extension of this framework to Banach spaces. 13 A.1 Gaussian measures on separable Hilbert sp...

work page 2016
[73]

Gaussian radial kernel (Proposition B.1)Let A∈RM×Mdenote the matrix representation of the restricted operatorA|XM in the basis (er)M r=1, with entries Arℓ=⟨Aeℓ,er⟩X . The closed forms of Proposition B.1 become the finite-dimensional approximations J(M) i,k := det ( IM +A 1/2KkA1/2)−1/2 ×exp { −1 2(xi−mk)⊤A1/2( IM +A 1/2KkA1/2)−1 A1/2(xi−mk) } , I(M) k,s :...

work page
[74]

In the projected space, this becomesκ(x,y) = (x⊤y+c) p onRM

Polynomial kernel (Proposition B.2)Fix an integer p≥1and c≥0and consider κ(x,y) = (⟨x,y⟩X +c)p. In the projected space, this becomesκ(x,y) = (x⊤y+c) p onRM . ComputingJ (M) i,k .Fory∼νk,M , define the scalars µ(M) i,k :=x⊤ i mk +c, v (M) i,k :=x⊤ i Kk xi, so that x⊤ i y+c is a one-dimensional Gaussian with meanµ(M) i,k and variancev(M) i,k . Then Proposit...

work page

[1] J. O. Ramsay et al. Functional Data Analysis. Springer Series in Statistics. Springer New York, 2005.

[2] Frédéric Ferraty et al. Nonparametric Functional Data Analysis. Springer Series in Statistics. Springer New York, 2006.

[3] Chris Fraley et al. “Model-Based Clustering, Discriminant Analysis, and Density Estimation”. In: Journal of the American Statistical Association 97.458 (2002), pp. 611–631.

[4] A. P. Dempster et al. “Maximum Likelihood from Incomplete Data Via the EM Algorithm”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 39.1 (1977), pp. 1–22.

[5] Giuseppe Da Prato et al. Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications. Cambridge University Press, 2014.

[6] Haoyu Lu et al. Sequential Monte Carlo with Gaussian Mixture Approximation for Infinite-Dimensional Statistical Inverse Problems. 2026.

[7] Arthur Gretton et al. “A Kernel Two-Sample Test”. In: Journal of Machine Learning Research 13.25 (2012), pp. 723–773.

[8] Bharath K. Sriperumbudur et al. “Universality, Characteristic Kernels and RKHS Embedding of Measures”. In: Journal of Machine Learning Research 12.70 (2011), pp. 2389–2410.

[9] Paromita Dubey et al. “Functional Models for Time-Varying Random Objects”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 82.2 (2020), pp. 275–327.

[10] Paromita Dubey et al. “Modeling Time-Varying Random Objects and Dynamic Networks”. In: Journal of the American Statistical Association 117.540 (2022), pp. 2252–2267.

[11] R Paul Wadwa et al. “Trial of hybrid closed-loop control in young children with type 1 diabetes”. In: New England Journal of Medicine 388.11 (2023), pp. 991–1001.

[12] Antonio Álvarez-López et al. Continuous-Time Learning of Probability Distributions: A Case Study in a Digital Trial of Young Children with Type 1 Diabetes. 2026. arXiv:2603.24427.

[13] François-Xavier Briol et al. A Dictionary of Closed-Form Kernel Mean Embeddings. 2025.

[14] Xavier Bay et al. “Karhunen–Loève Decomposition of Gaussian Measures on Banach Spaces”. In: Probability and Mathematical Statistics 39.2 (2019), pp. 279–297.

[15] Charles Bouveyron et al. “Model-Based Clustering of Time Series in Group-Specific Functional Subspaces”. In: Advances in Data Analysis and Classification 5.4 (2011), pp. 281–300.

[16] Jeng-Min Chiou et al. “Functional Clustering and Identifying Substructures of Longitudinal Data”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 69.4 (2007), pp. 679–699.

[17] Gareth M James et al. “Clustering for Sparsely Sampled Functional Data”. In: Journal of the American Statistical Association 98.462 (2003), pp. 397–408.

[18] M. Giacofci et al. “Wavelet-Based Clustering for Mixed-Effects Functional Models in High Dimension”. In: Biometrics 69.1 (2013), pp. 31–40.

[19] Julien Jacques et al. “Funclust: A Curves Clustering Method Using Functional Random Variables Density Approximation”. In: Neurocomputing 112 (2013), pp. 164–171.

[20] Vladimir Bogachev. Gaussian Measures. Vol. 62. Mathematical Surveys and Monographs. American Mathematical Society, 1998.

[21] Aurore Delaigle et al. “Defining Probability Density for a Distribution of Random Functions”. In: The Annals of Statistics 38.2 (2010).

[22] María Luz López García et al. “K-Means Algorithms for Functional Data”. In: Neurocomputing 151 (2015), pp. 231–245.

[23] Laura Ferreira et al. “A Comparison of Hierarchical Methods for Clustering Functional Data”. In: Communications in Statistics - Simulation and Computation 38.9 (2009), pp. 1925–1949.

[24] Alex Smola et al. “A Hilbert Space Embedding for Distributions”. In: Algorithmic Learning Theory. Ed. by Marcus Hutter et al. Springer, 2007, pp. 13–31.

[25] Krikamol Muandet et al. “Kernel Mean Embedding of Distributions: A Review and Beyond”. In: Foundations and Trends® in Machine Learning 10.1–2 (2017), pp. 1–141.

[26] Alain Berlinet et al. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer US, 2004.

[27] Dino Sejdinovic et al. “Equivalence of Distance-Based and RKHS-based Statistics in Hypothesis Testing”. In: The Annals of Statistics 41.5 (2013).

[28] Yujia Li et al. “Generative Moment Matching Networks”. In: Proceedings of the 32nd International Conference on Machine Learning. PMLR, 2015, pp. 1718–1727.

[29] François-Xavier Briol et al. Statistical Inference for Generative Models with Maximum Mean Discrepancy. 2019.

[30] Badr-Eddine Chérief-Abdellatif et al. “Finite Sample Properties of Parametric MMD Estimation: Robustness to Misspecification and Dependence”. In: Bernoulli 28.1 (2022), pp. 181–213.

[31] Ilya Tolstikhin et al. “Minimax Estimation of Kernel Mean Embeddings”. In: Journal of Machine Learning Research 18.86 (2017), pp. 1–47.

[32] Guilherme França et al. Kernel K-Groups via Hartigan’s Method. arXiv.org. 2017.

[33] Marcos Matabuena et al. “Kernel Biclustering Algorithm in Hilbert Spaces”. In: Advances in Data Analysis and Classification (2025).

[34] Mathis Antonetti et al. “An Analysis of Distributional Reinforcement Learning with Gaussian Mixtures”. In: Transactions on Machine Learning Research (2025).

[35] George Wynne et al. “A Kernel Two-Sample Test for Functional Data”. In: Journal of Machine Learning Research 23.73 (2022), pp. 1–51.

[36] D. M. Titterington et al. Statistical Analysis of Finite Mixture Distributions. Wiley, 1985.

[37] Chengliang Tang et al. “Wasserstein Distributional Learning via Majorization-Minimization”. In: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics. PMLR, 2023, pp. 10703–10731.

[38] Volker Tresp. “Mixtures of Gaussian Processes”. In: Advances in Neural Information Processing Systems. Vol. 13. MIT Press, 2000.

[39] Mian Huang et al. “Estimating Mixture of Gaussian Processes by Kernel Smoothing”. In: Journal of Business & Economic Statistics 32.2 (2014), pp. 259–270.

[40] Ian C. McDowell et al. “Clustering Gene Expression Time Series Data Using an Infinite Gaussian Process Mixture Model”. In: PLOS Computational Biology 14.1 (2018), e1005896.

[41] Victor M. Panaretos et al. “Statistical Aspects of Wasserstein Distances”. In: Annual Review of Statistics and Its Application 6 (2019), pp. 405–431.

[42] Erich Schubert et al. “Fast and Eager k-Medoids Clustering: O(k) Runtime Improvement of the PAM, CLARA, and CLARANS Algorithms”. In: Information Systems 101 (2021), p. 101804.

[43] D. Sculley. “Web-Scale k-Means Clustering”. In: Proceedings of the 19th International Conference on World Wide Web. ACM, 2010, pp. 1177–1178.

[44] Fionn Murtagh et al. “Algorithms for Hierarchical Clustering: An Overview”. In: WIREs Data Mining and Knowledge Discovery 2.1 (2012), pp. 86–97.

[45] Joe H. Ward. “Hierarchical Grouping to Optimize an Objective Function”. In: Journal of the American Statistical Association 58.301 (1963), pp. 236–244.

[46] Tian Zhang et al. “BIRCH: An Efficient Data Clustering Method for Very Large Databases”. In: ACM SIGMOD Record 25.2 (1996), pp. 103–114.

[47] Martin Ester et al. “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise”. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining. KDD’96. AAAI Press, 1996, pp. 226–231.

[48] Mihael Ankerst et al. “OPTICS: Ordering Points to Identify the Clustering Structure”. In: Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data. ACM, 1999, pp. 49–60.

[49] Andrew Y. Ng et al. “On Spectral Clustering: Analysis and an Algorithm”. In: Proceedings of the 15th International Conference on Neural Information Processing Systems: Natural and Synthetic. NIPS’01. MIT Press, 2001, pp. 849–856.

[50] F. Pedregosa et al. “Scikit-Learn: Machine Learning in Python”. In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.

[51] Leo Breiman et al. Classification And Regression Trees. Routledge, 2017.

[52] Cristian Preda et al. “Categorical Functional Data Analysis. The cfda R Package”. In: Mathematics 9.23 (2021), p. 3074.

[53] Robert Thomas Olszewski. Generalized Feature Extraction for Structural Pattern Recognition in Time-Series Data. Carnegie Mellon University, 2001.

[54] Asim Kumar Debnath et al. “Structure-Activity Relationship of Mutagenic Aromatic and Heteroaromatic Nitro Compounds. Correlation with Molecular Orbital Energies and Hydrophobicity”. In: Journal of Medicinal Chemistry 34.2 (1991), pp. 786–797.

[55] Leonard Kaufman et al. Finding Groups in Data: An Introduction to Cluster Analysis. Wiley Series in Probability and Statistics. Wiley, 1990.

[56] G. N. Lance et al. “A General Theory of Classificatory Sorting Strategies: 1. Hierarchical Systems”. In: The Computer Journal 9.4 (1967), pp. 373–380.

[57] Ricardo J. G. B. Campello et al. “Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection”. In: ACM Transactions on Knowledge Discovery from Data 10.1 (2015), pp. 1–51.

[58] Teofilo F. Gonzalez. “Clustering to Minimize the Maximum Intercluster Distance”. In: Theoretical Computer Science 38 (1985), pp. 293–306.

[59] James C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Springer US, 1981.

[60] Brendan J. Frey et al. “Clustering by Passing Messages Between Data Points”. In: Science 315.5814 (2007), pp. 972–976.

[61] Yizong Cheng. “Mean Shift, Mode Seeking, and Clustering”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 17.8 (1995), pp. 790–799.

[62] Jan S Krouwer et al. “A Review of Standards and Statistics Used to Describe Blood Glucose Monitor Performance”. In: Journal of Diabetes Science and Technology 4.1 (2010), pp. 75–83.

[63] William Clarke et al. “Statistical Tools to Analyze Continuous Glucose Monitor Data”. In: Diabetes Technology & Therapeutics 11 (2009), S–45.

[64] Kaustubh Supekar et al. “Network Analysis of Intrinsic Functional Brain Connectivity in Alzheimer’s Disease”. In: PLOS Computational Biology 4.6 (2008), e1000100.

[65] Tao Wu et al. “Parkinson’s Disease-Related Spatial Covariance Pattern Identified with Resting-State Functional MRI”. In: Journal of Cerebral Blood Flow & Metabolism 35.11 (2015), pp. 1764–1770.

[66] Martin Tauschmann et al. “Closed-Loop Insulin Delivery in Suboptimally Controlled Type 1 Diabetes: A Multicentre, 12-Week Randomised Trial”. In: The Lancet 392.10155 (2018), pp. 1321–1329.

[67] Ricky T. Q. Chen et al. Neural Ordinary Differential Equations. 2019.

[68] Jonathan R. Bradley. “An Approach to Incorporate Subsampling Into a Generic Bayesian Hierarchical Model”. In: Journal of Computational and Graphical Statistics 30.4 (2021), pp. 889–905.

[69] Carlos Ramos-Carreño et al. “Scikit-Fda: A Python Package for Functional Data Analysis”. In: Journal of Statistical Software 109.2 (2024).

[70] Christopher M. Bishop. Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, 2006.

[71] Carl Edward Rasmussen et al. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, 2008.

[72] Amir Shahroudy et al. NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis. 2016.

A Background

This section establishes the notation and foundational properties of Gaussian measures and kernel methods used throughout this work. Appendix E discusses the extension of this framework to Banach spaces.

A.1 Gaussian measures on separable Hilbert sp...

Gaussian radial kernel (Proposition B.1). Let $A \in \mathbb{R}^{M \times M}$ denote the matrix representation of the restricted operator $A|_{X_M}$ in the basis $(e_r)_{r=1}^{M}$, with entries $A_{r\ell} = \langle A e_\ell, e_r \rangle_X$. The closed forms of Proposition B.1 become the finite-dimensional approximations

$$J^{(M)}_{i,k} := \det\big(I_M + A^{1/2} K_k A^{1/2}\big)^{-1/2} \exp\Big\{ -\tfrac{1}{2} (x_i - m_k)^\top A^{1/2} \big(I_M + A^{1/2} K_k A^{1/2}\big)^{-1} A^{1/2} (x_i - m_k) \Big\},$$

$I^{(M)}_{k,s}$ :...
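The finite-dimensional closed form for $J^{(M)}_{i,k}$ can be evaluated numerically along the following lines. This is a minimal sketch, not the authors' implementation: the function names `sym_sqrt` and `gaussian_kernel_mean` are hypothetical, and the reading of $J^{(M)}_{i,k}$ as the expectation of $\exp\{-\tfrac{1}{2}(x_i - y)^\top A (x_i - y)\}$ under $y \sim N(m_k, K_k)$ is an assumption inferred from the shape of the formula, since the statement of Proposition B.1 is not reproduced here.

```python
import numpy as np


def sym_sqrt(A):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T


def gaussian_kernel_mean(x, m, K, A):
    """Evaluate det(I + A^{1/2} K A^{1/2})^{-1/2}
    * exp(-1/2 (x-m)^T A^{1/2} (I + A^{1/2} K A^{1/2})^{-1} A^{1/2} (x-m)).

    x, m : arrays of shape (M,), projected data point and component mean
    K    : array of shape (M, M), projected component covariance
    A    : array of shape (M, M), symmetric PSD kernel matrix (assumed)
    """
    A_half = sym_sqrt(A)
    S = np.eye(x.shape[0]) + A_half @ K @ A_half
    u = A_half @ (x - m)
    # slogdet and solve avoid forming an explicit determinant or inverse
    _, logdet = np.linalg.slogdet(S)
    quad = u @ np.linalg.solve(S, u)
    return float(np.exp(-0.5 * logdet - 0.5 * quad))
```

Working with the log-determinant and a linear solve is a design choice made here for numerical stability when $M$ is moderately large; it is not something the source prescribes.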

Polynomial kernel (Proposition B.2). Fix an integer $p \ge 1$ and $c \ge 0$, and consider $\kappa(x,y) = (\langle x, y \rangle_X + c)^p$. In the projected space, this becomes $\kappa(x,y) = (x^\top y + c)^p$ on $\mathbb{R}^M$.

Computing $J^{(M)}_{i,k}$. For $y \sim \nu_{k,M}$, define the scalars $\mu^{(M)}_{i,k} := x_i^\top m_k + c$ and $v^{(M)}_{i,k} := x_i^\top K_k x_i$, so that $x_i^\top y + c$ is a one-dimensional Gaussian with mean $\mu^{(M)}_{i,k}$ and variance $v^{(M)}_{i,k}$. Then Proposit...
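Because $x_i^\top y + c$ is a one-dimensional Gaussian, the expectation of its $p$-th power is a raw Gaussian moment. The sketch below uses the standard identity $\mathbb{E}[Z^p] = \sum_{j=0}^{\lfloor p/2 \rfloor} \binom{p}{2j} (2j-1)!!\, v^j \mu^{p-2j}$ for $Z \sim N(\mu, v)$; that this is the expression Proposition B.2 arrives at is an assumption, since the source text is cut off at that point. The function names are hypothetical.

```python
from math import comb


def odd_double_factorial(j):
    """Return (2j - 1)!!, with the convention (-1)!! = 1 for j = 0."""
    out = 1
    for t in range(1, 2 * j, 2):
        out *= t
    return out


def gaussian_raw_moment(mu, v, p):
    """E[Z^p] for Z ~ N(mu, v), using the even central moments of a Gaussian."""
    return sum(
        comb(p, 2 * j) * odd_double_factorial(j) * v**j * mu ** (p - 2 * j)
        for j in range(p // 2 + 1)
    )


def polynomial_kernel_mean(x, m, K, c, p):
    """E[(x^T y + c)^p] for y ~ N(m, K) in the projected space R^M.

    x, m : sequences of length M
    K    : M x M covariance given as a nested sequence
    c, p : kernel offset (c >= 0) and integer degree (p >= 1)
    """
    M = len(x)
    mu = sum(x[r] * m[r] for r in range(M)) + c  # mean of x^T y + c
    v = sum(x[r] * K[r][l] * x[l] for r in range(M) for l in range(M))  # variance
    return gaussian_raw_moment(mu, v, p)
```

For $p = 2$ the sum reduces to $\mu^2 + v$, and for $p = 3$ to $\mu^3 + 3\mu v$, which gives a quick sanity check of the implementation.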