Mat\'ern Gaussian Processes on Graphs

Alexander Terenin; Iskander Azangulov; Marc Peter Deisenroth; Nicolas Durrande; Peter Mostowsky; Viacheslav Borovitskiy

arxiv: 2010.15538 · v4 · pith:AMGHZFBDnew · submitted 2020-10-29 · 📊 stat.ML · cs.LG

Mat\'ern Gaussian Processes on Graphs

Viacheslav Borovitskiy , Iskander Azangulov , Alexander Terenin , Peter Mostowsky , Marc Peter Deisenroth , Nicolas Durrande This is my paper

Pith reviewed 2026-05-24 14:09 UTC · model grok-4.3

classification 📊 stat.ML cs.LG

keywords Gaussian processesMatérn kernelgraphsstochastic partial differential equationsinducing pointsmachine learning on graphs

0 comments

The pith

Matérn Gaussian processes on undirected graphs can be defined via stochastic partial differential equations and trained with inducing points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends the stochastic partial differential equation characterization of Matérn Gaussian processes from Euclidean and Riemannian settings to undirected graphs. This construction yields processes that retain smoothness and stationarity properties of the continuous case. The resulting models admit standard training techniques such as inducing points, which in turn support mini-batch optimization and non-conjugate likelihoods.

Core claim

By leveraging the SPDE characterization, Matérn Gaussian processes on graphs inherit various attractive properties of their Euclidean and Riemannian analogs and can be trained using standard methods such as inducing points, enabling their use in mini-batch and non-conjugate settings.

What carries the argument

The stochastic partial differential equation characterization of Matérn processes, extended to the discrete graph setting.

Load-bearing premise

The stochastic partial differential equation characterization of Matérn processes extends to the discrete graph setting while preserving the key smoothness and stationarity properties that hold in the continuous case.

What would settle it

Compute the covariance function of the graph Matérn process on a cycle graph or path graph and check whether it matches the expected Matérn decay rate and positive-definiteness for varying smoothness parameters.

Figures

Figures reproduced from arXiv: 2010.15538 by Alexander Terenin, Iskander Azangulov, Marc Peter Deisenroth, Nicolas Durrande, Peter Mostowsky, Viacheslav Borovitskiy.

**Figure 2.** Figure 2: The Mat´ern kernel k3/2(x, ·) defined on increasingly fine triangular meshes (three left columns), and Mat´ern kernel k1/2(x, ·) on the sphere (rightmost). The true kernel is computed using 512 analytically known Laplace-Beltrami eigenpairs via the technique in Borovitskiy et al. (2020). The discrepancy in the smoothness parameter is due to a different parameterization of the kernel, which accounts for dim… view at source ↗

**Figure 3.** Figure 3: Traffic flow speed (mph) interpolation over a graph of San Jose highways. Colored circles represent [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Traffic flow speed (mph) interpolation over [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: Topic classification for the Cora citation network, using the citation graph. graph. Note that both travel directions on a highway are considered separate roads and thus form separate edges—this can be seen in [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

**Figure 6.** Figure 6: Topic classification for the Cora citation network, using the citation graph. Ground truth [PITH_FULL_IMAGE:figures/full_fig_p014_6.png] view at source ↗

**Figure 7.** Figure 7: Topic classification for the Cora citation network, using the citation graph. Prediction [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗

**Figure 8.** Figure 8: Topic classification for the Cora citation network, using the citation graph. Prediction errors. Red nodes are misclassified [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗

read the original abstract

Gaussian processes are a versatile framework for learning unknown functions in a manner that permits one to utilize prior information about their properties. Although many different Gaussian process models are readily available when the input space is Euclidean, the choice is much more limited for Gaussian processes whose input space is an undirected graph. In this work, we leverage the stochastic partial differential equation characterization of Mat\'ern Gaussian processes - a widely-used model class in the Euclidean setting - to study their analog for undirected graphs. We show that the resulting Gaussian processes inherit various attractive properties of their Euclidean and Riemannian analogs and provide techniques that allow them to be trained using standard methods, such as inducing points. This enables graph Mat\'ern Gaussian processes to be employed in mini-batch and non-conjugate settings, thereby making them more accessible to practitioners and easier to deploy within larger learning frameworks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a direct SPDE construction for Matérn GPs on graphs that supports inducing-point training, but stationarity only holds on vertex-transitive graphs.

read the letter

The paper constructs Matérn Gaussian processes on undirected graphs by discretizing their SPDE characterization using the graph Laplacian. They obtain the covariance as a spectral function of the Laplacian and show that standard techniques like inducing points can be used for training, which supports mini-batch and non-conjugate settings. What is new is the application of the SPDE method to define the Matérn class specifically on graphs, extending the Euclidean and Riemannian versions. The work provides a principled way to control smoothness via the parameter nu while staying within the GP framework. It does well at reusing existing GP infrastructure. The training methods are not reinvented but applied directly, which makes the model more deployable in larger frameworks. The soft spot is on the inherited properties. Stationarity, where the covariance depends only on the distance between nodes, holds only for vertex-transitive graphs. On general graphs the covariance will vary with the global position of the nodes because the graph Laplacian does not yield a purely distance-based kernel. The abstract claims inheritance of attractive properties, so the paper should clarify exactly what carries over. Smoothness from the eigenvalue spectrum should be preserved, but stationarity is not. If the full paper does not address this distinction, it is a limitation that affects how the model can be used. The derivation is straightforward once the Laplacian substitution is accepted, and the citation pattern follows the relevant SPDE and graph kernel literature without obvious gaps. This paper is for researchers needing a GP prior on graph-structured inputs with tunable smoothness who also want scalable training. A reader working on network data or graph ML would get practical value from the construction, though they may need to verify the stationarity behavior on their graphs. It deserves a serious referee. The idea is grounded enough to warrant detailed review, particularly on the properties that are actually inherited.

Referee Report

1 major / 0 minor

Summary. The paper constructs Matérn Gaussian processes on undirected graphs by replacing the continuous Laplacian in the SPDE (κ² − Δ)^{ν/2} f = W with the graph Laplacian L, yielding a covariance operator that is a spectral function of L (typically of the form (κ²I + L)^{-ν}). It claims that the resulting GPs inherit attractive properties from their Euclidean and Riemannian counterparts and supplies training techniques based on inducing points that enable mini-batch and non-conjugate inference.

Significance. If the construction is valid and the inherited properties are correctly characterized, the work would supply a practical route for applying Matérn GPs to graph-structured data and integrating them into larger learning pipelines via standard inducing-point methods.

major comments (1)

[Abstract and construction section] Abstract and the section defining the graph Matérn process: the claim that the GPs inherit 'various attractive properties' of the Euclidean/Riemannian analogs is undermined by the fact that the spectral construction via the combinatorial or normalized Laplacian produces a covariance that depends on global vertex position rather than solely on shortest-path distance, except on vertex-transitive graphs. Stationarity therefore does not survive in general, which is load-bearing for the central inheritance claim.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on the manuscript. We address the single major comment below and will revise the abstract and construction section accordingly.

read point-by-point responses

Referee: [Abstract and construction section] Abstract and the section defining the graph Matérn process: the claim that the GPs inherit 'various attractive properties' of the Euclidean/Riemannian analogs is undermined by the fact that the spectral construction via the combinatorial or normalized Laplacian produces a covariance that depends on global vertex position rather than solely on shortest-path distance, except on vertex-transitive graphs. Stationarity therefore does not survive in general, which is load-bearing for the central inheritance claim.

Authors: We agree that the covariance operator defined via the graph Laplacian does not yield a stationary process in general, since the resulting kernel depends on the global eigenstructure of the graph rather than solely on shortest-path distance (except on vertex-transitive graphs). The manuscript does not explicitly claim stationarity; the phrase 'various attractive properties' refers to the SPDE-derived construction that preserves (i) control of mean-square regularity via the parameter ν, (ii) positive-definiteness of the covariance, and (iii) compatibility with standard inducing-point approximations that enable mini-batch and non-conjugate inference. Nevertheless, the referee's observation is valid and the current wording risks overstatement. We will revise the abstract and the definition section to state the inherited properties more precisely and to note explicitly that stationarity holds only on vertex-transitive graphs. This constitutes a partial revision; the core construction, theoretical results on the precision operator, and training methodology remain unchanged. revision: partial

Circularity Check

0 steps flagged

No circularity: direct extension of external SPDE via graph Laplacian

full rationale

The paper constructs graph Matérn GPs by replacing the continuous Laplacian in the known Euclidean/Riemannian SPDE (κ² − Δ)^{ν/2} f = W with the graph Laplacian L, then obtaining the covariance spectrally as a function of L. This modeling step is an independent discretization choice grounded in the external SPDE framework rather than a redefinition of the target covariance or a fitted parameter presented as a prediction. No load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via citation appear in the provided abstract or described derivation. The central claims about inherited properties are therefore not forced by construction and remain open to external verification against the continuous case.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central construction rests on the assumption that the SPDE characterization transfers to graphs; no free parameters or invented entities are mentioned in the abstract.

axioms (1)

domain assumption The stochastic partial differential equation characterization of Matérn Gaussian processes extends to undirected graphs while preserving key properties.
This is the explicit premise used to define the graph analog and to claim inheritance of Euclidean properties.

pith-pipeline@v0.9.0 · 5687 in / 1140 out tokens · 22095 ms · 2026-05-24T14:09:58.945059+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We leverage the stochastic partial differential equation characterization of Matérn Gaussian processes … (2ν/κ² + Δ)^{ν/2} f = W … f ∼ N(0, (2ν/κ² + Δ)^{-ν})
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

the graph Laplacian … Δ = D − W … Φ(Δ) = U Φ(Λ) U^T

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

7 extracted references · 7 canonical work pages

[1]

Cited on page 7. C. Chen, K. Petty, A. Skabardonis, P. Varaiya, and Z. Jia. Freeway performance measurement system: mining loop detector data. Transportation Research Record, 1748(1):96–102, 2001. Cited on pages 7, 12. Mat´ ern Gaussian Processes on Graphs N. Durrande, V. Adam, L. Bordeaux, S. Eleftheriadis, and J. Hensman. Banded matrix operators for Gau...

work page 2001
[2]

Cited on pages 6, 7. L. C. Evans. Partial Diﬀerential Equations . American Mathematical Society, 2010. Cited on page 3. A. Feragen, F. Lauze, and S. Hauberg. Geodesic expo- nential kernels: When curvature and linearity con- ﬂict. In Conference on Computer Vision and Pattern Recognition, 2015. Cited on page 1. P. Goovaerts.Geostatistics for Natural Resourc...

work page 2010
[3]

Cited on page 2. A. Grigoryan. Heat kernel and analysis on manifolds , volume 47. American Mathematical Society, 2009. Cited on page 3. J. Hensman, N. Durrande, and A. Solin. Variational Fourier features for Gaussian processes. Journal of Machine Learning Research, 18:151–202, 2017. Cited on page 7. J. Hensman, N. Fusi, and N. D. Lawrence. Gaussian proces...

work page 2009
[4]

Cited on page 2. F. Lindgren, H. Rue, and J. Lindstr¨ om. An explicit link between Gaussian ﬁelds and Gaussian Markov ran- dom ﬁelds: the stochastic partial diﬀerential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology) , 73(4):423–498,

work page
[5]

Cited on pages 1–3, 5. A. Mallasto and A. Feragen. Wrapped Gaussian pro- cess regression on Riemannian manifolds. In Confer- ence on Computer Vision and Pattern Recognition , pages 5580–5588, 2018. Cited on page 1. A. G. d. G. Matthews, M. van der Wilk, T. Nickson, K. Fujii, A. Boukouvalas, P. Le´ on-Villagr´ a, Z. Ghahra- mani, and J. Hensman. GPﬂow: A G...

work page arXiv 2018
[6]

Cited on page 3. S. V. N. Vishwanathan, N. N. Schraudolph, R. Kon- dor, and K. M. Borgwardt. Graph kernels. Journal of Machine Learning Research, 11:1201–1242, 2010. Cited on page 2. U. Von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007. Cited on page 4. Z. Wang, C. Gehring, P. Kohli, and S. Jegelka. Batched larg...

work page arXiv 2010
[7]

We obtain traﬃc congestion data from the California Performance Measurement Systems database (Chen et al., 2001)

obtained using the OSMnx library (Boeing, 2017). We obtain traﬃc congestion data from the California Performance Measurement Systems database (Chen et al., 2001). This system maps sensor location and time & date to (among other factors) traﬃc ﬂow speed in miles per hour measured by the given sensor at a particular time and date. We chose the speciﬁc date ...

work page 2017

[1] [1]

Cited on page 7. C. Chen, K. Petty, A. Skabardonis, P. Varaiya, and Z. Jia. Freeway performance measurement system: mining loop detector data. Transportation Research Record, 1748(1):96–102, 2001. Cited on pages 7, 12. Mat´ ern Gaussian Processes on Graphs N. Durrande, V. Adam, L. Bordeaux, S. Eleftheriadis, and J. Hensman. Banded matrix operators for Gau...

work page 2001

[2] [2]

Cited on pages 6, 7. L. C. Evans. Partial Diﬀerential Equations . American Mathematical Society, 2010. Cited on page 3. A. Feragen, F. Lauze, and S. Hauberg. Geodesic expo- nential kernels: When curvature and linearity con- ﬂict. In Conference on Computer Vision and Pattern Recognition, 2015. Cited on page 1. P. Goovaerts.Geostatistics for Natural Resourc...

work page 2010

[3] [3]

Cited on page 2. A. Grigoryan. Heat kernel and analysis on manifolds , volume 47. American Mathematical Society, 2009. Cited on page 3. J. Hensman, N. Durrande, and A. Solin. Variational Fourier features for Gaussian processes. Journal of Machine Learning Research, 18:151–202, 2017. Cited on page 7. J. Hensman, N. Fusi, and N. D. Lawrence. Gaussian proces...

work page 2009

[4] [4]

Cited on page 2. F. Lindgren, H. Rue, and J. Lindstr¨ om. An explicit link between Gaussian ﬁelds and Gaussian Markov ran- dom ﬁelds: the stochastic partial diﬀerential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology) , 73(4):423–498,

work page

[5] [5]

Cited on pages 1–3, 5. A. Mallasto and A. Feragen. Wrapped Gaussian pro- cess regression on Riemannian manifolds. In Confer- ence on Computer Vision and Pattern Recognition , pages 5580–5588, 2018. Cited on page 1. A. G. d. G. Matthews, M. van der Wilk, T. Nickson, K. Fujii, A. Boukouvalas, P. Le´ on-Villagr´ a, Z. Ghahra- mani, and J. Hensman. GPﬂow: A G...

work page arXiv 2018

[6] [6]

Cited on page 3. S. V. N. Vishwanathan, N. N. Schraudolph, R. Kon- dor, and K. M. Borgwardt. Graph kernels. Journal of Machine Learning Research, 11:1201–1242, 2010. Cited on page 2. U. Von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007. Cited on page 4. Z. Wang, C. Gehring, P. Kohli, and S. Jegelka. Batched larg...

work page arXiv 2010

[7] [7]

We obtain traﬃc congestion data from the California Performance Measurement Systems database (Chen et al., 2001)

obtained using the OSMnx library (Boeing, 2017). We obtain traﬃc congestion data from the California Performance Measurement Systems database (Chen et al., 2001). This system maps sensor location and time & date to (among other factors) traﬃc ﬂow speed in miles per hour measured by the given sensor at a particular time and date. We chose the speciﬁc date ...

work page 2017