pith. sign in

arxiv: 2410.23212 · v2 · pith:23RD5JAQnew · submitted 2024-10-30 · 📊 stat.ML · cs.LG· math.ST· stat.TH

Improved convergence rate of kNN graph Laplacians: differentiable self-tuned affinity

Pith reviewed 2026-05-23 18:39 UTC · model grok-4.3

classification 📊 stat.ML cs.LGmath.STstat.TH
keywords kNN graph Laplacianmanifold learningconvergence rateself-tuned affinitypointwise convergencegraph-based methodsdensity estimationkNN estimator
0
0 comments X

The pith

kNN graph Laplacians converge pointwise to manifold operators at rate O(N^{-2/(d+6)}), up to a log factor.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that a general class of kNN graph Laplacians, with affinities that self-tune via local kNN distances, converges in the operator pointwise sense to a limiting manifold operator that depends on the underlying density p. The rate is O(N^{-2/(d+6)}), achieved when the bandwidth epsilon scales as N^{-2/(d+6)} and the neighbor count k scales as N^{6/(d+6)}, by balancing bias and variance through a refined analysis of the kNN estimator. The result requires the kernel function k0 and the tuning function phi to have C^3 regularity along with other technical conditions, and applies to N i.i.d. samples from a density on an unknown d-dimensional manifold. Readers would care because the faster rate improves reliability of graph-based learning methods when data lie on low-dimensional structures inside high-dimensional spaces. The improvement stems directly from better control of the kNN distance estimator error.

Core claim

Under the manifold data setting, where N i.i.d. samples x_i are drawn from a density p on a d-dimensional unknown manifold embedded in high-dimensional Euclidean space, the kNN graph Laplacian with affinity W_ij = epsilon^{-d/2} k_0 ( ||x_i - x_j||^2 / (epsilon phi(hat rho(x_i), hat rho(x_j))^2) ) converges pointwise as an operator to the limiting manifold operator at rate O(N^{-2/(d+6)}), up to a log factor, when k0 and phi are C^3 and epsilon ~ N^{-2/(d+6)}, k ~ N^{6/(d+6)}.

What carries the argument

The self-tuned affinity that inserts the rescaled kNN distance hat rho(x) into the kernel bandwidth via the symmetric bivariate function phi, combined with the refined analysis of the kNN estimator that controls its error to the optimal order.

If this is right

  • The stated rate is obtained at the optimal order that balances theoretical bias and variance errors for the given regularity.
  • The convergence holds for the general affinity form when k0 and phi satisfy C^3 regularity and the listed technical conditions.
  • A refined analysis of the kNN estimator error is developed that supports the improved rate and may apply to other estimators.
  • Numerical experiments on simulated manifold data confirm the predicted convergence behavior.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same self-tuning construction could be tested on other graph operators such as the normalized Laplacian or diffusion maps to check whether similar rate gains appear.
  • In practice the result suggests choosing k and epsilon jointly at the derived scalings rather than fixing one and tuning the other.
  • The approach may extend to settings with boundary or with varying manifold dimension if the local kNN estimator can be adapted accordingly.

Load-bearing premise

The N data points are independent draws from a density p supported on a smooth d-dimensional manifold embedded in Euclidean space.

What would settle it

Numerical computation of the pointwise operator error on a known manifold (such as the unit sphere) for increasing N, with epsilon and k at the stated scalings, showing that the error fails to decay at rate N^{-2/(d+6)} or better would contradict the claimed convergence rate.

Figures

Figures reproduced from arXiv: 2410.23212 by Nan Wu, Xiuyuan Cheng, Yixuan Tan.

Figure 1
Figure 1. Figure 1: The empirical kNN bandwidth function ρˆ defined in (9) (marked with blue dots) computed from N = 2000 samples on a one-dimensional curve where k = 32 (Left) and 64 (Right). Compared with the population bandwidth function ρ¯rk defined as in Definition 2.1 (marked in red solid line) and ρ¯ = p −1/d (marked in orange dashed line). where W is constructed as in (40). Following the same argument in Remark 7.3, w… view at source ↗
Figure 2
Figure 2. Figure 2: The errors of different kernels plotted against values of σ 2 0 , where Err defined in (45) (averaged over 2000 runs) is shown in solid curves, and Err defined in (46) is shown in dashed lines. The fitted slopes on the log-log plots are also shown. (Left) Theoretical fast-rate cases (i) and (ii). (Right) Theoretical slow-rate cases (iii), (iv), and (v). error bounds in Theorem 6.4, we see that the theoreti… view at source ↗
read the original abstract

In graph-based data analysis, $k$-nearest neighbor ($k$NN) graphs are widely used due to their adaptivity to local data densities. Allowing weighted edges in the graph, the kernelized graph affinity provides a more general type of $k$NN graph where the $k$NN distance is used to set the kernel bandwidth adaptively. In this work, we consider a general class of $k$NN graph where the graph affinity is $W_{ij} = \epsilon^{-d/2} k_0 ( \| x_i - x_j \|^2 / \epsilon \phi( \hat \rho(x_i), \hat \rho(x_j) )^2 ) $, with $\hat{\rho}(x)$ being the (rescaled) $k$NN distance at the point $x$, $\phi$ a symmetric bi-variate function, and $k_0$ a non-negative function on $[0,\infty)$. Under the manifold data setting, where $N$ i.i.d. samples $x_i$ are drawn from a density $p$ on a $d$-dimensional unknown manifold embedded in a high dimensional Euclidean space, we prove the operator pointwise convergence of the $k$NN graph Laplacian to the limiting manifold operator (depending on $p$) at the rate of $O(N^{-2/(d+6)})$, up to a log factor, when $k_0$ and $\phi$ have $C^3$ regularity and satisfy other technical conditions. This is obtained when $\epsilon \sim N^{-2/(d+6)}$ and $k \sim N^{6/(d+6)}$, both at the optimal order to balance the theoretical bias and variance errors. Our improved convergence rate is based on a refined analysis of the $k$NN estimator, which can be of independent interest. We validate our theory by numerical experiments on simulated data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript claims to establish pointwise convergence of a general class of kNN graph Laplacians, defined via self-tuned affinities W_ij = ε^{-d/2} k_0 ( ||x_i - x_j||^2 / (ε ϕ(ρ̂(x_i), ρ̂(x_j))^2) ), to the limiting manifold operator (depending on p). Under the manifold data setting with N i.i.d. samples from density p on a d-dimensional manifold embedded in high-dimensional space, it proves operator pointwise convergence at rate O(N^{-2/(d+6)}) up to a log factor, when k_0 and ϕ have C^3 regularity, achieved at the scalings ε ∼ N^{-2/(d+6)} and k ∼ N^{6/(d+6)}. The result relies on a refined analysis of the kNN distance estimator ρ̂.

Significance. If the rate holds after accounting for all error terms, this would constitute a meaningful improvement in convergence guarantees for adaptive (self-tuned) graph Laplacians over prior work on manifold learning, with the refined kNN analysis potentially useful beyond this setting. The numerical validation on simulated data is noted but secondary to the theoretical claim.

major comments (1)
  1. [refined analysis of the kNN estimator ρ̂ and its insertion into W_ij (bias-variance decomposition)] The bias-variance analysis underlying the claimed O(N^{-2/(d+6)}) rate (balancing ε and k scalings) must explicitly control the covariance between ρ̂(x_i) and ρ̂(x_j) for pairs with ||x_i - x_j|| = O(√ε). Substantial neighborhood overlap induces dependence that the current decomposition appears to treat as weak or absent; an unaccounted covariance term larger than N^{-4/(d+6)} would invalidate the pointwise rate. This is load-bearing for the central convergence claim.
minor comments (1)
  1. Clarify the precise form of the logarithmic factor in the main theorem statement (rather than only in the abstract) to allow direct verification of the rate.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on our manuscript. We address the single major comment below.

read point-by-point responses
  1. Referee: [refined analysis of the kNN estimator ρ̂ and its insertion into W_ij (bias-variance decomposition)] The bias-variance analysis underlying the claimed O(N^{-2/(d+6)}) rate (balancing ε and k scalings) must explicitly control the covariance between ρ̂(x_i) and ρ̂(x_j) for pairs with ||x_i - x_j|| = O(√ε). Substantial neighborhood overlap induces dependence that the current decomposition appears to treat as weak or absent; an unaccounted covariance term larger than N^{-4/(d+6)} would invalidate the pointwise rate. This is load-bearing for the central convergence claim.

    Authors: We agree that an explicit bound on Cov(ρ̂(x_i), ρ̂(x_j)) for ||x_i - x_j|| = O(√ε) is essential to close the variance analysis. Our refined kNN estimator analysis (which underpins the improved rate) derives joint concentration bounds that incorporate neighborhood overlap; the C^3 regularity of ϕ and k_0 is used to propagate these into the affinity entries W_ij. Under the stated scalings, the covariance contribution remains strictly smaller than N^{-4/(d+6)} (up to logs). Nevertheless, to eliminate any ambiguity in the current presentation of the decomposition, we will insert a dedicated lemma making the covariance control fully explicit. This constitutes a partial revision. revision: partial

Circularity Check

0 steps flagged

No circularity: mathematical convergence proof under stated assumptions

full rationale

The paper derives the O(N^{-2/(d+6)}) pointwise convergence rate for the kNN graph Laplacian operator via bias-variance analysis of the kNN distance estimator inserted into the affinity kernel, with epsilon and k balanced at the stated orders. All error terms follow from the manifold sampling assumption, C^3 regularity on k0 and phi, and standard concentration bounds; no parameter is fitted to the target rate, no self-citation chain is load-bearing for the central claim, and the result is not obtained by renaming or self-definition. The derivation is self-contained against the external manifold model.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The claim rests on the manifold data assumption and C^3 regularity of k0 and φ; these are domain assumptions standard in the field but not independently evidenced within the paper.

free parameters (2)
  • epsilon scaling = N^{-2/(d+6)}
    Chosen at N^{-2/(d+6)} to balance bias and variance in the convergence analysis.
  • k scaling = N^{6/(d+6)}
    Chosen at N^{6/(d+6)} as the optimal neighbor count order.
axioms (2)
  • domain assumption N i.i.d. samples drawn from density p on d-dimensional manifold embedded in high-dimensional Euclidean space.
    This is the manifold data setting required for the limiting operator to exist.
  • domain assumption k0 and φ have C^3 regularity and satisfy other technical conditions.
    Invoked to obtain the refined kNN estimator analysis and the stated rate.

pith-pipeline@v0.9.0 · 5898 in / 1518 out tokens · 47730 ms · 2026-05-23T18:39:36.173582+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages

  1. [1]

    Über homogene polynome in (L2)

    Stefan Banach. Über homogene polynome in (L2). Studia Mathematica, 7(1):36–44, 1938

  2. [2]

    Semi-supervised learning on manifolds.Machine Learning Journal, 1, 2002

    Mikhail Belkin and Partha Niyogi. Semi-supervised learning on manifolds.Machine Learning Journal, 1, 2002

  3. [3]

    Laplacian eigenmaps for dimensionality reduction and data representation

    Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396, 2003

  4. [4]

    Towards a theoretical foundation for Laplacian-based manifold methods

    Mikhail Belkin and Partha Niyogi. Towards a theoretical foundation for Laplacian-based manifold methods. Journal of Computer and System Sciences, 74(8):1289–1308, 2008

  5. [5]

    Measure-based diffusion grid construction and high-dimensional data discretization.Applied and Computational Harmonic Analysis, 40(2):207–228, 2016

    Amit Bermanis, Moshe Salhov, Guy Wolf, and Amir Averbuch. Measure-based diffusion grid construction and high-dimensional data discretization.Applied and Computational Harmonic Analysis, 40(2):207–228, 2016

  6. [6]

    Graph approxima- tions to geodesics on embedded manifolds

    Mira Bernstein, Vin De Silva, John C Langford, and Joshua B Tenenbaum. Graph approxima- tions to geodesics on embedded manifolds. Technical report, 2000

  7. [7]

    Variable bandwidth diffusion kernels.Applied and Computational Harmonic Analysis, 40(1):68–96, 2016

    Tyrus Berry and John Harlim. Variable bandwidth diffusion kernels.Applied and Computational Harmonic Analysis, 40(1):68–96, 2016

  8. [8]

    Springer, 2015

    Gérard Biau and Luc Devroye.Lectures on the nearest neighbor method, volume 246. Springer, 2015

  9. [9]

    Lipschitz regularity of graph Laplacians on random data clouds.SIAM Journal on Mathematical Analysis, 54(1):1169–1222, 2022

    Jeff Calder, Nicolas Garcia Trillos, and Marta Lewicka. Lipschitz regularity of graph Laplacians on random data clouds.SIAM Journal on Mathematical Analysis, 54(1):1169–1222, 2022

  10. [10]

    Improved spectral convergence rates for graph Laplacians on ε-graphs and k-NN graphs

    Jeff Calder and Nicolas Garcia Trillos. Improved spectral convergence rates for graph Laplacians on ε-graphs and k-NN graphs. Applied and Computational Harmonic Analysis, 60:123–175, 2022

  11. [11]

    Two-sample statistics based on anisotropic kernels.Information and Inference: A Journal of the IMA, 9(3):677–719, 2020

    Xiuyuan Cheng, Alexander Cloninger, and Ronald R Coifman. Two-sample statistics based on anisotropic kernels.Information and Inference: A Journal of the IMA, 9(3):677–719, 2020. 54

  12. [12]

    Convergence of graph Laplacian with kNN self-tuned kernels

    Xiuyuan Cheng and Hau-Tieng Wu. Convergence of graph Laplacian with kNN self-tuned kernels. Information and Inference: A Journal of the IMA, 11(3):889–957, 2022

  13. [13]

    Eigen-convergence of Gaussian kernelized graph Laplacian by manifold heat interpolation.Applied and Computational Harmonic Analysis, 61:132–190, 2022

    Xiuyuan Cheng and Nan Wu. Eigen-convergence of Gaussian kernelized graph Laplacian by manifold heat interpolation.Applied and Computational Harmonic Analysis, 61:132–190, 2022

  14. [14]

    The manifold scattering transform for high-dimensional point cloud data

    Joyce Chew, Holly Steach, Siddharth Viswanath, Hau-Tieng Wu, Matthew Hirn, Deanna Needell, Matthew D Vesely, Smita Krishnaswamy, and Michael Perlmutter. The manifold scattering transform for high-dimensional point cloud data. InTopological, Algebraic and Geometric Learning Workshops 2022, pages 67–78. PMLR, 2022

  15. [15]

    Diffusion maps.Applied and computational harmonic analysis, 21(1):5–30, 2006

    Ronald R Coifman and Stéphane Lafon. Diffusion maps.Applied and computational harmonic analysis, 21(1):5–30, 2006

  16. [16]

    Persistent homology for high- dimensional data based on spectral methods.arXiv preprint arXiv:2311.03087, 2023

    Sebastian Damrich, Philipp Berens, and Dmitry Kobak. Persistent homology for high- dimensional data based on spectral methods.arXiv preprint arXiv:2311.03087, 2023

  17. [17]

    The strong uniform consistency of nearest neighbor density estimates

    Luc P Devroye and Terry J Wagner. The strong uniform consistency of nearest neighbor density estimates. The Annals of Statistics, pages 536–540, 1977

  18. [18]

    Springer, 1992

    Manfredo Perdigao Do Carmo and J Flaherty Francis.Riemannian geometry, volume 6. Springer, 1992

  19. [19]

    Spectral convergence of graph Laplacian and heat kernel reconstruction inL∞ from random samples.Applied and Computational Harmonic Analysis, 55:282–336, 2021

    David B Dunson, Hau-Tieng Wu, and Nan Wu. Spectral convergence of graph Laplacian and heat kernel reconstruction inL∞ from random samples.Applied and Computational Harmonic Analysis, 55:282–336, 2021

  20. [20]

    Graph based gaussian processes on restricted domains

    David B Dunson, Hau-Tieng Wu, and Nan Wu. Graph based gaussian processes on restricted domains. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(2):414–439, 2022

  21. [21]

    Asymptotic behavior of lp-based Laplacian regularization in semi-supervised learning

    Ahmed El Alaoui, Xiang Cheng, Aaditya Ramdas, Martin J Wainwright, and Michael I Jordan. Asymptotic behavior of lp-based Laplacian regularization in semi-supervised learning. In Conference on Learning Theory, pages 879–906. PMLR, 2016

  22. [22]

    Implicit manifold gaussian process regression.Advances in Neural Information Processing Systems, 36, 2024

    Bernardo Fichera, Slava Borovitskiy, Andreas Krause, and Aude G Billard. Implicit manifold gaussian process regression.Advances in Neural Information Processing Systems, 36, 2024

  23. [23]

    Analysis and algorithms forlp-based semi- supervised learning on graphs

    Mauricio Flores, Jeff Calder, and Gilad Lerman. Analysis and algorithms forlp-based semi- supervised learning on graphs. Applied and Computational Harmonic Analysis, 60:77–122, 2022

  24. [24]

    Nicolás García Trillos, Moritz Gerlach, Matthias Hein, and Dejan Slepčev. Error estimates for spectral convergence of the graph Laplacian on random geometric graphs toward the Laplace–Beltrami operator.Foundations of Computational Mathematics, 20(4):827–887, 2020

  25. [25]

    Multi-manifold semi-supervised learning

    Andrew Goldberg, Xiaojin Zhu, Aarti Singh, Zhiting Xu, and Robert Nowak. Multi-manifold semi-supervised learning. InArtificial intelligence and statistics, pages 169–176. PMLR, 2009

  26. [26]

    Strong uniform convergence of Lapla- cians of random geometric and directed kNN graphs on compact manifolds.arXiv preprint arXiv:2212.10287, 2022

    Hélène Guérin, Dinh-Toan Nguyen, and Viet-Chi Tran. Strong uniform convergence of Lapla- cians of random geometric and directed kNN graphs on compact manifolds.arXiv preprint arXiv:2212.10287, 2022. 55

  27. [27]

    Graph Laplacians and their convergence on random neighborhood graphs.Journal of Machine Learning Research, 8(6), 2007

    Matthias Hein, Jean-Yves Audibert, and Ulrike von Luxburg. Graph Laplacians and their convergence on random neighborhood graphs.Journal of Machine Learning Research, 8(6), 2007

  28. [28]

    From graphs to manifolds– weak and strong pointwise consistency of graph Laplacians

    Matthias Hein, Jean-Yves Audibert, and Ulrike Von Luxburg. From graphs to manifolds– weak and strong pointwise consistency of graph Laplacians. InInternational Conference on Computational Learning Theory, pages 470–485. Springer, 2005

  29. [29]

    Spectral analysis of weighted Laplacians arising in data clustering.Applied and Computational Harmonic Analysis, 56:189–249, 2022

    Franca Hoffmann, Bamdad Hosseini, Assad A Oberai, and Andrew M Stuart. Spectral analysis of weighted Laplacians arising in data clustering.Applied and Computational Harmonic Analysis, 56:189–249, 2022

  30. [30]

    LDLE: Low distortion local eigenmaps

    Dhruv Kohli, Alexander Cloninger, and Gal Mishne. LDLE: Low distortion local eigenmaps. Journal of machine learning research, 22(282):1–64, 2021

  31. [31]

    Rate of strong uniform convergence of k-NN density estimates.Journal of statistical planning and inference, 8(2):185–192, 1983

    YP Mack. Rate of strong uniform convergence of k-NN density estimates.Journal of statistical planning and inference, 8(2):185–192, 1983

  32. [32]

    Multivariate k-nearest neighbor density estimates.Journal of Multivariate Analysis, 9(1):1–15, 1979

    YP Mack and Murray Rosenblatt. Multivariate k-nearest neighbor density estimates.Journal of Multivariate Analysis, 9(1):1–15, 1979

  33. [33]

    Optimal construction of k-nearest- neighbor graphs for identifying noisy clusters.Theoretical Computer Science, 410(19):1749–1764, 2009

    Markus Maier, Matthias Hein, and Ulrike Von Luxburg. Optimal construction of k-nearest- neighbor graphs for identifying noisy clusters.Theoretical Computer Science, 410(19):1749–1764, 2009

  34. [34]

    Consistency properties of nearest neighbor density function estimators

    David S Moore and James W Yackel. Consistency properties of nearest neighbor density function estimators. The Annals of Statistics, 5(1):143–154, 1977

  35. [35]

    Minimax-optimal semi-supervised regression on unknown manifolds

    Amit Moscovich, Ariel Jaffe, and Nadler Boaz. Minimax-optimal semi-supervised regression on unknown manifolds. InArtificial Intelligence and Statistics, pages 933–942. PMLR, 2017

  36. [36]

    Graph Laplacians on shared nearest neighbor graphs and graph Laplacians on k-nearest neighbor graphs having the same limit.arXiv preprint arXiv:2302.12399, 2023

    A Martina Neuman. Graph Laplacians on shared nearest neighbor graphs and graph Laplacians on k-nearest neighbor graphs having the same limit.arXiv preprint arXiv:2302.12399, 2023

  37. [37]

    On spectral clustering: Analysis and an algorithm

    Andrew Ng, Michael Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. Advances in neural information processing systems, 14, 2001

  38. [38]

    Springer, 2006

    Peter Petersen.Riemannian geometry, volume 171. Springer, 2006

  39. [39]

    Semi-supervised Fréchet regression.arXiv preprint arXiv:2404.10444, 2024

    Rui Qiu, Zhou Yu, and Zhenhua Lin. Semi-supervised Fréchet regression.arXiv preprint arXiv:2404.10444, 2024

  40. [40]

    Scalability and robustness of spectral embedding: landmark diffusion is all you need.Information and Inference: A Journal of the IMA, 11(4):1527–1595, 2022

    Chao Shen and Hau-Tieng Wu. Scalability and robustness of spectral embedding: landmark diffusion is all you need.Information and Inference: A Journal of the IMA, 11(4):1527–1595, 2022

  41. [41]

    From graph to manifold Laplacian: The convergence rate.Applied and Computa- tional Harmonic Analysis, 21(1):128–134, 2006

    Amit Singer. From graph to manifold Laplacian: The convergence rate.Applied and Computa- tional Harmonic Analysis, 21(1):128–134, 2006

  42. [42]

    Spectral convergence of the connection Laplacian from random samples

    Amit Singer and Hau-Tieng Wu. Spectral convergence of the connection Laplacian from random samples. Information and Inference: A Journal of the IMA, 6(1):58–123, 2017

  43. [43]

    Analysis ofp-Laplacian regularization in semisupervised learning

    Dejan Slepcev and Matthew Thorpe. Analysis ofp-Laplacian regularization in semisupervised learning. SIAM Journal on Mathematical Analysis, 51(3):2085–2120, 2019. 56

  44. [44]

    Diffusion maps for signal processing: A deeper look at manifold-learning techniques based on kernels and graphs.IEEE signal processing magazine, 30(4):75–86, 2013

    Ronen Talmon, Israel Cohen, Sharon Gannot, and Ronald R Coifman. Diffusion maps for signal processing: A deeper look at manifold-learning techniques based on kernels and graphs.IEEE signal processing magazine, 30(4):75–86, 2013

  45. [45]

    Adaptive bayesian regression on data with low intrinsic dimensionality.arXiv preprint arXiv:2407.09286, 2024

    Tao Tang, Nan Wu, Xiuyuan Cheng, and David Dunson. Adaptive bayesian regression on data with low intrinsic dimensionality.arXiv preprint arXiv:2407.09286, 2024

  46. [46]

    A global geometric framework for nonlinear dimensionality reduction.Science, 290(5500):2319–2323, 2000

    Joshua B Tenenbaum, Vin de Silva, and John C Langford. A global geometric framework for nonlinear dimensionality reduction.Science, 290(5500):2319–2323, 2000

  47. [47]

    An analysis of the convergence of graph Laplacians

    Daniel Ting, Ling Huang, and Michael I Jordan. An analysis of the convergence of graph Laplacians. In Proceedings of the 27th International Conference on International Conference on Machine Learning, pages 1079–1086, 2010

  48. [48]

    Fermat distances: Metric approximation, spectral convergence, and clustering algorithms.Journal of Machine Learning Research, 25(176):1–65, 2024

    Nicolás García Trillos, Anna Little, Daniel McKenzie, and James M Murphy. Fermat distances: Metric approximation, spectral convergence, and clustering algorithms.Journal of Machine Learning Research, 25(176):1–65, 2024

  49. [49]

    A tutorial on spectral clustering.Statistics and computing, 17:395–416, 2007

    Ulrike Von Luxburg. A tutorial on spectral clustering.Statistics and computing, 17:395–416, 2007

  50. [50]

    Think globally, fit locally under the manifold setup: Asymptotic analysis of locally linear embedding.The Annals of Statistics, 46(6B):3805 – 3837, 2018

    Hau-Tieng Wu and Nan Wu. Think globally, fit locally under the manifold setup: Asymptotic analysis of locally linear embedding.The Annals of Statistics, 46(6B):3805 – 3837, 2018

  51. [51]

    Self-tuning spectral clustering

    Lihi Zelnik-Manor and Pietro Perona. Self-tuning spectral clustering. Advances in neural information processing systems, 17, 2004

  52. [52]

    Analysis of kNN density estimation.IEEE Transactions on Information Theory, 68(12):7971–7995, 2022

    Puning Zhao and Lifeng Lai. Analysis of kNN density estimation.IEEE Transactions on Information Theory, 68(12):7971–7995, 2022

  53. [53]

    normal coordinate

    Xiaojin Zhu, Zoubin Ghahramani, and John D Lafferty. Semi-supervised learning using gaussian fields and harmonic functions. InProceedings of the 20th International conference on Machine learning (ICML-03), pages 912–919, 2003. Appendix A Algorithm for general k0 in section 7 In this section, we detail how the algorithm described in Section 7 can be adapte...

  54. [54]

    The expectation of the former sum, which is its population counterpart, is ϵ−d/2 Z M k0 ∥x − y∥2 ϵϕ(¯ρr(x), ¯ρr(y))2 p(y)dV (y)

    Analyze the bias error of the independent sums ¯D(x) in the convergence analysis of the random-walk graph LaplaciansLrw and eLrw, where ¯D(x) is defined as in(33) in the proof of Theorem 6.1 and as in(A.161) in the proof of Theorem 6.4, respectively. The expectation of the former sum, which is its population counterpart, is ϵ−d/2 Z M k0 ∥x − y∥2 ϵϕ(¯ρr(x)...

  55. [55]

    Provide upper bounds for the expectations and variances of independent sums of random variables. Specifically, these expectations and variances can be upper bounded by the kernel integrals in the form of ϵ−d/2 Z M k0 ∥x − y∥2 ϵ max{¯ρr(x), ¯ρr(y)}2 ¯ρr(y)ip(y)dV (y), i = −4, −2, −1, . . . ,2. See (96) and (99) in the proof of Proposition 4.1 for an exampl...

  56. [56]

    Expand kernel integrals of the form ϵ−d/2 Z M h1 ∥x − y∥2 ϵϕ(¯ρr(x), ¯ρr(y))2 ; ερ,k ¯ρr(y)ip(y)dV (y), i = −2, −1, 0, 1, where h1(η; ερ,k)is as defined in Lemma E.1. This is used in the proofs to bound the replacement errors when k0 is compactly supported, specifically to bound the independent sums containing h1(η; ερ,k) in the bounds given in Lemmas E.1...

  57. [57]

    Since A ⊂ B2cmax ¯ρr(x)√ϵ(0), we have that A0 = A ∩ ˜A

    This implies that the set ˜A := u ∈ B2cmax ¯ρr(x)√ϵ(0), u(1 − 1 2 ∇ ˜βx(0) · u) ≤ ¯ρr(x)√ϵ (A.68) is isomorphic to ad-dimensional ball. Since A ⊂ B2cmax ¯ρr(x)√ϵ(0), we have that A0 = A ∩ ˜A. (A.69) This proves the claim. For˜A, we have ˜A = {tθ, θ ∈ Sd−1, t ∈ [0, ˜rϵ,x(θ)]}, where ˜rϵ,x(θ) = ¯ρr(x)√ϵ + 1 2 ∇ ˜βx(0) · θ¯ρr(x)2ϵ + O[p](ϵ3/2). By (A.69), A0...