On Kernel Eigen-alignments of KRR: Reconstruction and Generalization

Daniel Krutz; Ernest Fokoue; Richard Lange; Yang Liu

arxiv: 2605.15240 · v1 · pith:63Y3ZC4Xnew · submitted 2026-05-14 · 📊 stat.ML · cs.LG

On Kernel Eigen-alignments of KRR: Reconstruction and Generalization

Yang Liu , Ernest Fokoue , Richard Lange , Daniel Krutz This is my paper

Pith reviewed 2026-05-19 16:20 UTC · model grok-4.3

classification 📊 stat.ML cs.LG

keywords kernel ridge regressioneigen-alignmentgeneralization boundseigenvalue estimationkernel methodsmatrix perturbation

0 comments

The pith

In kernel ridge regression, generalization performance depends on the alignment of kernel eigenvectors with the target vectors rather than just reconstruction accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper connects generalization in kernel methods to how well the eigenvectors and eigenvalues of the kernel matrix can be estimated from finite data. It shows that the prediction is a weighted sum of these eigenvectors, so perturbations in the kernel lead to errors bounded by eigenvalue stability. For high-rank kernels, low reconstruction error is easy to achieve and thus weakly predictive of generalization. Strong performance instead requires better eigenvector alignment with targets, larger eigenvalue magnitudes, or larger gaps between consecutive eigenvalues.

Core claim

The paper establishes that a generalization bound for kernel ridge regression can be derived from the estimation stability of eigenvalues and eigenvectors of the kernel matrix. Since predictions are weighted sums of eigenvectors, the error from kernel perturbations can be analyzed to show that strong generalization requires increasing eigenvector alignment, eigenvalue magnitude, or gaps between consecutive eigenvalues.

What carries the argument

eigen-alignments between the kernel matrix and learning targets, which determine how much the kernel perturbations affect the weighted eigenvector sums in predictions

If this is right

Low reconstruction error alone does not guarantee good generalization for high-rank kernels.
Improving eigenvector alignment enhances the generalization bound.
Larger gaps between eigenvalues increase stability against perturbations.
Increasing eigenvalue magnitudes supports better generalization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This perspective suggests that kernel design should optimize for alignment properties in addition to other criteria.
Similar eigen-alignment analysis could extend to other kernel-based methods beyond ridge regression.
Finite-sample bounds like these may help explain why certain kernels work well on specific datasets but not others.

Load-bearing premise

The analysis assumes that the generalization error arises primarily from using a suboptimal finite training set in a finite-sample setting.

What would settle it

An experiment showing that generalization error remains high even when eigenvector alignment is strong and eigenvalues have large magnitudes and gaps would falsify the bound.

Figures

Figures reproduced from arXiv: 2605.15240 by Daniel Krutz, Ernest Fokoue, Richard Lange, Yang Liu.

**Figure 1.** Figure 1: Learning targets of the task affect the generalization performance. Learning targets aligned with top eigen [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗

**Figure 2.** Figure 2: Random data reconstruction error. We observe the following: 1) high reconstruction errors only appear when [PITH_FULL_IMAGE:figures/full_fig_p013_2.png] view at source ↗

**Figure 3.** Figure 3: Learning targets of the task affect the generalization performance. Learning targets aligned with top eigen [PITH_FULL_IMAGE:figures/full_fig_p014_3.png] view at source ↗

**Figure 4.** Figure 4: With MNIST images data (labeled 4 and 9), we construct targets with top and trailing eigenvectors. On [PITH_FULL_IMAGE:figures/full_fig_p015_4.png] view at source ↗

**Figure 5.** Figure 5: Training error and testing error comparison for different kernel types and eigenvectors as the learning targets. [PITH_FULL_IMAGE:figures/full_fig_p016_5.png] view at source ↗

**Figure 6.** Figure 6: Left: Generalization error increases as the number of eigenvectors that the learning target aligned with [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗

**Figure 7.** Figure 7: Generalization error when treating each eigenvector as an individual synthetic learning target. [PITH_FULL_IMAGE:figures/full_fig_p018_7.png] view at source ↗

read the original abstract

This paper investigates the critical role of eigenalignments between the kernel matrix and learning targets in achieving robust generalization in learning problems. We establish a direct connection between generalization performance in kernel methods and the estimation of eigenvectors and eigenvalues of matrices, offering a more intuitive understanding compared to prior work with minimal assumptions. We also show that, since the prediction task in KRR is essentially the weighted sum of eigenvectors/singular vectors, by analyzing how much error can be caused by perturbations to the kernel matrix, we can then derive a bound on this generalization error using the estimation stability of matrix eigenvalues and eigenvectors. Compared with previous work, our analysis concentrates on finite-sample settings and on the generalization error arising from having a suboptimal finite training set. Our findings reveal that in kernel methods, as long as the kernel is of high rank, the near-zero reconstruction error can be trivially obtained, implying that the reconstruction error will have limited predictive power for generalization. Finally, we establish a generalization bound from an eigenvalues/eigenvectors estimation perspective, showing that strong generalization requires increasing eigenvector alignment, eigenvalue magnitude, or gaps between consecutive eigenvalues.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper tries to bound KRR generalization via finite-sample eigenvector alignment and perturbation stability, but the perturbation model and gap conditions stay too vague for the main claim to land cleanly.

read the letter

The main thing to know is that this work wants to explain why some kernels generalize better by linking finite-sample eigenvector alignment to a perturbation argument on the kernel matrix, then deriving a bound that says strong generalization needs better alignment, bigger eigenvalues, or larger gaps. The finite-sample focus and the point about high-rank kernels making reconstruction error trivial are the clearest contributions here.

Referee Report

1 major / 1 minor

Summary. The paper analyzes kernel ridge regression (KRR) by linking generalization performance to eigen-alignments between the kernel matrix and learning targets. It derives a generalization bound by modeling the finite training set as inducing a perturbation on the kernel matrix, then bounding the error in the KRR predictor (expressed as a weighted sum of eigenvectors) via estimation stability of eigenvalues and eigenvectors. Key claims include that near-zero reconstruction error is trivial for high-rank kernels (limiting its predictive power for generalization) and that strong generalization requires increasing eigenvector alignment, eigenvalue magnitude, or gaps between consecutive eigenvalues, all under minimal assumptions in finite-sample settings.

Significance. If the perturbation analysis can be made rigorous with explicit conditions, the work would offer an intuitive finite-sample perspective on generalization in kernel methods that complements existing stability and Rademacher-based bounds. It could inform kernel selection or regularization by emphasizing eigenstructure, though the current lack of explicit perturbation form limits immediate applicability.

major comments (1)

[perturbation analysis and generalization bound derivation] The central derivation of the generalization bound (described in the abstract as arising from perturbation stability of the kernel matrix and detailed in the main analysis) applies eigenvector perturbation results such as Davis-Kahan bounds without specifying the exact perturbation model (additive noise on K, sampling error from the integral operator, or multiplicative distortion) or verifying that eigenvalue gap conditions hold uniformly for the kernels under consideration. This is load-bearing for the claim that strong generalization requires larger alignment, larger eigenvalues, or larger gaps, as the bound does not necessarily follow from the finite-sample premise without these elements.

minor comments (1)

[abstract and findings paragraph] The abstract states that 'near-zero reconstruction error can be trivially obtained' for high-rank kernels; clarify whether this holds only asymptotically or under specific finite-sample conditions, and add a brief remark on how this interacts with the proposed eigen-alignment requirements.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. We address the major comment on the perturbation analysis and generalization bound derivation below, and we plan to revise the paper to improve clarity on these points.

read point-by-point responses

Referee: The central derivation of the generalization bound (described in the abstract as arising from perturbation stability of the kernel matrix and detailed in the main analysis) applies eigenvector perturbation results such as Davis-Kahan bounds without specifying the exact perturbation model (additive noise on K, sampling error from the integral operator, or multiplicative distortion) or verifying that eigenvalue gap conditions hold uniformly for the kernels under consideration. This is load-bearing for the claim that strong generalization requires larger alignment, larger eigenvalues, or larger gaps, as the bound does not necessarily follow from the finite-sample premise without these elements.

Authors: We appreciate the referee highlighting the need for greater precision here. The perturbation we analyze is the additive difference between the empirical kernel matrix (computed from the finite training set) and the population kernel matrix induced by the integral operator; this is the natural sampling perturbation arising in the finite-sample setting. We will revise the manuscript to state this model explicitly, including a brief reference to standard concentration results for empirical kernel matrices. For the eigenvalue gap conditions, the Davis-Kahan bounds we invoke are applied conditionally on a positive gap between relevant eigenvalues, which is a standard technical assumption in this literature. We agree that discussing the uniformity of this condition strengthens the presentation. In revision we will add a short discussion noting that the bound holds whenever the gap condition is met and that regularization or kernel choice can be used to ensure suitable gaps for many common kernels (e.g., Gaussian). These changes make the assumptions transparent and support the claim that strong generalization requires sufficient alignment, eigenvalue magnitude, or gaps, without altering the core finite-sample perspective of the work. revision: yes

Circularity Check

0 steps flagged

Derivation applies perturbation analysis to KRR predictor without reducing to self-definition or fitted inputs

full rationale

The paper derives its generalization bound by expressing the KRR predictor as a weighted sum of eigenvectors and analyzing error from finite-sample perturbations to the kernel matrix, then invoking estimation stability results for eigenvalues and eigenvectors. This produces an explicit dependence of the bound on eigenvector alignment, eigenvalue magnitudes, and gaps, which is a direct consequence of the perturbation expansion rather than a tautological re-expression of the inputs. The separate observation that reconstruction error is near-zero for high-rank kernels is used only to downplay its predictive value for generalization and does not feed back into the bound derivation. No self-citations, ansatzes, or fitted parameters are shown to be load-bearing; the quantities in the bound are defined independently of the final generalization statement itself. The analysis therefore remains self-contained under the finite-sample premise stated in the abstract.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on abstract alone: relies on standard matrix perturbation results for eigenvectors and eigenvalues; no free parameters, invented entities, or ad-hoc axioms explicitly named.

axioms (1)

standard math Standard results on stability of matrix eigenvalues and eigenvectors under perturbations hold in the finite-sample kernel setting.
Invoked to derive the generalization bound from kernel matrix perturbations.

pith-pipeline@v0.9.0 · 5724 in / 1122 out tokens · 36126 ms · 2026-05-19T16:20:04.582258+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

300 extracted references · 300 canonical work pages

[1]

Functional Analysis

Rudin, Walter. Functional Analysis

work page
[2]

On regularization algorithms in learning theory

Bauer, Frank and Pereverzev, Sergei and Rosasco, Lorenzo. On regularization algorithms in learning theory. Journal of complexity

work page
[3]

Approximation theory and harmonic analysis on spheres and balls

Dai, Feng and Xu, Yuan. Approximation theory and harmonic analysis on spheres and balls

work page
[4]

Aspects of Harmonic Analysis and Representation Theory

Gallier, Jean and Quaintance, Jocelyn. Aspects of Harmonic Analysis and Representation Theory

work page
[5]

Replica Method for the Machine Learning Theorist

Blake Bordelon, Haozhe Shan, Abdul Canatar, Boaz Barak, Cengiz Pehlevan. Replica Method for the Machine Learning Theorist

work page
[6]

Categorization based on similarity and features: the reproducing kernel Banach space (RKBS) approach

Zhang, Jun and Zhang, Haizhang. Categorization based on similarity and features: the reproducing kernel Banach space (RKBS) approach. New Handbook of Mathematical Psychology

work page
[7]

Fluctuations of Spiked Random Matrix Models and Failure Diagnosis in Sensor Networks

Couillet, Romain and Hachem, Walid. Fluctuations of Spiked Random Matrix Models and Failure Diagnosis in Sensor Networks. IEEE Transactions on Information Theory

work page
[8]

A Experimental details

Datasets, Downstream. A Experimental details

work page
[9]

Incremental kernel principal component analysis

Chin, Tat-Jun and Suter, David. Incremental kernel principal component analysis. IEEE transactions on image processing: a publication of the IEEE Signal Processing Society

work page
[10]

Scattered Data Approximation

Wendland, Holger. Scattered Data Approximation

work page
[11]

A support vector machine with a hybrid kernel and minimal vapnik-chervonenkis dimension

Tan, Ying and Wang, Jun. A support vector machine with a hybrid kernel and minimal vapnik-chervonenkis dimension. IEEE transactions on knowledge and data engineering

work page
[12]

Reproducing kernel Hilbert spaces on manifolds: Sobolev and Diffusion spaces

De Vito, Ernesto and Mücke, Nicole and Rosasco, Lorenzo. Reproducing kernel Hilbert spaces on manifolds: Sobolev and Diffusion spaces. arXiv [math.FA]

work page
[13]

Reproducing kernels of generalized Sobolev spaces via a Green function approach with distributional operators

Fasshauer, Gregory E and Ye, Qi. Reproducing kernels of generalized Sobolev spaces via a Green function approach with distributional operators. Numerische mathematik

work page
[14]

Adaptive kernel methods using the balancing principle

De Vito, E and Pereverzyev, S and Rosasco, L. Adaptive kernel methods using the balancing principle. Foundations of computational mathematics

work page
[15]

A Mathematical Introduction to Compressive Sensing

Foucart, Simon and Rauhut, Holger. A Mathematical Introduction to Compressive Sensing

work page
[16]

Matrix Perturbation Theory

Stewart, G W and Sun, Ji-Guang. Matrix Perturbation Theory

work page
[17]

Unshuffling data for improved generalization in visual question answering

Teney, Damien and Abbasnejad, Ehsan and van den Hengel, Anton. Unshuffling data for improved generalization in visual question answering. 2021 IEEE/CVF International Conference on Computer Vision (ICCV)

work page 2021
[18]

Dimension lower bounds for linear approaches to function approximation

Hsu, Daniel J. Dimension lower bounds for linear approaches to function approximation

work page
[19]

Measure Theory

Cohn, Donald L. Measure Theory

work page
[20]

for any p ∈ [1, ∞), n

b‖, ‖a and Max, x :=. for any p ∈ [1, ∞), n

work page
[21]

Algebraic geometry and statistical learning theory

Watanabe, Sumio. Algebraic geometry and statistical learning theory

work page
[22]

Measuring and testing homogeneity of distributions by characteristic distance

Li, Xu and Hu, Wenjuan and Zhang, Baoxue. Measuring and testing homogeneity of distributions by characteristic distance. Statistical papers (Berlin, Germany)

work page
[23]

Umbral nature of the Poisson random variables

Di Nardo, E and Senato, D. Umbral nature of the Poisson random variables. arXiv [math.PR]

work page
[24]

On the notion of neighborhood system

Mishachev, Nikolai and Shmyrin, Anatoly. On the notion of neighborhood system. 2019 1st International Conference on Control Systems, Mathematical Modelling, Automation and Energy Efficiency (SUMMA)

work page 2019
[25]

Functional Analysis for Probability and Stochastic Processes

Bobrowski, A. Functional Analysis for Probability and Stochastic Processes

work page
[26]

Stability and Generalization

Bousquet, O and Elisseeff, A. Stability and Generalization. Journal of machine learning research: JMLR

work page
[27]

Capacity of reproducing kernel spaces in learning theory

Zhou, Ding-Xuan. Capacity of reproducing kernel spaces in learning theory. IEEE transactions on information theory

work page
[28]

Approximation in learning theory

Temlyakov, V N. Approximation in learning theory. Constructive approximation

work page
[29]

Sobolev Norm Learning Rates for Regularized Least-Squares Algorithms

Fischer, Simon and Steinwart, Ingo. Sobolev Norm Learning Rates for Regularized Least-Squares Algorithms. Journal of Machine Learning Research

work page
[30]

Universal algorithms for learning theory part I : Piecewise constant functions

Binev, P and Cohen, A and Dahmen, W and DeVore, R and Temlyakov, V. Universal algorithms for learning theory part I : Piecewise constant functions. Journal of machine learning research: JMLR

work page
[31]

An introduction to Sobolev spaces

Piskin, Erhan and Okutmustur, Baver. An introduction to Sobolev spaces

work page
[32]

Entropy-based subsampling methods for big data

Sui, Qun and Ghosh, Sujit K. Entropy-based subsampling methods for big data. Journal of statistical theory and practice

work page
[33]

A theory of usable information under computational constraints

Xu, Yilun and Zhao, Shengjia and Song, Jiaming and Stewart, Russell and Ermon, Stefano. A theory of usable information under computational constraints. arXiv [cs.LG]

work page
[34]

Uniform approximation of functions with random bases

Rahimi, Ali and Recht, Benjamin. Uniform approximation of functions with random bases. 2008 46th Annual Allerton Conference on Communication, Control, and Computing

work page 2008
[35]

Tight frame expansions of multiscale reproducing kernels in Sobolev spaces

Opfer, Roland. Tight frame expansions of multiscale reproducing kernels in Sobolev spaces. Applied and computational harmonic analysis

work page
[36]

A theory of representation learning gives a deep generalisation of kernel methods

Yang, Adam X and Robeyns, Maxime and Milsom, Edward and Anson, Ben and Schoots, Nandi and Aitchison, Laurence. A theory of representation learning gives a deep generalisation of kernel methods. Proceedings of the 40th International Conference on Machine Learning

work page
[37]

Low rank spectral network alignment

Nassar, Huda and Veldt, Nate and Mohammadi, Shahin and Grama, Ananth and Gleich, David F. Low rank spectral network alignment. Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18

work page 2018
[38]

Learning the Kernel Function via Regularization

Micchelli, Charles A and Pontil, Massimiliano. Learning the Kernel Function via Regularization. Journal of Machine Learning Research

work page
[39]

HoMM: Higher-order Moment Matching for unsupervised domain adaptation

Chen, Chao and Fu, Zhihang and Chen, Zhihong and Jin, Sheng and Cheng, Zhaowei and Jin, Xinyu and Hua, Xian-Sheng. HoMM: Higher-order Moment Matching for unsupervised domain adaptation. Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence

work page
[40]

Foundations of Machine Learning

Mohri, Mehryar and Rostamizadeh, Afshin and Talwalkar, Ameet. Foundations of Machine Learning

work page
[41]

Boosting a weak learning algorithm by majority

Freund, Y. Boosting a weak learning algorithm by majority. Information and computation

work page
[42]

Multivariate Function and Operator Estimation, Based on Smoothing Splines and Reproducing Kernels

Wahba, Grace. Multivariate Function and Operator Estimation, Based on Smoothing Splines and Reproducing Kernels. Nonlinear Modeling and Forecasting

work page
[43]

Smoothing Splines: Methods and Applications

Wang, Yuedong. Smoothing Splines: Methods and Applications

work page
[44]

Local Rademacher Complexity Bounds based on Covering Numbers

Lei, Yunwen and Ding, Lixin and Bi, Yingzhou. Local Rademacher Complexity Bounds based on Covering Numbers. arXiv [cs.AI]

work page
[45]

Concentration estimates for learning with ℓ1-regularizer and data dependent hypothesis spaces

Shi, Lei and Feng, Yun-Long and Zhou, Ding-Xuan. Concentration estimates for learning with ℓ1-regularizer and data dependent hypothesis spaces. Applied and computational harmonic analysis

work page
[46]

Learning with sample dependent hypothesis spaces

Wu, Qiang and Zhou, Ding-Xuan. Learning with sample dependent hypothesis spaces. Computers & mathematics with applications (Oxford, England: 1987)

work page 1987
[47]

Machine learning with data dependent hypothesis classes

Cannon, A and Ettinger, J M and Hush, D and Scovel, C. Machine learning with data dependent hypothesis classes. The Journal of Machine Learning Research

work page
[48]

Regularized modal regression with data-dependent hypothesis spaces

Wang, Yingjie and Chen, Hong and Song, Biqin and Li, Han. Regularized modal regression with data-dependent hypothesis spaces. International Journal of Wavelets, Multiresolution and Information Processing

work page
[49]

The eigenlearning framework: A conservation law perspective on kernel regression and wide neural networks

Simon, James B and Dickens, Madeline and Karkada, Dhruva and DeWeese, Michael R. The eigenlearning framework: A conservation law perspective on kernel regression and wide neural networks. arXiv [cs.LG]

work page
[50]

Frame Theory in Data Science

Zhang, Zhihua and Jorgensen, Palle E T. Frame Theory in Data Science

work page
[51]

Kernel Alignment Risk Estimator: Risk prediction from training data

Jacot, Arthur and cSimcsek, Berfin and Spadaro, Francesco and Hongler, Clément and Gabriel, Franck. Kernel Alignment Risk Estimator: Risk prediction from training data. Neural Information Processing Systems

work page
[52]

Group invariant scattering

Mallat, Stéphane. Group invariant scattering. Communications on pure and applied mathematics

work page
[53]

Towards understanding the spectral bias of deep learning

Cao, Yuan and Fang, Zhiying and Wu, Yue and Zhou, Ding-Xuan and Gu, Quanquan. Towards understanding the spectral bias of deep learning. arXiv [cs.LG]

work page
[54]

SimpleMKKM: Simple Multiple Kernel K-means

Liu, Xinwang and Zhu, En and Liu, Jiyuan. SimpleMKKM: Simple Multiple Kernel K-means. arXiv [cs.LG]

work page
[55]

Deep unsupervised domain adaptation: A review of recent advances and perspectives

Liu, Xiaofeng and Yoo, Chaehwa and Xing, Fangxu and Oh, Hyejin and El Fakhri, Georges and Kang, Je-Won and Woo, Jonghye. Deep unsupervised domain adaptation: A review of recent advances and perspectives. APSIPA transactions on signal and information processing

work page
[56]

Optimal Transport for Domain Adaptation

Courty, Nicolas and Flamary, Rémi and Tuia, Devis and Rakotomamonjy, Alain. Optimal Transport for Domain Adaptation. arXiv [cs.LG]

work page
[57]

Ahlberg, E N Nilson and (Eds.), J L Walsh

J.H. Ahlberg, E N Nilson and (Eds.), J L Walsh. The Theory of Splines and Their Applications

work page
[58]

Introduction to Fourier Analysis and Wavelets

Pinsky, Mark A. Introduction to Fourier Analysis and Wavelets

work page
[59]

Data Assimilation in Reduced Modeling

Binev, Peter and Cohen, Albert and Dahmen, Wolfgang and DeVore, Ronald and Petrova, Guergana and Wojtaszczyk, Przemyslaw. Data Assimilation in Reduced Modeling. arXiv [math.NA]

work page
[60]

Instances of computational optimal recovery: Refined approximability models

Foucart, Simon. Instances of computational optimal recovery: Refined approximability models. Journal of complexity

work page
[61]

Overparametrized regression: Ridgeless interpolation

Tibshirani, Ryan. Overparametrized regression: Ridgeless interpolation. Lecture Notes from Course of Advanced Topics in Statistical Learning, Spring

work page
[62]

Introduction to Shannon sampling and interpolation theory

Marks, II, Robert J. Introduction to Shannon sampling and interpolation theory

work page
[63]

The Joys of Haar Measure

Diestel, Joe and Spalsbury, Angela. The Joys of Haar Measure

work page
[64]

Identifiability Conditions for Domain Adaptation

Gulrajani, Ishaan and Hashimoto, Tatsunori. Identifiability Conditions for Domain Adaptation. Proceedings of the 39th International Conference on Machine Learning

work page
[65]

A Generalized Representer Theorem

Schölkopf, Bernhard and Herbrich, Ralf and Smola, Alex J. A Generalized Representer Theorem. Lecture Notes in Computer Science

work page
[66]

Modern umbral calculus: An elementary introduction with applications to linear interpolation and operator approximation theory

Costabile, Francesco Aldo. Modern umbral calculus: An elementary introduction with applications to linear interpolation and operator approximation theory

work page
[67]

Kernel properties and representations of boundary integral operators

Schwab, C and Wendland, W L. Kernel properties and representations of boundary integral operators. Mathematische Nachrichten

work page
[68]

Some properties of regularized kernel methods

Vito, E and Rosasco, L and Caponnetto, A and Piana, M and Verri, A. Some properties of regularized kernel methods. Journal of machine learning research: JMLR

work page
[69]

Kwok, Ivor W Tsang

James T. Kwok, Ivor W Tsang. Learning with Idealized Kernels

work page
[70]

Improving support vector machine classifiers by modifying kernel functions

Amari, S and Wu, S. Improving support vector machine classifiers by modifying kernel functions. Neural networks: the official journal of the International Neural Network Society

work page
[71]

Learning low-rank kernel matrices

Kulis, Brian and Sustik, Mátyás and Dhillon, Inderjit. Learning low-rank kernel matrices. Proceedings of the 23rd international conference on Machine learning - ICML '06

work page
[72]

Efficient SVM training using low-rank kernel representations

Fine, Shai and Scheinberg, K. Efficient SVM training using low-rank kernel representations. Journal of machine learning research: JMLR

work page
[73]

Approximating the Covariance Matrix with Low-rank Perturbations

Magdon-Ismail, Malik and Purnell, Jonathan T. Approximating the Covariance Matrix with Low-rank Perturbations

work page
[74]

Kristiadi-Being Bayesian, Even Just a Bit, Fixe\_supplement\_1

work page
[75]

Analysis of Representations for Domain Adaptation

Ben-David, Shai and Blitzer, John and Crammer, Koby and Pereira, Fernando. Analysis of Representations for Domain Adaptation. Advances in Neural Information Processing Systems

work page
[76]

On the difficulty of approximately maximizing agreements

Ben-David, Shai and Eiron, Nadav and Long, Philip M. On the difficulty of approximately maximizing agreements. Journal of computer and system sciences

work page
[77]

Machine learning and the physical sciences

Carleo, Giuseppe and Cirac, Ignacio and Cranmer, Kyle and Daudet, Laurent and Schuld, Maria and Tishby, Naftali and Vogt-Maranto, Leslie and Zdeborová, Lenka. Machine learning and the physical sciences. Reviews of modern physics

work page
[78]

Mathematics Stack Exchange

Eigenvalues for a product of matrices. Mathematics Stack Exchange

work page
[79]

Eigenvalue inequalities for matrix product

Zhang, F and Zhang, Q. Eigenvalue inequalities for matrix product. IEEE transactions on automatic control

work page
[80]

Extreme eigenvalue distributions of some complex correlated non-central Wishart and gamma-Wishart random matrices

Dharmawansa, Prathapasinghe and McKay, Matthew R. Extreme eigenvalue distributions of some complex correlated non-central Wishart and gamma-Wishart random matrices. Journal of multivariate analysis

work page

Showing first 80 references.

[1] [1]

Functional Analysis

Rudin, Walter. Functional Analysis

work page

[2] [2]

On regularization algorithms in learning theory

Bauer, Frank and Pereverzev, Sergei and Rosasco, Lorenzo. On regularization algorithms in learning theory. Journal of complexity

work page

[3] [3]

Approximation theory and harmonic analysis on spheres and balls

Dai, Feng and Xu, Yuan. Approximation theory and harmonic analysis on spheres and balls

work page

[4] [4]

Aspects of Harmonic Analysis and Representation Theory

Gallier, Jean and Quaintance, Jocelyn. Aspects of Harmonic Analysis and Representation Theory

work page

[5] [5]

Replica Method for the Machine Learning Theorist

Blake Bordelon, Haozhe Shan, Abdul Canatar, Boaz Barak, Cengiz Pehlevan. Replica Method for the Machine Learning Theorist

work page

[6] [6]

Categorization based on similarity and features: the reproducing kernel Banach space (RKBS) approach

Zhang, Jun and Zhang, Haizhang. Categorization based on similarity and features: the reproducing kernel Banach space (RKBS) approach. New Handbook of Mathematical Psychology

work page

[7] [7]

Fluctuations of Spiked Random Matrix Models and Failure Diagnosis in Sensor Networks

Couillet, Romain and Hachem, Walid. Fluctuations of Spiked Random Matrix Models and Failure Diagnosis in Sensor Networks. IEEE Transactions on Information Theory

work page

[8] [8]

A Experimental details

Datasets, Downstream. A Experimental details

work page

[9] [9]

Incremental kernel principal component analysis

Chin, Tat-Jun and Suter, David. Incremental kernel principal component analysis. IEEE transactions on image processing: a publication of the IEEE Signal Processing Society

work page

[10] [10]

Scattered Data Approximation

Wendland, Holger. Scattered Data Approximation

work page

[11] [11]

A support vector machine with a hybrid kernel and minimal vapnik-chervonenkis dimension

Tan, Ying and Wang, Jun. A support vector machine with a hybrid kernel and minimal vapnik-chervonenkis dimension. IEEE transactions on knowledge and data engineering

work page

[12] [12]

Reproducing kernel Hilbert spaces on manifolds: Sobolev and Diffusion spaces

De Vito, Ernesto and Mücke, Nicole and Rosasco, Lorenzo. Reproducing kernel Hilbert spaces on manifolds: Sobolev and Diffusion spaces. arXiv [math.FA]

work page

[13] [13]

Reproducing kernels of generalized Sobolev spaces via a Green function approach with distributional operators

Fasshauer, Gregory E and Ye, Qi. Reproducing kernels of generalized Sobolev spaces via a Green function approach with distributional operators. Numerische mathematik

work page

[14] [14]

Adaptive kernel methods using the balancing principle

De Vito, E and Pereverzyev, S and Rosasco, L. Adaptive kernel methods using the balancing principle. Foundations of computational mathematics

work page

[15] [15]

A Mathematical Introduction to Compressive Sensing

Foucart, Simon and Rauhut, Holger. A Mathematical Introduction to Compressive Sensing

work page

[16] [16]

Matrix Perturbation Theory

Stewart, G W and Sun, Ji-Guang. Matrix Perturbation Theory

work page

[17] [17]

Unshuffling data for improved generalization in visual question answering

Teney, Damien and Abbasnejad, Ehsan and van den Hengel, Anton. Unshuffling data for improved generalization in visual question answering. 2021 IEEE/CVF International Conference on Computer Vision (ICCV)

work page 2021

[18] [18]

Dimension lower bounds for linear approaches to function approximation

Hsu, Daniel J. Dimension lower bounds for linear approaches to function approximation

work page

[19] [19]

Measure Theory

Cohn, Donald L. Measure Theory

work page

[20] [20]

for any p ∈ [1, ∞), n

b‖, ‖a and Max, x :=. for any p ∈ [1, ∞), n

work page

[21] [21]

Algebraic geometry and statistical learning theory

Watanabe, Sumio. Algebraic geometry and statistical learning theory

work page

[22] [22]

Measuring and testing homogeneity of distributions by characteristic distance

Li, Xu and Hu, Wenjuan and Zhang, Baoxue. Measuring and testing homogeneity of distributions by characteristic distance. Statistical papers (Berlin, Germany)

work page

[23] [23]

Umbral nature of the Poisson random variables

Di Nardo, E and Senato, D. Umbral nature of the Poisson random variables. arXiv [math.PR]

work page

[24] [24]

On the notion of neighborhood system

Mishachev, Nikolai and Shmyrin, Anatoly. On the notion of neighborhood system. 2019 1st International Conference on Control Systems, Mathematical Modelling, Automation and Energy Efficiency (SUMMA)

work page 2019

[25] [25]

Functional Analysis for Probability and Stochastic Processes

Bobrowski, A. Functional Analysis for Probability and Stochastic Processes

work page

[26] [26]

Stability and Generalization

Bousquet, O and Elisseeff, A. Stability and Generalization. Journal of machine learning research: JMLR

work page

[27] [27]

Capacity of reproducing kernel spaces in learning theory

Zhou, Ding-Xuan. Capacity of reproducing kernel spaces in learning theory. IEEE transactions on information theory

work page

[28] [28]

Approximation in learning theory

Temlyakov, V N. Approximation in learning theory. Constructive approximation

work page

[29] [29]

Sobolev Norm Learning Rates for Regularized Least-Squares Algorithms

Fischer, Simon and Steinwart, Ingo. Sobolev Norm Learning Rates for Regularized Least-Squares Algorithms. Journal of Machine Learning Research

work page

[30] [30]

Universal algorithms for learning theory part I : Piecewise constant functions

Binev, P and Cohen, A and Dahmen, W and DeVore, R and Temlyakov, V. Universal algorithms for learning theory part I : Piecewise constant functions. Journal of machine learning research: JMLR

work page

[31] [31]

An introduction to Sobolev spaces

Piskin, Erhan and Okutmustur, Baver. An introduction to Sobolev spaces

work page

[32] [32]

Entropy-based subsampling methods for big data

Sui, Qun and Ghosh, Sujit K. Entropy-based subsampling methods for big data. Journal of statistical theory and practice

work page

[33] [33]

A theory of usable information under computational constraints

Xu, Yilun and Zhao, Shengjia and Song, Jiaming and Stewart, Russell and Ermon, Stefano. A theory of usable information under computational constraints. arXiv [cs.LG]

work page

[34] [34]

Uniform approximation of functions with random bases

Rahimi, Ali and Recht, Benjamin. Uniform approximation of functions with random bases. 2008 46th Annual Allerton Conference on Communication, Control, and Computing

work page 2008

[35] [35]

Tight frame expansions of multiscale reproducing kernels in Sobolev spaces

Opfer, Roland. Tight frame expansions of multiscale reproducing kernels in Sobolev spaces. Applied and computational harmonic analysis

work page

[36] [36]

A theory of representation learning gives a deep generalisation of kernel methods

Yang, Adam X and Robeyns, Maxime and Milsom, Edward and Anson, Ben and Schoots, Nandi and Aitchison, Laurence. A theory of representation learning gives a deep generalisation of kernel methods. Proceedings of the 40th International Conference on Machine Learning

work page

[37] [37]

Low rank spectral network alignment

Nassar, Huda and Veldt, Nate and Mohammadi, Shahin and Grama, Ananth and Gleich, David F. Low rank spectral network alignment. Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18

work page 2018

[38] [38]

Learning the Kernel Function via Regularization

Micchelli, Charles A and Pontil, Massimiliano. Learning the Kernel Function via Regularization. Journal of Machine Learning Research

work page

[39] [39]

HoMM: Higher-order Moment Matching for unsupervised domain adaptation

Chen, Chao and Fu, Zhihang and Chen, Zhihong and Jin, Sheng and Cheng, Zhaowei and Jin, Xinyu and Hua, Xian-Sheng. HoMM: Higher-order Moment Matching for unsupervised domain adaptation. Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence

work page

[40] [40]

Foundations of Machine Learning

Mohri, Mehryar and Rostamizadeh, Afshin and Talwalkar, Ameet. Foundations of Machine Learning

work page

[41] [41]

Boosting a weak learning algorithm by majority

Freund, Y. Boosting a weak learning algorithm by majority. Information and computation

work page

[42] [42]

Multivariate Function and Operator Estimation, Based on Smoothing Splines and Reproducing Kernels

Wahba, Grace. Multivariate Function and Operator Estimation, Based on Smoothing Splines and Reproducing Kernels. Nonlinear Modeling and Forecasting

work page

[43] [43]

Smoothing Splines: Methods and Applications

Wang, Yuedong. Smoothing Splines: Methods and Applications

work page

[44] [44]

Local Rademacher Complexity Bounds based on Covering Numbers

Lei, Yunwen and Ding, Lixin and Bi, Yingzhou. Local Rademacher Complexity Bounds based on Covering Numbers. arXiv [cs.AI]

work page

[45] [45]

Concentration estimates for learning with ℓ1-regularizer and data dependent hypothesis spaces

Shi, Lei and Feng, Yun-Long and Zhou, Ding-Xuan. Concentration estimates for learning with ℓ1-regularizer and data dependent hypothesis spaces. Applied and computational harmonic analysis

work page

[46] [46]

Learning with sample dependent hypothesis spaces

Wu, Qiang and Zhou, Ding-Xuan. Learning with sample dependent hypothesis spaces. Computers & mathematics with applications (Oxford, England: 1987)

work page 1987

[47] [47]

Machine learning with data dependent hypothesis classes

Cannon, A and Ettinger, J M and Hush, D and Scovel, C. Machine learning with data dependent hypothesis classes. The Journal of Machine Learning Research

work page

[48] [48]

Regularized modal regression with data-dependent hypothesis spaces

Wang, Yingjie and Chen, Hong and Song, Biqin and Li, Han. Regularized modal regression with data-dependent hypothesis spaces. International Journal of Wavelets, Multiresolution and Information Processing

work page

[49] [49]

The eigenlearning framework: A conservation law perspective on kernel regression and wide neural networks

Simon, James B and Dickens, Madeline and Karkada, Dhruva and DeWeese, Michael R. The eigenlearning framework: A conservation law perspective on kernel regression and wide neural networks. arXiv [cs.LG]

work page

[50] [50]

Frame Theory in Data Science

Zhang, Zhihua and Jorgensen, Palle E T. Frame Theory in Data Science

work page

[51] [51]

Kernel Alignment Risk Estimator: Risk prediction from training data

Jacot, Arthur and cSimcsek, Berfin and Spadaro, Francesco and Hongler, Clément and Gabriel, Franck. Kernel Alignment Risk Estimator: Risk prediction from training data. Neural Information Processing Systems

work page

[52] [52]

Group invariant scattering

Mallat, Stéphane. Group invariant scattering. Communications on pure and applied mathematics

work page

[53] [53]

Towards understanding the spectral bias of deep learning

Cao, Yuan and Fang, Zhiying and Wu, Yue and Zhou, Ding-Xuan and Gu, Quanquan. Towards understanding the spectral bias of deep learning. arXiv [cs.LG]

work page

[54] [54]

SimpleMKKM: Simple Multiple Kernel K-means

Liu, Xinwang and Zhu, En and Liu, Jiyuan. SimpleMKKM: Simple Multiple Kernel K-means. arXiv [cs.LG]

work page

[55] [55]

Deep unsupervised domain adaptation: A review of recent advances and perspectives

Liu, Xiaofeng and Yoo, Chaehwa and Xing, Fangxu and Oh, Hyejin and El Fakhri, Georges and Kang, Je-Won and Woo, Jonghye. Deep unsupervised domain adaptation: A review of recent advances and perspectives. APSIPA transactions on signal and information processing

work page

[56] [56]

Optimal Transport for Domain Adaptation

Courty, Nicolas and Flamary, Rémi and Tuia, Devis and Rakotomamonjy, Alain. Optimal Transport for Domain Adaptation. arXiv [cs.LG]

work page

[57] [57]

Ahlberg, E N Nilson and (Eds.), J L Walsh

J.H. Ahlberg, E N Nilson and (Eds.), J L Walsh. The Theory of Splines and Their Applications

work page

[58] [58]

Introduction to Fourier Analysis and Wavelets

Pinsky, Mark A. Introduction to Fourier Analysis and Wavelets

work page

[59] [59]

Data Assimilation in Reduced Modeling

Binev, Peter and Cohen, Albert and Dahmen, Wolfgang and DeVore, Ronald and Petrova, Guergana and Wojtaszczyk, Przemyslaw. Data Assimilation in Reduced Modeling. arXiv [math.NA]

work page

[60] [60]

Instances of computational optimal recovery: Refined approximability models

Foucart, Simon. Instances of computational optimal recovery: Refined approximability models. Journal of complexity

work page

[61] [61]

Overparametrized regression: Ridgeless interpolation

Tibshirani, Ryan. Overparametrized regression: Ridgeless interpolation. Lecture Notes from Course of Advanced Topics in Statistical Learning, Spring

work page

[62] [62]

Introduction to Shannon sampling and interpolation theory

Marks, II, Robert J. Introduction to Shannon sampling and interpolation theory

work page

[63] [63]

The Joys of Haar Measure

Diestel, Joe and Spalsbury, Angela. The Joys of Haar Measure

work page

[64] [64]

Identifiability Conditions for Domain Adaptation

Gulrajani, Ishaan and Hashimoto, Tatsunori. Identifiability Conditions for Domain Adaptation. Proceedings of the 39th International Conference on Machine Learning

work page

[65] [65]

A Generalized Representer Theorem

Schölkopf, Bernhard and Herbrich, Ralf and Smola, Alex J. A Generalized Representer Theorem. Lecture Notes in Computer Science

work page

[66] [66]

Modern umbral calculus: An elementary introduction with applications to linear interpolation and operator approximation theory

Costabile, Francesco Aldo. Modern umbral calculus: An elementary introduction with applications to linear interpolation and operator approximation theory

work page

[67] [67]

Kernel properties and representations of boundary integral operators

Schwab, C and Wendland, W L. Kernel properties and representations of boundary integral operators. Mathematische Nachrichten

work page

[68] [68]

Some properties of regularized kernel methods

Vito, E and Rosasco, L and Caponnetto, A and Piana, M and Verri, A. Some properties of regularized kernel methods. Journal of machine learning research: JMLR

work page

[69] [69]

Kwok, Ivor W Tsang

James T. Kwok, Ivor W Tsang. Learning with Idealized Kernels

work page

[70] [70]

Improving support vector machine classifiers by modifying kernel functions

Amari, S and Wu, S. Improving support vector machine classifiers by modifying kernel functions. Neural networks: the official journal of the International Neural Network Society

work page

[71] [71]

Learning low-rank kernel matrices

Kulis, Brian and Sustik, Mátyás and Dhillon, Inderjit. Learning low-rank kernel matrices. Proceedings of the 23rd international conference on Machine learning - ICML '06

work page

[72] [72]

Efficient SVM training using low-rank kernel representations

Fine, Shai and Scheinberg, K. Efficient SVM training using low-rank kernel representations. Journal of machine learning research: JMLR

work page

[73] [73]

Approximating the Covariance Matrix with Low-rank Perturbations

Magdon-Ismail, Malik and Purnell, Jonathan T. Approximating the Covariance Matrix with Low-rank Perturbations

work page

[74] [74]

Kristiadi-Being Bayesian, Even Just a Bit, Fixe\_supplement\_1

work page

[75] [75]

Analysis of Representations for Domain Adaptation

Ben-David, Shai and Blitzer, John and Crammer, Koby and Pereira, Fernando. Analysis of Representations for Domain Adaptation. Advances in Neural Information Processing Systems

work page

[76] [76]

On the difficulty of approximately maximizing agreements

Ben-David, Shai and Eiron, Nadav and Long, Philip M. On the difficulty of approximately maximizing agreements. Journal of computer and system sciences

work page

[77] [77]

Machine learning and the physical sciences

Carleo, Giuseppe and Cirac, Ignacio and Cranmer, Kyle and Daudet, Laurent and Schuld, Maria and Tishby, Naftali and Vogt-Maranto, Leslie and Zdeborová, Lenka. Machine learning and the physical sciences. Reviews of modern physics

work page

[78] [78]

Mathematics Stack Exchange

Eigenvalues for a product of matrices. Mathematics Stack Exchange

work page

[79] [79]

Eigenvalue inequalities for matrix product

Zhang, F and Zhang, Q. Eigenvalue inequalities for matrix product. IEEE transactions on automatic control

work page

[80] [80]

Extreme eigenvalue distributions of some complex correlated non-central Wishart and gamma-Wishart random matrices

Dharmawansa, Prathapasinghe and McKay, Matthew R. Extreme eigenvalue distributions of some complex correlated non-central Wishart and gamma-Wishart random matrices. Journal of multivariate analysis

work page