Structural interpretability in SVMs with truncated orthogonal polynomial kernels

Edmundo J. Huertas; Nuria Torrado; Víctor Soto-Larrosa

arXiv: 2604.15285 · v1 · submitted 2026-04-16 · stat.ML · cs.LG · math.ST · stat.TH


Pith reviewed 2026-05-10 09:42 UTC · model grok-4.3


keywords: support vector machines, orthogonal polynomial kernels, interpretability, reproducing kernel Hilbert space, post-training analysis, kernel contribution indices, structural decomposition, decision function expansion


The pith

Truncated orthogonal polynomial kernels allow exact post-training decomposition of SVM decision functions via orthonormal RKHS expansions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that support vector machines built with truncated orthogonal polynomial kernels possess a finite-dimensional reproducing kernel Hilbert space equipped with an explicit tensor-product orthonormal basis. This basis makes it possible to expand the fitted decision function exactly in intrinsic coordinates, which in turn supports a diagnostic framework called Orthogonal Representation Contribution Analysis. The framework relies on normalized Orthogonal Kernel Contribution indices that break down the squared RKHS norm according to interaction orders, polynomial degrees, marginal effects, and pairwise terms. A sympathetic reader cares because the entire procedure runs after training is finished, requires no surrogate models or retraining, and surfaces structural properties of the classifier that accuracy numbers alone do not reveal, as demonstrated on a double-spiral synthetic task and a five-dimensional echocardiogram dataset.

Core claim

Because the reproducing kernel Hilbert space for truncated orthogonal polynomial kernels is finite-dimensional and possesses an explicit tensor-product orthonormal basis, the fitted SVM decision function admits an exact expansion in intrinsic RKHS coordinates. This expansion underpins the Orthogonal Representation Contribution Analysis (ORCA) diagnostic, which employs normalized Orthogonal Kernel Contribution (OKC) indices to partition the squared RKHS norm of the classifier according to interaction orders, total polynomial degrees, marginal coordinate effects, and pairwise contributions. The method operates entirely after training is complete and is validated on a synthetic double-spiral problem and a real five-dimensional echocardiogram dataset.

What carries the argument

The exact expansion of the SVM decision function in the tensor-product orthonormal basis of the finite-dimensional RKHS, which directly produces the normalized Orthogonal Kernel Contribution indices that quantify how the squared norm is allocated across structural components.
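To make the expansion concrete, here is a sketch of the algebra under an assumed form of the kernel. The full tensor-product truncation at level $n$, the symbols $p_k$, $\Phi_{\mathbf{k}}$, $a_i$, $c_{\mathbf{k}}$ and $G$, and the choice to normalize by the total squared norm are all notation and assumptions introduced here for illustration; the paper's own definitions may differ.

```latex
% Assumption: p_0, ..., p_n are univariate orthonormal polynomials and the
% kernel is the full tensor product truncated at level n in each of d coordinates.
K_n(x, y) = \prod_{j=1}^{d} \sum_{k=0}^{n} p_k(x_j)\, p_k(y_j)
          = \sum_{\mathbf{k} \in \{0,\dots,n\}^d} \Phi_{\mathbf{k}}(x)\, \Phi_{\mathbf{k}}(y),
\qquad
\Phi_{\mathbf{k}}(x) = \prod_{j=1}^{d} p_{k_j}(x_j).

% Kernel part of the decision function, with signed dual coefficients a_i
% attached to the training points x_1, ..., x_m:
f(x) = \sum_{i=1}^{m} a_i\, K_n(x_i, x)
     = \sum_{\mathbf{k}} c_{\mathbf{k}}\, \Phi_{\mathbf{k}}(x),
\qquad
c_{\mathbf{k}} = \sum_{i=1}^{m} a_i\, \Phi_{\mathbf{k}}(x_i).

% Since the \Phi_k are orthonormal in the RKHS, the squared norm splits
% term by term, and a group G of multi-indices receives the share
\|f\|_{\mathcal{H}}^{2} = \sum_{\mathbf{k}} c_{\mathbf{k}}^{2},
\qquad
\mathrm{OKC}(G) = \frac{\sum_{\mathbf{k} \in G} c_{\mathbf{k}}^{2}}{\sum_{\mathbf{k}} c_{\mathbf{k}}^{2}}.
```

Taking $G$ to be the multi-indices with a given number of nonzero entries yields an interaction-order index, and taking those with a given value of $k_1 + \dots + k_d$ yields a total-degree index.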

If this is right

The squared RKHS norm of the classifier can be attributed to specific interaction orders and total degrees without retraining.
Marginal coordinate effects and pairwise contributions can be isolated and compared directly from the trained model.
Structural complexity measures derived from the indices capture information that predictive accuracy alone does not, on both synthetic and real data.
The framework applies to any dataset once the SVM with the given kernel is trained, including five-dimensional medical records.
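The attributions listed above can be sketched as a grouping of squared expansion coefficients. This is a minimal illustration, not the authors' implementation: the function names, the toy coefficient values, and the reading of "marginal" and "pairwise" as multi-indices with exactly one or two active coordinates are assumptions made here.

```python
import itertools
import numpy as np

def okc_shares(coeffs, multi_indices, key):
    """Group squared coefficients by key(k) and normalize by their total."""
    energy = np.asarray(coeffs, dtype=float) ** 2
    total = energy.sum()
    shares = {}
    for e, k in zip(energy, multi_indices):
        group = key(k)
        shares[group] = shares.get(group, 0.0) + e / total
    return shares

def interaction_order(k):
    """Number of coordinates that enter with degree >= 1."""
    return sum(1 for kj in k if kj > 0)

def total_degree(k):
    """Sum of the per-coordinate degrees."""
    return sum(k)

def active_coordinates(k):
    """Tuple of active coordinates: length 1 = marginal, length 2 = pairwise."""
    return tuple(j for j, kj in enumerate(k) if kj > 0)

# Toy example with made-up coefficients: d = 2 coordinates, truncation level n = 2.
multi_indices = list(itertools.product(range(3), repeat=2))
coeffs = np.array([1.0, 2.0, 0.0, 2.0, 1.0, 0.0, 0.0, 0.0, 1.0])

by_order = okc_shares(coeffs, multi_indices, interaction_order)
by_degree = okc_shares(coeffs, multi_indices, total_degree)
by_support = okc_shares(coeffs, multi_indices, active_coordinates)
print(by_order)
print(by_degree)
print(by_support)
```

Each dictionary sums to one, so the three views are different partitions of the same squared norm rather than independent measurements.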

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The indices could be used to decide how far to truncate the polynomial degree by identifying where higher-order contributions become negligible.
In high-dimensional problems the same decomposition might flag redundant feature interactions that could be dropped to simplify the model.
Similar orthonormal expansions might be sought for other kernels that possess finite bases, extending the interpretability approach beyond polynomials.
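The first of these extensions could be prototyped roughly as follows. This is a heuristic sketch only: the function names, the 0.99 threshold, and the toy coefficients are hypothetical, and a model retrained at the smaller level would generally have different coefficients from the ones simply discarded here.

```python
import itertools
import numpy as np

def cumulative_share_by_level(coeffs, multi_indices):
    """Share of the squared norm carried by multi-indices with max(k) <= t, for each t."""
    energy = np.asarray(coeffs, dtype=float) ** 2
    levels = np.array([max(k) for k in multi_indices])
    total = energy.sum()
    return np.array([energy[levels <= t].sum() / total
                     for t in range(levels.max() + 1)])

def smallest_sufficient_level(coeffs, multi_indices, threshold=0.99):
    """Smallest level t whose cumulative share reaches the threshold."""
    shares = cumulative_share_by_level(coeffs, multi_indices)
    return int(np.argmax(shares >= threshold))

# Toy example: d = 2, n = 3, with coefficients that decay with the level max(k).
multi_indices = list(itertools.product(range(4), repeat=2))
coeffs = np.array([0.1 ** max(k) for k in multi_indices])

shares = cumulative_share_by_level(coeffs, multi_indices)
level = smallest_sufficient_level(coeffs, multi_indices, threshold=0.99)
print(shares, level)
```

Grouping by max(k) rather than by total degree is a deliberate choice here, because it matches a per-coordinate truncation level; a total-degree variant would replace `max(k)` with `sum(k)`.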

Load-bearing premise

The reproducing kernel Hilbert space of the truncated orthogonal polynomial kernel is finite-dimensional and admits an explicit tensor-product orthonormal basis that permits exact expansion of any function in the space.

What would settle it

After training an SVM on the double-spiral data, compute the OKC indices and check whether their sum exactly recovers the squared RKHS norm of the fitted classifier; any material discrepancy would show the decomposition is not exact.
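A numerical version of this check can be sketched as below, assuming a full tensor-product kernel built from orthonormal Legendre polynomials on [-1, 1]. The helper names are hypothetical, and random vectors stand in for the data and for the signed dual coefficients of a trained SVM. The kernel matrix is deliberately computed from the product form, so comparing it against the coefficient expansion is not trivially true by construction.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre(x, n):
    """Columns p_0..p_n at the points x, orthonormal on [-1, 1] with respect to dx."""
    scale = np.sqrt((2 * np.arange(n + 1) + 1) / 2.0)
    return legendre.legvander(x, n) * scale

def kernel_matrix(X, n):
    """Product form: K[i, l] = prod_j sum_k p_k(X[i, j]) * p_k(X[l, j])."""
    K = np.ones((X.shape[0], X.shape[0]))
    for j in range(X.shape[1]):
        P = orthonormal_legendre(X[:, j], n)
        K *= P @ P.T
    return K

def tensor_features(X, n):
    """One column per multi-index k, holding Phi_k evaluated at every row of X."""
    P = [orthonormal_legendre(X[:, j], n) for j in range(X.shape[1])]
    idx = list(itertools.product(range(n + 1), repeat=X.shape[1]))
    cols = [np.prod([P[j][:, kj] for j, kj in enumerate(k)], axis=0) for k in idx]
    return np.stack(cols, axis=1), idx

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))   # stand-in inputs, not the double spiral
a = rng.normal(size=30)                    # stand-in for signed dual coefficients
n = 4

K = kernel_matrix(X, n)
Phi, idx = tensor_features(X, n)
c = Phi.T @ a                              # expansion coefficients c_k

norm_from_kernel = a @ K @ a               # squared RKHS norm via the kernel matrix
norm_from_coeffs = np.sum(c ** 2)          # squared RKHS norm via the expansion
print(norm_from_kernel, norm_from_coeffs)
```

With a real model, `a` would be replaced by the signed dual coefficients and `X` by the support vectors; agreement up to floating-point error is what exactness predicts.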

Figures

Figures reproduced from arXiv: 2604.15285 by Edmundo J. Huertas, Nuria Torrado, Víctor Soto-Larrosa.

**Figure 1.** Decision boundaries {g(x) = 0} of the Jacobi-kernel SVM on the double-spiral dataset for α = β = 0 (Legendre polynomials) and C = 1, as the truncation level n varies from 1 to 16. Top row, left to right: n = 1, 2, 3, 5. Bottom row, left to right: n = 8, 12, 14, 16. The background colour map shows the decision function g(x); the bold curve is the zero level set; points are coloured by class label. dimension…

**Figure 2.** Decision boundaries {g(x) = 0} of the Jacobi-kernel SVM on the double-spiral dataset for fixed n = 12 and C = 1, as the Jacobi parameters (α, β) vary. Top row: symmetric cases (α, β) ∈ {(0, 0), (0.5, 0.5), (1, 1), (2, 2)}. Middle row: asymmetric cases (α, β) ∈ {(0.5, 0), (1, 0), (2, 1), (2, 0.5)}. Bottom row: asymmetric cases (α, β) ∈ {(1.5, 0.3), (3, 0.5), (0, 2), (1, 3)}. The background colour map shows …

Original abstract

We study post-training interpretability for Support Vector Machines (SVMs) built from truncated orthogonal polynomial kernels. Since the associated reproducing kernel Hilbert space is finite-dimensional and admits an explicit tensor-product orthonormal basis, the fitted decision function can be expanded exactly in intrinsic RKHS coordinates. This leads to Orthogonal Representation Contribution Analysis (ORCA), a diagnostic framework based on normalized Orthogonal Kernel Contribution (OKC) indices. These indices quantify how the squared RKHS norm of the classifier is distributed across interaction orders, total polynomial degrees, marginal coordinate effects, and pairwise contributions. The methodology is fully post-training and requires neither surrogate models nor retraining. We illustrate its diagnostic value on a synthetic double-spiral problem and on a real five-dimensional echocardiogram dataset. The results show that the proposed indices reveal structural aspects of model complexity that are not captured by predictive accuracy alone.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives an exact algebraic decomposition of the SVM decision function for truncated orthogonal polynomial kernels into normalized contributions by interaction order, degree, and variable pairs.


The main point is that for SVMs using these specific kernels the reproducing kernel Hilbert space is finite-dimensional with an explicit tensor-product orthonormal basis, so the fitted classifier expands directly into marginal, pairwise, and higher-order terms. The authors turn that expansion into normalized Orthogonal Kernel Contribution indices that split the squared RKHS norm without any approximation or retraining. That is the actual new piece: the ORCA framework and OKC indices applied to this kernel family.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes Orthogonal Representation Contribution Analysis (ORCA) as a post-training interpretability framework for SVM classifiers using truncated orthogonal polynomial kernels. Exploiting the finite-dimensional RKHS and its explicit tensor-product orthonormal basis (arising from the Mercer expansion of the kernel), the fitted decision function is expanded exactly in intrinsic coordinates. This yields normalized Orthogonal Kernel Contribution (OKC) indices that decompose the squared RKHS norm of the classifier by interaction order, total polynomial degree, marginal coordinate effects, and pairwise contributions. The approach requires no retraining or surrogate models and is demonstrated on a synthetic double-spiral dataset and a five-dimensional echocardiogram dataset.

Significance. If the algebraic construction holds, the work supplies an exact, intrinsic, and parameter-free diagnostic for structural complexity in kernel SVMs. The decomposition directly quantifies how the model allocates its RKHS norm across orders and coordinates, offering insight beyond predictive accuracy that is unavailable from standard post-hoc methods. The finite-dimensional setting and orthonormal basis make the indices computable directly from the dual coefficients, which is a clear technical strength.

minor comments (3)

The abstract and introduction state that the indices are 'normalized,' but the precise normalization (e.g., whether each OKC index is divided by the total squared norm or by the norm of its interaction-order subspace) is not stated explicitly in the provided summary; a short clarifying sentence or equation in §3 would remove ambiguity.
In the experimental section, the double-spiral and echocardiogram results are presented only qualitatively; adding a small table that reports the OKC values for the top three interaction orders (with standard errors if multiple runs are performed) would strengthen the claim that the indices reveal structure not captured by accuracy alone.
The paper correctly notes that the method is fully post-training, yet it would be useful to include a one-sentence remark on computational cost: once the dual coefficients are known, the OKC indices are obtained by a single matrix-vector multiplication whose dimension equals the number of basis functions.
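The cost remark in the last comment can be illustrated with a short sketch. It assumes a full tensor-product truncation, so that the basis has (n + 1)^d functions; the function names are hypothetical and random arrays stand in for the basis evaluations and the dual coefficients.

```python
import numpy as np

def basis_size(n, d):
    """Number of tensor-product basis functions for level n in d coordinates."""
    return (n + 1) ** d

def expansion_coefficients(Phi_sv, dual_coef):
    """c = Phi_sv^T a: a single matrix-vector product, cost O(m * basis_size)."""
    return Phi_sv.T @ dual_coef

rng = np.random.default_rng(1)
m, n, d = 40, 3, 2                               # m plays the role of the support-vector count
Phi_sv = rng.normal(size=(m, basis_size(n, d)))  # placeholder for Phi_k(x_i) values
dual_coef = rng.normal(size=m)                   # placeholder for signed dual coefficients

c = expansion_coefficients(Phi_sv, dual_coef)
print(c.shape)
print(basis_size(12, 2), basis_size(12, 5))      # growth of the basis with dimension
```

Under this assumption the product itself is cheap, but the basis size grows exponentially with the input dimension, so a cost remark would need to state the truncation rule actually used.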

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work on ORCA and for recognizing its value as an exact, parameter-free decomposition of the RKHS norm in finite-dimensional polynomial kernel SVMs. The recommendation for minor revision is noted. No specific major comments were raised in the report, so we have no substantive points requiring rebuttal or revision at this stage. We remain available for any editorial clarifications the editor may request.

Circularity Check

0 steps flagged

No significant circularity detected


The paper's central construction begins with the choice of a truncated orthogonal polynomial kernel, whose finite Mercer-type expansion induces a finite-dimensional RKHS with an explicit tensor-product orthonormal basis built from univariate polynomials. The representer theorem then places the SVM decision function exactly in the span of this basis, so that the squared RKHS norm admits an algebraic decomposition into contributions by interaction order, total degree, marginals, and pairs. The OKC indices are defined directly as the normalized terms of this decomposition once the dual coefficients are known; they are not obtained by fitting any parameter to a target quantity, nor do they rely on a self-citation chain, uniqueness theorem, or ansatz smuggled in from prior work. The resulting ORCA framework is therefore a post-training diagnostic that follows directly from the kernel's finite-basis property, and the interpretability result is not assumed among its own inputs. No load-bearing circular step is present.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on standard properties of reproducing kernel Hilbert spaces for polynomial kernels and the explicit orthonormal basis that exists once truncation is applied. No new physical entities or ad-hoc fitted constants are introduced beyond the usual SVM regularization parameter.

axioms (1)

standard math: The RKHS associated with a truncated orthogonal polynomial kernel is finite-dimensional and possesses an explicit tensor-product orthonormal basis.
Invoked in the abstract to justify exact expansion of the decision function.

invented entities (1)

Orthogonal Kernel Contribution (OKC) indices (no independent evidence)
purpose: Normalized measures of how the squared RKHS norm is partitioned across orders, degrees, and coordinate interactions.
New diagnostic quantities defined from the expansion coefficients.



Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages

[1] Echocardiogram [Dataset], UCI Machine Learning Repository, 1988. DOI: https://doi.org/10.24432/C5QW24

[2] J. B. Lasserre, E. Pauwels, and M. Putinar, The Christoffel–Darboux Kernel for Data Analysis, Cambridge University Press, 2022.

[3] W. Gautschi, Orthogonal Polynomials: Computation and Approximation, Oxford University Press, 2004.

[4] C. Cortes and V. Vapnik, Support-vector networks, Machine Learning 20 (1995), 273–297.

[5] V. Vapnik, Statistical Learning Theory, Wiley, 1998.

[6] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.

[7] A. Berlinet and C. Thomas-Agnan, Reproducing Kernel Hilbert Spaces in Probability and Statistics, Springer, 2004.

[8] G. Kimeldorf and G. Wahba, Some results on Tchebycheffian spline functions, Journal of Mathematical Analysis and Applications 33 (1971), 82–95.

[9] B. Schölkopf, R. Herbrich, and A. J. Smola, A generalized representer theorem, in Computational Learning Theory (COLT 2001), Lecture Notes in Computer Science, vol. 2111, Springer, 2001, pp. 416–426.

[10] S. Ozer, C. H. Chen, and H. A. Çırpan, A set of new Chebyshev kernel functions for support vector machine pattern classification, Pattern Recognition 44 (2011), no. 7, 1435–1447.

[11] V. H. Moghaddam and J. Hamidzadeh, New Hermite orthogonal polynomial kernel and combined kernels in Support Vector Machine classifier, Pattern Recognition 60 (2016), 921–935.

[12] G. Szegő, Orthogonal Polynomials, American Mathematical Society Colloquium Publications, Vol. 23, American Mathematical Society, 1939.

[13] B. Simon, The Christoffel–Darboux kernel, Proceedings of Symposia in Pure Mathematics 79 (2008); available as arXiv:0806.1528.
