Self-Supervised Learning by Curvature Alignment

Benyamin Ghojogh; M.Hadi Sepanj; Paul Fieguth

arxiv: 2511.17426 · v2 · pith:V5SQSSTZnew · submitted 2025-11-21 · 💻 cs.LG · cs.CV· stat.ML

Self-Supervised Learning by Curvature Alignment

Benyamin Ghojogh , M.Hadi Sepanj , Paul Fieguth This is my paper

Pith reviewed 2026-05-21 18:07 UTC · model grok-4.3

classification 💻 cs.LG cs.CVstat.ML

keywords self-supervised learningcurvature regularizationmanifold geometryrepresentation learningnon-contrastive SSLBarlow Twinslocal geometry

0 comments

The pith

Aligning discrete curvature scores across augmentations shapes local manifold geometry for better self-supervised representations

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that adding a curvature regularizer to non-contrastive SSL objectives like Barlow Twins improves representation quality by enforcing consistency of local manifold bending between different views of the same data point. It computes a discrete curvature score for each embedding from cosine interactions among its k nearest neighbors on the unit hypersphere and then applies a Barlow-style loss to align and decorrelate these scores across augmentations. A sympathetic reader would care because existing SSL methods control only first- and second-order statistics while leaving higher-order geometric structure of the data manifold unaddressed. The framework includes a kernel extension that computes curvature from a normalized local Gram matrix in RKHS. Experiments with a ResNet-18 backbone on MNIST and CIFAR-10 show that the resulting representations achieve competitive or improved linear evaluation accuracy compared to Barlow Twins and VICReg.

Core claim

CurvSSL augments a standard two-view encoder-projector architecture and Barlow Twins-style redundancy-reduction loss with a curvature regularizer; each embedding is treated as a vertex whose k nearest neighbors define a discrete curvature score via cosine interactions on the unit hypersphere, and these scores are aligned and decorrelated across augmentations by a second Barlow-style loss on the curvature-derived matrix, thereby encouraging both view invariance and consistency of local manifold bending.

What carries the argument

Discrete curvature score computed from cosine interactions among k nearest neighbors on the unit hypersphere, aligned via a Barlow-style loss on the curvature matrix to enforce geometric consistency across augmentations

If this is right

Representations acquire invariance to local geometric properties in addition to statistical moments.
The method extends directly to a kernel variant that operates in an RKHS using a normalized local Gram matrix.
Curvature alignment serves as a simple complement to purely statistical SSL regularizers focused on variance and covariance.
Linear evaluation performance on image benchmarks such as MNIST and CIFAR-10 becomes competitive with or superior to Barlow Twins and VICReg.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same curvature-alignment idea could be grafted onto other SSL frameworks such as VICReg or SimCLR to test for additive gains.
Domains with strong intrinsic manifold structure, such as scientific imaging or molecular data, may benefit more from explicit geometric regularization than natural-image benchmarks.
Replacing the fixed k-nearest-neighbor approximation with a continuous or adaptive curvature estimator might tighten the geometric constraint further.
Higher-order manifold statistics beyond curvature, such as torsion or higher derivatives, could be aligned in the same Barlow-style manner.

Load-bearing premise

That the discrete curvature score from cosine interactions among k nearest neighbors on the unit hypersphere meaningfully captures local manifold bending and that aligning these scores across augmentations improves the quality of the learned representations for downstream tasks.

What would settle it

An ablation study in which removing the curvature regularizer produces equal or higher linear probing accuracy on MNIST and CIFAR-10, or an independent check showing that the computed curvature scores do not correlate with other measures of local manifold curvature such as eigenvalues from local PCA.

Figures

Figures reproduced from arXiv: 2511.17426 by Benyamin Ghojogh, M.Hadi Sepanj, Paul Fieguth.

**Figure 1.** Figure 1: (a) Polyhedron vertex, unit sphere, and the opposite cone, (b) large and small curvature, (c) a point and its neighbors normalized on a unit hyper-sphere around it. it is because it is far away (different) from its neighbors. Therefore, we define a curvature score, denoted by c, which is proportional to the curvature or angular effect. Since, according to the equation of angular effect, the curvature is … view at source ↗

**Figure 2.** Figure 2: UMAP visualization of encoder features on MNIST after curvature-regularized SSL. Points are colored by ground-truth digit class. 5.3. UMAP Visualization of Learned Representations Beyond scalar accuracy, we study the geometry of the learned representations via UMAP embeddings. For each dataset, we extract features from the frozen encoder on a held-out split (train or test) and project them to two dimensio… view at source ↗

read the original abstract

Self-supervised learning (SSL) has recently advanced through non-contrastive methods that couple an invariance term with variance, covariance, or redundancy-reduction penalties. While such objectives shape first- and second-order statistics of the representation, they largely ignore the local geometry of the underlying data manifold. In this paper, we introduce CurvSSL, a curvature-regularized self-supervised learning framework, and its RKHS extension, kernel CurvSSL. Our approach retains a standard two-view encoder-projector architecture with a Barlow Twins-style redundancy-reduction loss on projected features, but augments it with a curvature-based regularizer. Each embedding is treated as a vertex whose $k$ nearest neighbors define a discrete curvature score via cosine interactions on the unit hypersphere; in the kernel variant, curvature is computed from a normalized local Gram matrix in an RKHS. These scores are aligned and decorrelated across augmentations by a Barlow-style loss on a curvature-derived matrix, encouraging both view invariance and consistency of local manifold bending. Experiments on MNIST and CIFAR-10 datasets with a ResNet-18 backbone show that curvature-regularized SSL yields competitive or improved linear evaluation performance compared to Barlow Twins and VICReg. Our results indicate that explicitly shaping local geometry is a simple and effective complement to purely statistical SSL regularizers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper layers a kNN-based curvature regularizer onto Barlow Twins and claims modest gains on small image datasets, but the results are too thin to show the geometry term is doing real work.

read the letter

The core addition is a curvature alignment loss that treats embeddings as points on the unit hypersphere, picks k nearest neighbors, and builds a discrete curvature score from their cosine interactions. A second Barlow-style term then decorrelates and aligns these scores across the two augmented views. The kernel version does something similar inside an RKHS with a local Gram matrix. This sits on top of the usual redundancy-reduction objective without changing the encoder-projector backbone much.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces CurvSSL, a self-supervised learning framework that augments a Barlow Twins-style redundancy reduction loss with a curvature regularizer. The curvature score for each embedding is computed from cosine interactions among its k nearest neighbors on the unit hypersphere (or a normalized local Gram matrix in the RKHS variant). These scores form a matrix that is aligned and decorrelated across augmentations via an additional Barlow-style loss. Experiments on MNIST and CIFAR-10 with a ResNet-18 backbone report competitive or improved linear evaluation performance relative to Barlow Twins and VICReg.

Significance. If the central claim holds after addressing experimental gaps, the work demonstrates that adding an explicit local-geometry regularizer can usefully complement purely statistical SSL objectives such as redundancy reduction. Credit is given for the structural independence of the curvature term from the main loss (as noted in the circularity assessment) and for providing an RKHS extension that broadens the approach beyond Euclidean embeddings.

major comments (3)

[Experiments] Experiments section: linear evaluation accuracies on MNIST and CIFAR-10 are presented without error bars, standard deviations across multiple random seeds, or statistical significance tests. This directly weakens the claim of 'improved' performance, as observed differences could arise from training stochasticity rather than the curvature regularizer.
[Method] Method section (curvature definition): the discrete curvature score is constructed from average cosine interactions among k nearest neighbors on the unit hypersphere, yet no synthetic validation on manifolds with known analytic curvature (e.g., sphere or torus) or ablation isolating geometric content from generic neighbor consistency is provided. This is load-bearing for the interpretation that performance gains stem from shaping local manifold bending rather than additional consistency regularization.
[Experiments] Experiments section: no ablation studies quantify the sensitivity to the free parameters k and the curvature-loss weight, nor compare against a generic consistency regularizer with matched computational cost. These omissions leave open whether the reported gains are specific to the proposed geometric construction.

minor comments (2)

[Abstract] Abstract: the phrase 'competitive or improved' is vague; adding a brief quantitative statement of the observed accuracy deltas would improve clarity.
[Notation] Notation: the construction of the curvature score matrix from the kNN graph would benefit from an explicit equation or small diagram to clarify the transition from embeddings to the regularized matrix.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments and for recognizing the potential of incorporating an explicit local-geometry regularizer into non-contrastive SSL. We address each major comment below and commit to revisions that strengthen the empirical claims and clarify the geometric contribution without misrepresenting the current results.

read point-by-point responses

Referee: [Experiments] Experiments section: linear evaluation accuracies on MNIST and CIFAR-10 are presented without error bars, standard deviations across multiple random seeds, or statistical significance tests. This directly weakens the claim of 'improved' performance, as observed differences could arise from training stochasticity rather than the curvature regularizer.

Authors: We agree that single-run results limit the strength of the performance claims. In the revised manuscript we will rerun all linear-evaluation experiments on both MNIST and CIFAR-10 with at least three independent random seeds, report mean accuracies together with standard deviations, and include error bars in the tables. Where differences remain, we will also report the results of paired statistical significance tests. revision: yes
Referee: [Method] Method section (curvature definition): the discrete curvature score is constructed from average cosine interactions among k nearest neighbors on the unit hypersphere, yet no synthetic validation on manifolds with known analytic curvature (e.g., sphere or torus) or ablation isolating geometric content from generic neighbor consistency is provided. This is load-bearing for the interpretation that performance gains stem from shaping local manifold bending rather than additional consistency regularization.

Authors: The concern is well-founded. While the curvature score is explicitly derived from local cosine geometry on the hypersphere, we acknowledge that additional controls are needed to separate geometric bending from generic neighbor consistency. In revision we will add (i) an ablation that replaces the curvature computation with a simple neighbor-consistency loss using the same k-NN structure and (ii) a synthetic experiment on points sampled from a sphere, where the discrete score can be compared against the known analytic curvature. Full validation on a torus would require further theoretical work that lies outside the present scope; we will therefore mark this as a limitation and focus the added experiment on the sphere case. revision: partial
Referee: [Experiments] Experiments section: no ablation studies quantify the sensitivity to the free parameters k and the curvature-loss weight, nor compare against a generic consistency regularizer with matched computational cost. These omissions leave open whether the reported gains are specific to the proposed geometric construction.

Authors: We accept that sensitivity and specificity analyses are required. The revised experiments section will include systematic ablations over k (values 5, 10, 20) and the curvature-loss weight (range 0.1–1.0), reporting linear-evaluation accuracy for each setting. We will also implement a matched-cost generic consistency baseline that enforces view-invariant neighbor relations without the curvature formulation, and we will compare both accuracy and wall-clock overhead. These results will be presented in a new table or figure. revision: yes

Circularity Check

0 steps flagged

No circularity in CurvSSL: curvature regularizer is explicitly defined and independent

full rationale

The paper defines a discrete curvature score directly from kNN cosine interactions on unit-hypersphere embeddings (or normalized local Gram matrix in the kernel case) and applies an additional Barlow-style alignment/decorrelation loss on the resulting curvature matrix. This regularizer is structurally separate from the core redundancy-reduction term, involves no fitted parameters renamed as predictions, and relies on no self-citations or imported uniqueness theorems for its justification. The claimed improvement is presented as an empirical outcome of the combined loss rather than a derivation that reduces to its own inputs by construction. The framework is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entities

The framework introduces a new discrete curvature score whose definition and alignment constitute the main added content beyond prior SSL losses; k and regularization coefficients are free parameters while the hypersphere embedding and local Gram matrix are domain assumptions.

free parameters (2)

k (nearest neighbors)
Number of neighbors used to define the local neighborhood for curvature computation; chosen by the authors.
curvature loss weight
Scalar balancing the curvature alignment term against the main redundancy-reduction loss.

axioms (2)

domain assumption Embeddings are normalized to lie on the unit hypersphere
Standard preprocessing step in many SSL methods that enables cosine-based curvature.
ad hoc to paper Local manifold curvature can be approximated by average cosine interactions among k nearest neighbors
Core modeling choice introduced in the paper for the discrete curvature score.

invented entities (1)

Curvature score matrix no independent evidence
purpose: Encodes local bending information derived from neighbor cosines
New derived quantity on which a second Barlow-style loss is applied.

pith-pipeline@v0.9.0 · 5766 in / 1603 out tokens · 71882 ms · 2026-05-21T18:07:49.033447+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking (D=3 from nontrivial linking on S^D) unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

c(xi) := sum_{a=1}^{k-1} sum_{b=a+1}^k (z_a - z_i)^T (z_b - z_i) / (||...|| ||...||) after centering and unit-hypersphere normalization; analogous normalized Gram matrix in RKHS
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel (J uniquely satisfies the calibrated reciprocal functional equation) unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Lcurv on curvature-derived cross-matrix M plus Lemb redundancy reduction; total L = Lemb + α_curv Lcurv

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 2 internal anchors

[1]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...

work page
[2]

Normalized kernels as similarity indices

Ah-Pine, J. Normalized kernels as similarity indices. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp.\ 362--373. Springer, 2010

work page 2010
[3]

VICReg : Variance-invariance-covariance regularization for self-supervised learning

Bardes, A., Ponce, J., and LeCun, Y. VICReg : Variance-invariance-covariance regularization for self-supervised learning. In International Conference on Learning Representations, 2022

work page 2022
[4]

A simple framework for contrastive learning of visual representations

Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pp.\ 1597--1607. PmLR, 2020

work page 2020
[5]

Coxeter, H. S. M. Regular polytopes. Courier Corporation, 1973

work page 1973
[6]

Progymnasmata de solidorum elementis

Descartes, R. Progymnasmata de solidorum elementis. Oeuvres de Descartes, X: 0 265--276, 1890

work page
[7]

Anomaly detection and prototype selection using polyhedron curvature

Ghojogh, B., Karray, F., and Crowley, M. Anomaly detection and prototype selection using polyhedron curvature. In Canadian Conference on Artificial Intelligence, pp.\ 238--250. Springer, 2020

work page 2020
[8]

Background on kernels

Ghojogh, B., Crowley, M., Karray, F., and Ghodsi, A. Background on kernels. Elements of Dimensionality Reduction and Manifold Learning, pp.\ 43--73, 2023

work page 2023
[9]

Introduction to RKHS , and some simple kernel algorithms

Gretton, A. Introduction to RKHS , and some simple kernel algorithms. Adv. Top. Mach. Learn. Lecture Conducted from University College London, 16 0 (5-3): 0 2, 2013

work page 2013
[10]

Bootstrap your own latent-a new approach to self-supervised learning

Grill, J.-B., Strub, F., Altch \'e , F., Tallec, C., Richemond, P., Buchatskaya, E., Doersch, C., Avila Pires, B., Guo, Z., Gheshlaghi Azar, M., et al. Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems, 33: 0 21271--21284, 2020

work page 2020
[11]

Deep residual learning for image recognition

He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 770--778, 2016

work page 2016
[12]

and Pedersen, J

Hilton, P. and Pedersen, J. Descartes, Euler , Poincare , Polya and polyhedra. S \'e minaire de Philosophie et Math \'e matiques , 0 (8): 0 1--17, 1982

work page 1982
[13]

Hofmann, T., Sch \"o lkopf, B., and Smola, A. J. Kernel methods in machine learning. The annals of statistics, pp.\ 1171--1220, 2008

work page 2008
[14]

Kingma, D. P. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[15]

and Hinton, G

Krizhevsky, A. and Hinton, G. Learning multiple layers of features from tiny images. Technical report, University of Toronto, ON, Canada, 2009

work page 2009
[16]

Gradient-based learning applied to document recognition

LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86 0 (11): 0 2278--2324, 1998

work page 1998
[17]

Curvature and shape

Markvorsen, S. Curvature and shape. In Yugoslav Geometrical Seminar, Fall School of Differential Geometry, Yugoslavia, pp.\ 55--75, 1996

work page 1996
[18]

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

McInnes, L., Healy, J., and Melville, J. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[19]

T., Kumar, M., Mittal, A., and Kumar, K

Rani, V., Nabi, S. T., Kumar, M., Mittal, A., and Kumar, K. Self-supervised learning: A succinct review. Archives of Computational Methods in Engineering, 30 0 (4): 0 2761--2775, 2023

work page 2023
[20]

Richeson, D. S. Euler's Gem: The Polyhedron Formula and the Birth of Topology, volume 64. Princeton University Press, 2019

work page 2019
[21]

The kernel trick for distances

Sch \"o lkopf, B. The kernel trick for distances. In Advances in neural information processing systems, pp.\ 301--307, 2001

work page 2001
[22]

and Fieguth, P

Sepanj, H. and Fieguth, P. Aligning feature distributions in VICReg using maximum mean discrepancy for enhanced manifold awareness in self-supervised representation learning. Journal of Computational Vision and Imaging Systems, 10 0 (1): 0 13--18, 2024

work page 2024
[23]

Sepanj, M. H. and Fiegth, P. SinSim : Sinkhorn -regularized SimCLR . arXiv preprint arXiv:2502.10478, 2025

work page arXiv 2025
[24]

H., Ghojogh, B., and Fieguth, P

Sepanj, M. H., Ghojogh, B., and Fieguth, P. Self-supervised learning using nonlinear dependence. IEEE Access, 13: 0 190582--190589, 2025 a

work page 2025
[25]

H., Ghojogh, B., and Fieguth, P

Sepanj, M. H., Ghojogh, B., and Fieguth, P. Kernel VICReg for self-supervised learning in reproducing kernel Hilbert space. arXiv preprint arXiv:2509.07289, 2025 b

work page arXiv 2025
[26]

Barlow twins: Self-supervised learning via redundancy reduction

Zbontar, J., Jing, L., Misra, I., LeCun, Y., and Deny, S. Barlow twins: Self-supervised learning via redundancy reduction. In International conference on machine learning, pp.\ 12310--12320. PMLR, 2021

work page 2021

[1] [1]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...

work page

[2] [2]

Normalized kernels as similarity indices

Ah-Pine, J. Normalized kernels as similarity indices. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp.\ 362--373. Springer, 2010

work page 2010

[3] [3]

VICReg : Variance-invariance-covariance regularization for self-supervised learning

Bardes, A., Ponce, J., and LeCun, Y. VICReg : Variance-invariance-covariance regularization for self-supervised learning. In International Conference on Learning Representations, 2022

work page 2022

[4] [4]

A simple framework for contrastive learning of visual representations

Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pp.\ 1597--1607. PmLR, 2020

work page 2020

[5] [5]

Coxeter, H. S. M. Regular polytopes. Courier Corporation, 1973

work page 1973

[6] [6]

Progymnasmata de solidorum elementis

Descartes, R. Progymnasmata de solidorum elementis. Oeuvres de Descartes, X: 0 265--276, 1890

work page

[7] [7]

Anomaly detection and prototype selection using polyhedron curvature

Ghojogh, B., Karray, F., and Crowley, M. Anomaly detection and prototype selection using polyhedron curvature. In Canadian Conference on Artificial Intelligence, pp.\ 238--250. Springer, 2020

work page 2020

[8] [8]

Background on kernels

Ghojogh, B., Crowley, M., Karray, F., and Ghodsi, A. Background on kernels. Elements of Dimensionality Reduction and Manifold Learning, pp.\ 43--73, 2023

work page 2023

[9] [9]

Introduction to RKHS , and some simple kernel algorithms

Gretton, A. Introduction to RKHS , and some simple kernel algorithms. Adv. Top. Mach. Learn. Lecture Conducted from University College London, 16 0 (5-3): 0 2, 2013

work page 2013

[10] [10]

Bootstrap your own latent-a new approach to self-supervised learning

Grill, J.-B., Strub, F., Altch \'e , F., Tallec, C., Richemond, P., Buchatskaya, E., Doersch, C., Avila Pires, B., Guo, Z., Gheshlaghi Azar, M., et al. Bootstrap your own latent-a new approach to self-supervised learning. Advances in neural information processing systems, 33: 0 21271--21284, 2020

work page 2020

[11] [11]

Deep residual learning for image recognition

He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 770--778, 2016

work page 2016

[12] [12]

and Pedersen, J

Hilton, P. and Pedersen, J. Descartes, Euler , Poincare , Polya and polyhedra. S \'e minaire de Philosophie et Math \'e matiques , 0 (8): 0 1--17, 1982

work page 1982

[13] [13]

Hofmann, T., Sch \"o lkopf, B., and Smola, A. J. Kernel methods in machine learning. The annals of statistics, pp.\ 1171--1220, 2008

work page 2008

[14] [14]

Kingma, D. P. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[15] [15]

and Hinton, G

Krizhevsky, A. and Hinton, G. Learning multiple layers of features from tiny images. Technical report, University of Toronto, ON, Canada, 2009

work page 2009

[16] [16]

Gradient-based learning applied to document recognition

LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86 0 (11): 0 2278--2324, 1998

work page 1998

[17] [17]

Curvature and shape

Markvorsen, S. Curvature and shape. In Yugoslav Geometrical Seminar, Fall School of Differential Geometry, Yugoslavia, pp.\ 55--75, 1996

work page 1996

[18] [18]

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

McInnes, L., Healy, J., and Melville, J. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[19] [19]

T., Kumar, M., Mittal, A., and Kumar, K

Rani, V., Nabi, S. T., Kumar, M., Mittal, A., and Kumar, K. Self-supervised learning: A succinct review. Archives of Computational Methods in Engineering, 30 0 (4): 0 2761--2775, 2023

work page 2023

[20] [20]

Richeson, D. S. Euler's Gem: The Polyhedron Formula and the Birth of Topology, volume 64. Princeton University Press, 2019

work page 2019

[21] [21]

The kernel trick for distances

Sch \"o lkopf, B. The kernel trick for distances. In Advances in neural information processing systems, pp.\ 301--307, 2001

work page 2001

[22] [22]

and Fieguth, P

Sepanj, H. and Fieguth, P. Aligning feature distributions in VICReg using maximum mean discrepancy for enhanced manifold awareness in self-supervised representation learning. Journal of Computational Vision and Imaging Systems, 10 0 (1): 0 13--18, 2024

work page 2024

[23] [23]

Sepanj, M. H. and Fiegth, P. SinSim : Sinkhorn -regularized SimCLR . arXiv preprint arXiv:2502.10478, 2025

work page arXiv 2025

[24] [24]

H., Ghojogh, B., and Fieguth, P

Sepanj, M. H., Ghojogh, B., and Fieguth, P. Self-supervised learning using nonlinear dependence. IEEE Access, 13: 0 190582--190589, 2025 a

work page 2025

[25] [25]

H., Ghojogh, B., and Fieguth, P

Sepanj, M. H., Ghojogh, B., and Fieguth, P. Kernel VICReg for self-supervised learning in reproducing kernel Hilbert space. arXiv preprint arXiv:2509.07289, 2025 b

work page arXiv 2025

[26] [26]

Barlow twins: Self-supervised learning via redundancy reduction

Zbontar, J., Jing, L., Misra, I., LeCun, Y., and Deny, S. Barlow twins: Self-supervised learning via redundancy reduction. In International conference on machine learning, pp.\ 12310--12320. PMLR, 2021

work page 2021