Self-Organizing Maps with Optimized Latent Positions

Akira Notsu; Katsuhiro Honda; Seiki Ubukata

arxiv: 2604.13622 · v1 · submitted 2026-04-15 · 💻 cs.LG

Pith reviewed 2026-05-10 13:41 UTC · model grok-4.3

classification 💻 cs.LG

keywords: self-organizing maps · topographic mapping · soft topographic vector quantization · block coordinate descent · latent positions · entropy regularization · vector quantization

The pith

SOM-OLP optimizes continuous latent positions for each data point by replacing STVQ's coupled neighborhood term with a separable quadratic surrogate.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Self-Organizing Maps with Optimized Latent Positions to resolve the efficiency-objective trade-off in classical topographic mapping. It takes the neighborhood distortion from Soft Topographic Vector Quantization and replaces it with a separable surrogate that exploits the local quadratic structure, then adds entropy regularization. The resulting objective admits a block coordinate descent algorithm whose updates for assignment probabilities, latent positions, and reference vectors are all closed-form. Each iteration is guaranteed to leave the objective no higher than before and runs in time linear in the number of data points and latent nodes. Experiments on synthetic manifolds, MNIST-scale data, and 16 benchmarks indicate that the method matches or exceeds prior approaches in neighborhood preservation while scaling better when the map grows large.

Core claim

Starting from the neighborhood distortion of STVQ, a separable surrogate local cost is constructed from its local quadratic structure. An entropy-regularized objective is formulated on this surrogate. Block coordinate descent then yields closed-form updates for assignment probabilities, latent positions, and reference vectors, with the objective guaranteed not to increase at any step and with per-iteration cost linear in the number of data points and the number of latent nodes.
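To see why entropy regularization leads to a closed-form assignment update, consider a generic sketch for a single data point. The symbols are assumptions introduced here for illustration: $c_{ik}$ is some cost of assigning point $i$ to node $k$, $u_{ik}$ is the assignment probability, and $\lambda > 0$ is the entropy weight. The paper's actual surrogate cost is not reproduced on this page, so $c_{ik}$ is left abstract.

```latex
\min_{u_{i\cdot}} \; \sum_{k=1}^{K} u_{ik}\, c_{ik} \;+\; \lambda \sum_{k=1}^{K} u_{ik} \log u_{ik}
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_{ik} = 1, \;\; u_{ik} \ge 0 .

% Stationarity of the Lagrangian with multiplier \mu_i:
%   c_{ik} + \lambda (\log u_{ik} + 1) + \mu_i = 0
% Solving for u_{ik} and normalizing gives a softmax over negative costs:
u_{ik} \;=\; \frac{\exp(-c_{ik}/\lambda)}{\sum_{l=1}^{K} \exp(-c_{il}/\lambda)} .
```

Because the problem is strictly convex in $u_{i\cdot}$ for $\lambda > 0$, this stationary point is the unique minimizer for fixed costs.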

What carries the argument

The separable surrogate local cost extracted from the quadratic neighborhood distortion of STVQ, which decouples the objective enough to permit independent closed-form updates for each block while preserving the topographic ordering goal.

If this is right

The block updates remain closed-form and the objective is monotonically non-increasing for any choice of entropy weight.
Per-iteration cost stays linear in both the number of data points and the number of latent nodes, removing the quadratic coupling bottleneck of prior objective-based SOMs.
Continuous latent positions per data point are learned jointly with the map, allowing the method to adapt the embedding geometry without fixing a discrete grid in advance.
On 16 benchmark datasets the method obtains the lowest average rank among compared topographic and quantization baselines.
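The block structure in the list above can be sketched on a toy problem. The objective used here is a hypothetical stand-in, not the paper's surrogate: each point-node pair pays a data-space cost `||x_i - w_k||^2` plus `alpha` times a latent-space cost `||z_i - r_k||^2`, with an entropy weight `lam`. The grid `R`, the parameters `alpha` and `lam`, and all function names are assumptions made for illustration. Under this assumed objective each block has an exact closed-form minimizer, so the recorded objective values should never go up.

```python
import numpy as np

def sq_dists(a, b):
    """Pairwise squared Euclidean distances, shape (len(a), len(b))."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

def objective(X, R, U, Z, W, alpha, lam):
    """Toy objective (assumed form): expected cost plus lam * sum(u log u)."""
    cost = sq_dists(X, W) + alpha * sq_dists(Z, R)
    neg_entropy = (U * np.log(U + 1e-300)).sum()  # tiny offset avoids log(0)
    return (U * cost).sum() + lam * neg_entropy

def fit(X, R, alpha=1.0, lam=0.5, n_iter=30, seed=0):
    """Block coordinate descent on the toy objective; one sweep touches N*K pairs."""
    rng = np.random.default_rng(seed)
    N, K = X.shape[0], R.shape[0]
    W = X[rng.choice(N, size=K, replace=False)]   # reference vectors from data
    Z = R[rng.integers(0, K, size=N)]             # latent positions on grid nodes
    history = []
    for _ in range(n_iter):
        # Block 1: assignment probabilities (softmax of negative cost / lam).
        logits = -(sq_dists(X, W) + alpha * sq_dists(Z, R)) / lam
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)
        # Block 2: latent positions (probability-weighted mean of node coordinates).
        Z = U @ R
        # Block 3: reference vectors (probability-weighted mean of the data).
        mass = U.sum(axis=0)
        W = (U.T @ X) / np.where(mass > 0, mass, 1.0)[:, None]
        history.append(objective(X, R, U, Z, W, alpha, lam))
    return U, Z, W, history

# Placeholder data and a 4x4 latent grid, for demonstration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
gx, gy = np.meshgrid(np.arange(4.0), np.arange(4.0))
R = np.column_stack([gx.ravel(), gy.ravel()])
U, Z, W, history = fit(X, R)
print(history[0], history[-1])
```

Checking that `history` is non-increasing is a cheap sanity test for any implementation of this kind; it says nothing about how well the toy objective, or the paper's surrogate, tracks the original STVQ distortion.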

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same surrogate-construction tactic could be tried on other neighborhood-coupled objectives in embedding or clustering to obtain similar linear-time block schemes.
Because latent positions are continuous and per-point, the approach might extend naturally to maps whose topology is learned rather than prescribed.
The monotonicity guarantee supplies a practical stopping criterion that earlier heuristic SOM variants lacked.

Load-bearing premise

The local quadratic structure of STVQ neighborhood distortion is close enough to the true cost that a separable surrogate built from it still produces topographic maps whose neighborhood preservation is competitive with the original coupled objective.

What would settle it

On a dataset where the neighborhood distortion deviates sharply from quadratic, run both SOM-OLP and standard STVQ to the same number of iterations and measure whether the final topographic error of SOM-OLP exceeds that of STVQ by more than the gap seen on quadratic-friendly data.
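Running that comparison requires a neighborhood-preservation score that works for both discrete-node and continuous latent outputs. One simple candidate, sketched here, is k-nearest-neighbor overlap: the average fraction of each point's k nearest neighbors in data space that are also among its k nearest neighbors in latent space. This is a stand-in chosen for brevity, not the metric used in the paper, and `knn_overlap` and its arguments are hypothetical names.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each row, excluding the row itself."""
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def knn_overlap(X, Z, k=10):
    """Mean fraction of data-space k-NN that are also latent-space k-NN (0 to 1)."""
    nx = knn_indices(X, k)
    nz = knn_indices(Z, k)
    shared = [len(set(a) & set(b)) for a, b in zip(nx, nz)]
    return float(np.mean(shared)) / k

# Placeholder data: an embedding identical to the input preserves every neighborhood.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
print(knn_overlap(X, X, k=5))
print(knn_overlap(X, rng.normal(size=(50, 2)), k=5))  # unrelated embedding
```

The pairwise-distance matrix makes this quadratic in the number of points, so it suits small evaluation subsets rather than full MNIST-scale runs.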

Figures

Figures reproduced from arXiv: 2604.13622 by Akira Notsu, Katsuhiro Honda, Seiki Ubukata.

**Figure 1.** Comparison of BSOM, STVQf, GTM, and SOM-OLP on the saddle dataset. For each method, the data-space view shows the learned reference …

**Figure 2.** Latent representations of the Digits dataset. While BSOM and STVQf are constrained to discrete nodes, GTM and SOM-OLP provide continuous …

**Figure 4.** Critical-difference diagram based on the average ranks of …

Original abstract

Self-Organizing Maps (SOM) are a classical method for unsupervised learning, vector quantization, and topographic mapping of high-dimensional data. However, existing SOM formulations often involve a trade-off between computational efficiency and a clearly defined optimization objective. Objective-based variants such as Soft Topographic Vector Quantization (STVQ) provide a principled formulation, but their neighborhood-coupled computations become expensive as the number of latent nodes increases. In this paper, we propose Self-Organizing Maps with Optimized Latent Positions (SOM-OLP), an objective-based topographic mapping method that introduces a continuous latent position for each data point. Starting from the neighborhood distortion of STVQ, we construct a separable surrogate local cost based on its local quadratic structure and formulate an entropy-regularized objective based on it. This yields a simple block coordinate descent scheme with closed-form updates for assignment probabilities, latent positions, and reference vectors, while guaranteeing monotonic non-increase of the objective and retaining linear per-iteration complexity in the numbers of data points and latent nodes. Experiments on a synthetic saddle manifold, scalability studies on the Digits and MNIST datasets, and 16 benchmark datasets show that SOM-OLP achieves competitive neighborhood preservation and quantization performance, favorable scalability for large numbers of latent nodes and large datasets, and the best average rank among the compared methods on the benchmark datasets.
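The "best average rank" claim in the abstract refers to a standard way of comparing methods across many datasets: rank the methods within each dataset, then average the ranks per method. A minimal sketch of that computation follows. The score table is made-up placeholder data, not results from the paper, and `average_ranks` is a hypothetical helper. Ties are broken by column order rather than averaged, which a careful analysis would handle differently.

```python
import numpy as np

def average_ranks(scores, higher_is_better=True):
    """Average per-dataset rank of each method (rank 1 = best).

    scores: array of shape (n_datasets, n_methods).
    """
    s = np.asarray(scores, dtype=float)
    if higher_is_better:
        s = -s                                   # so that ascending sort puts best first
    order = np.argsort(s, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(s.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, s.shape[1] + 1)
    return ranks.mean(axis=0)

# Placeholder scores: 3 datasets (rows) x 3 methods (columns).
scores = [[0.9, 0.8, 0.7],
          [0.6, 0.9, 0.5],
          [0.8, 0.7, 0.6]]
print(average_ranks(scores))
```

A lower average rank is better, which is why the summary above speaks of the lowest average rank while the abstract speaks of the best one.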

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SOM-OLP adds continuous per-point latent positions and a quadratic surrogate from STVQ to get closed-form block updates and linear scaling, but the surrogate's fidelity to actual neighborhood structure is the unproven load-bearing piece.

The core advance is a formulation that assigns each data point its own continuous latent position, derives a separable quadratic surrogate from the STVQ neighborhood distortion term, and then optimizes an entropy-regularized objective with block coordinate descent. The updates for assignment probabilities, latent positions, and reference vectors are closed-form, the objective is monotonically non-increasing, and per-iteration cost stays linear in data points and nodes. That combination is not in the cited prior work on STVQ or standard SOMs.

Experiments on a synthetic manifold, Digits, MNIST, and 16 benchmarks show competitive neighborhood preservation and quantization error, plus better scaling when the number of latent nodes grows large, and the best average rank among the methods tested. Those results are concrete and the monotonicity guarantee is a genuine plus for an objective-based method.

The soft spot is the quadratic surrogate step itself. The abstract claims it comes from the local quadratic structure of the STVQ distortion, yet gives no explicit expansion, no error bounds, and no argument that stationary points of the surrogate still enforce useful global topology rather than just local fits. The entropy regularization strength is also a free parameter whose effect on the final mapping is not fully characterized. If the approximation holds only in narrow regimes, the claimed topography preservation could weaken.

Readers working on efficient vector quantization or topographic mapping will find the algorithm and scaling results useful. The paper shows clear derivation and honest empirical comparison, so it deserves a serious referee rather than desk rejection, though the surrogate analysis would be the main point for revision.

Referee Report

2 major / 2 minor

Summary. The paper proposes Self-Organizing Maps with Optimized Latent Positions (SOM-OLP), an objective-based topographic mapping method. Starting from the neighborhood distortion term in Soft Topographic Vector Quantization (STVQ), it constructs a separable surrogate local cost via the local quadratic structure of that term, formulates an entropy-regularized objective, and derives a block coordinate descent algorithm with closed-form updates for assignment probabilities, continuous latent positions per data point, and reference vectors. The procedure guarantees monotonic non-increase of the objective while retaining linear per-iteration complexity in the numbers of data points and latent nodes. Experiments on a synthetic saddle manifold, scalability tests on Digits and MNIST, and 16 benchmark datasets report competitive neighborhood preservation and quantization performance, favorable scaling for large latent grids and datasets, and the best average rank among compared methods.

Significance. If the quadratic surrogate retains sufficient topographic structure from STVQ, the work would provide a useful advance in objective-driven SOM variants by delivering closed-form updates, a monotonicity guarantee, and linear complexity without sacrificing mapping quality. The explicit construction from an existing neighborhood distortion term, the block-coordinate scheme, and the extensive empirical comparisons (including scalability and benchmark rankings) are clear strengths that could make the method attractive for large-scale unsupervised topographic mapping tasks.

major comments (2)

[§3 (surrogate construction)] The central construction (described in the abstract and presumably §3) approximates the STVQ neighborhood distortion by its local quadratic structure to obtain a separable surrogate, yet provides neither the explicit quadratic expansion nor the regime (e.g., neighborhood size or curvature bound) under which the approximation is claimed to be accurate. Because separability, closed-form updates, and the claim that stationary points still produce useful topographic mappings all rest on this step, the absence of the expansion and a supporting argument or bound constitutes a load-bearing gap.
[optimization procedure and monotonicity claim] The monotonic non-increase guarantee for the entropy-regularized objective is stated, but it is unclear whether the guarantee holds for the original STVQ distortion or only for the surrogate; if the latter, the manuscript should clarify how much the surrogate can deviate from the true coupled neighborhood term before the topographic properties are lost. This directly affects the interpretation of the experimental neighborhood-preservation results.

minor comments (2)

[abstract and experimental section] The abstract refers to “16 benchmark datasets” without naming them or providing a summary table; the main text should include an explicit list or reference to the supplementary material.
[§2–3] Notation for the continuous latent positions (one per data point) and their relation to the discrete latent grid should be introduced with a clear diagram or equation early in the methods section to aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and insightful comments, which highlight important aspects of the surrogate construction and optimization guarantees. We address each major comment below with clarifications and commit to revisions that strengthen the manuscript without altering its core contributions.

Referee: [§3 (surrogate construction)] The central construction (described in the abstract and presumably §3) approximates the STVQ neighborhood distortion by its local quadratic structure to obtain a separable surrogate, yet provides neither the explicit quadratic expansion nor the regime (e.g., neighborhood size or curvature bound) under which the approximation is claimed to be accurate. Because separability, closed-form updates, and the claim that stationary points still produce useful topographic mappings all rest on this step, the absence of the expansion and a supporting argument or bound constitutes a load-bearing gap.

Authors: We agree that an explicit derivation of the quadratic expansion and a discussion of its validity regime would improve clarity. In the revised manuscript, we will expand §3 to include the second-order Taylor expansion of the STVQ neighborhood distortion term around the current latent positions, showing how the cross terms vanish to yield separability. We will also add a supporting paragraph on the regime: the approximation is accurate when the neighborhood kernel is smooth (e.g., Gaussian with moderate width) and latent position updates remain small between iterations, which is enforced by the block-coordinate scheme. This directly supports why stationary points of the surrogate retain useful topographic structure, as confirmed by the saddle-manifold and benchmark experiments.

Revision: yes

Referee: [optimization procedure and monotonicity claim] The monotonic non-increase guarantee for the entropy-regularized objective is stated, but it is unclear whether the guarantee holds for the original STVQ distortion or only for the surrogate; if the latter, the manuscript should clarify how much the surrogate can deviate from the true coupled neighborhood term before the topographic properties are lost. This directly affects the interpretation of the experimental neighborhood-preservation results.

Authors: The monotonicity guarantee applies strictly to the entropy-regularized surrogate objective, as the closed-form block-coordinate updates are derived for its separable quadratic form. We will revise the text (primarily in §3 and §4) to state this explicitly and add a short discussion of the deviation: because the surrogate matches the local curvature of the original neighborhood term, the difference is second-order in the latent-position change; the descent property on the surrogate therefore induces approximate descent on the original term for sufficiently small steps. We will tie this to the empirical neighborhood-preservation results, noting that the competitive topographic quality on the saddle manifold and 16 benchmarks indicates the deviation does not erode the mapping properties in practice.

Revision: yes

Circularity Check

0 steps flagged

No circularity: derivation proceeds from external STVQ distortion via explicit quadratic surrogate

The paper starts from the neighborhood distortion term of prior STVQ work (Graepel et al.), constructs an explicit separable surrogate exploiting its local quadratic structure, and derives the entropy-regularized objective and closed-form block coordinate descent updates directly from that surrogate. Monotonicity follows from standard surrogate optimization arguments, and complexity claims are linear in data and nodes by construction of the separability. No equation reduces a fitted quantity to a renamed prediction, no uniqueness theorem is imported via self-citation, and the central approximation step is stated as such rather than smuggled or defined circularly. The derivation chain is therefore self-contained against the external STVQ reference and does not collapse to its own inputs.
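The standard argument behind the monotonicity claim can be written out in a few lines. This is a generic sketch, assuming each block update exactly minimizes the surrogate objective $J$ over its own block with the other two held fixed. Here $U$, $Z$, and $W$ denote assignment probabilities, latent positions, and reference vectors; the update order shown is an assumption, and the argument works for any fixed order.

```latex
\begin{aligned}
U^{(t+1)} &= \arg\min_{U} J\big(U, Z^{(t)}, W^{(t)}\big), \\
Z^{(t+1)} &= \arg\min_{Z} J\big(U^{(t+1)}, Z, W^{(t)}\big), \\
W^{(t+1)} &= \arg\min_{W} J\big(U^{(t+1)}, Z^{(t+1)}, W\big), \\[4pt]
J\big(U^{(t+1)}, Z^{(t+1)}, W^{(t+1)}\big)
  &\le J\big(U^{(t+1)}, Z^{(t+1)}, W^{(t)}\big)
   \le J\big(U^{(t+1)}, Z^{(t)}, W^{(t)}\big)
   \le J\big(U^{(t)}, Z^{(t)}, W^{(t)}\big).
\end{aligned}
```

Each inequality holds because the newly updated block is a minimizer over a set that contains the old value. If $J$ is also bounded below, the sequence of objective values converges, although that alone does not imply convergence of the iterates. The chain concerns $J$ only and says nothing about the original coupled STVQ distortion.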

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that a local quadratic approximation to STVQ neighborhood distortion yields a useful separable surrogate, plus the introduction of continuous latent positions as a new modeling choice whose only support is the optimization itself.

free parameters (1)

entropy regularization strength
The entropy term is added to the objective and its coefficient must be chosen to control softness of assignments.
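The role of this coefficient is easy to see numerically. The sketch below assumes the usual closed form for entropy-regularized assignments, a softmax over negative costs divided by the weight; `soft_assign`, `lam`, and the cost values are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np

def soft_assign(costs, lam):
    """Assignment probabilities proportional to exp(-cost / lam)."""
    logits = -np.asarray(costs, dtype=float) / lam
    logits -= logits.max()            # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

costs = [0.2, 0.5, 1.0, 3.0]          # made-up costs for four latent nodes
for lam in (0.01, 0.5, 100.0):
    print(lam, np.round(soft_assign(costs, lam), 3))
```

A small weight concentrates nearly all probability on the cheapest node, approaching a hard assignment, while a large weight pushes the assignments toward uniform. Any tuning of the coefficient is therefore a trade-off between crisp quantization and smooth, overlapping responsibilities.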

axioms (1)

Domain assumption: The neighborhood distortion of STVQ admits a local quadratic structure that can be turned into a separable surrogate cost.
Invoked to construct the surrogate local cost from which the entropy-regularized objective and closed-form updates follow.

invented entities (1)

continuous latent position for each data point (no independent evidence)
purpose: To allow direct optimization of positions in latent space while retaining topographic properties.
New modeling variable introduced in the method; no independent falsifiable prediction outside the optimization is provided.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

[1] T. Kohonen, “Self-organized formation of topologically correct feature maps,” Biological Cybernetics, vol. 43, no. 1, pp. 59–69, 1982

[2] T. Kohonen, “The self-organizing map,” Proceedings of the IEEE, vol. 78, no. 9, pp. 1464–1480, 1990

[3] E. Erwin, K. Obermayer, and K. Schulten, “Self-organizing maps: ordering, convergence properties and energy functions,” Biological Cybernetics, vol. 67, no. 1, pp. 47–55, 1992

[4] S. P. Luttrell, “A Bayesian analysis of self-organizing maps,” Neural Computation, vol. 6, no. 5, pp. 767–794, 1994

[5] T. Heskes, “Energy functions for self-organizing maps,” in Kohonen Maps, Elsevier, 1999, pp. 303–315

[6] T. Graepel, M. Burger, and K. Obermayer, “Phase transitions in stochastic self-organizing maps,” Physical Review E, vol. 56, pp. 3876–3890, 1997

[7] T. Heskes, “Self-organizing maps, vector quantization, and mixture modeling,” IEEE Transactions on Neural Networks, vol. 12, no. 6, pp. 1299–1305, 2001

[8] S. Miyamoto and M. Mukaidono, “Fuzzy c-means as a regularization and maximum entropy approach,” in Proc. 7th Int. Fuzzy Systems Association World Congress (IFSA’97), vol. 2, Prague, Czech Republic, June 1997, pp. 86–92

[9] J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proc. 5th Berkeley Symp. Math. Statist. Probab., vol. 1, L. M. Le Cam and J. Neyman, Eds. Berkeley, CA: University of California Press, 1967, pp. 281–297

[10] I. T. Jolliffe, Principal Component Analysis, 2nd ed. New York: Springer, 2002

[11] C. M. Bishop, M. Svensén, and C. K. I. Williams, “GTM: The generative topographic mapping,” Neural Computation, vol. 10, no. 1, pp. 215–234, 1998

[12] H. A. Gaspar, “ugtm: A Python package for data modeling and visualization using generative topographic mapping,” Journal of Open Research Software, vol. 6, no. 1, p. 26, 2018

[13] J. Venna and S. Kaski, “Neighborhood preservation in nonlinear projection methods: An experimental study,” in Artificial Neural Networks (ICANN 2001), LNCS 2130. Berlin, Heidelberg: Springer, 2001, pp. 485–491

[14] J. A. Lee and M. Verleysen, “Quality assessment of dimensionality reduction: Rank-based criteria,” Neurocomputing, vol. 72, no. 7–9, pp. 1431–1443, 2009

[15] G. Pölzlbauer, “Survey and comparison of quality measures for self-organizing maps,” in Proc. 5th Workshop on Data Analysis (WDA’04), 2004, pp. 67–82

[16] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A next-generation hyperparameter optimization framework,” in Proc. 25th ACM SIGKDD Int. Conf. on Knowledge Discovery & Data Mining (KDD), 2019, pp. 2623–2631

[17] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. VanderPlas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011

[18] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998

[19] M. Kelly, R. Longjohn, and K. Nottingham, “The UCI Machine Learning Repository,” [Online]. Available: https://archive.ics.uci.edu

[20] J. Walter and H. Ritter, “Rapid learning with parametrized self-organizing maps,” Neurocomputing, vol. 12, pp. 131–153, 1996

[21] V. Fortuin, M. Hüser, F. Locatello, H. Strathmann, and G. Rätsch, “SOM-VAE: Interpretable discrete representation learning on time series,” in Proc. 7th Int. Conf. on Learning Representations (ICLR), 2019

[22] I. Urbanik and P. Gajewski, “SatSOM: Saturation self-organizing maps for continual learning,” 2025, arXiv:2506.10680v5. [Online]. Available: https://arxiv.org/abs/2506.10680

[23] M. Moor, M. Horn, B. Rieck, and K. Borgwardt, “Topological autoencoders,” in Proc. 37th Int. Conf. on Machine Learning (ICML), vol. 119. PMLR, 2020, pp. 7045–7054
