arxiv: 2512.24062 · v2 · submitted 2025-12-30 · 💻 cs.LG

Energy-Balanced Hyperspherical Graph Representation Learning via Structural Binding and Entropic Dispersion

Pith reviewed 2026-05-16 19:17 UTC · model grok-4.3

classification 💻 cs.LG
keywords graph representation learning · hyperspherical embeddings · energy-based models · over-smoothing · node classification · thermodynamic constraints · uniformity regularization

The pith

Graph nodes reach a balanced energy state on a unit hypersphere when local structural binding competes against global repulsive dispersion under an adaptive thermostat.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models graph representation learning as a search for energy equilibrium on a latent manifold. Standard message passing dissipates energy without bound, pushing representations toward collapse and over-smoothing. HyperGRL counters this by minimizing a Helmholtz free energy that adds a neighbor-mean alignment term for local cohesion and a sampling-free uniformity term for global spread. An entropy-guided adaptive thermostat modulates temperature to keep the system in a metastable balance. Experiments across node classification, clustering, and link prediction show the resulting embeddings remain discriminative on multiple benchmarks.

Core claim

Graph Representation Learning is cast as a physical process of seeking an energy equilibrium state for a node system on a latent manifold. By minimizing a Helmholtz free energy objective that combines Structural Binding Energy (via Neighbor-Mean Alignment) as a local binding force with Mean-Field Repulsive Potential (via Sampling-Free Uniformity) as a global entropic force, and governing their trade-off through an entropy-guided Adaptive Thermostat, nodes are embedded on a unit hypersphere in a robust metastable state that preserves both structural cohesion and representation discriminability.
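
For orientation, the thermodynamic analogy in the core claim can be written out. This page does not reproduce the paper's equations, so the second line below is only an assumed correspondence: the symbols \mathcal{L}_{\text{align}} and \mathcal{L}_{\text{unif}} and the identification of the thermostat weight with \alpha (the symbol used in the Figure 2 caption) are illustrative choices, not the authors' verbatim objective.

```latex
% Standard Helmholtz free energy: internal energy minus temperature times entropy.
F = U - T S

% Assumed correspondence for the training objective (illustrative only):
% the alignment loss plays the role of U, the uniformity loss stands in for -S,
% and the thermostat weight \alpha plays the role of the temperature T.
\mathcal{L}(\theta)
  = \underbrace{\mathcal{L}_{\text{align}}(\theta)}_{\text{binding energy } U}
  + \alpha \, \underbrace{\mathcal{L}_{\text{unif}}(\theta)}_{\text{stands in for } -S}
```

Under this reading, a larger \alpha weights dispersion more heavily, while a smaller \alpha lets local cohesion dominate.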

What carries the argument

The Helmholtz free energy objective on the unit hypersphere, formed by Structural Binding Energy via Neighbor-Mean Alignment (local cohesion) competing with Mean-Field Repulsive Potential via Sampling-Free Uniformity (global dispersion), dynamically regulated by an entropy-guided Adaptive Thermostat that adjusts the system's temperature.
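
To make the two competing terms concrete, here is a minimal NumPy sketch. The paper's exact formulas are not shown on this page, so both functional forms are assumptions: the binding term is taken as one minus the cosine between each node and the normalized mean of its neighbors, and the repulsive term is taken as the squared norm of the mean embedding, which equals the average inner product over all node pairs and therefore needs no negative sampling. The names `neighbor_mean_alignment`, `mean_field_uniformity`, `free_energy`, and `alpha` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project each row onto the unit hypersphere."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def neighbor_mean_alignment(z, adj):
    """Assumed binding term: mean of 1 - cos(z_i, normalized neighbor mean of i)."""
    deg = adj.sum(axis=1, keepdims=True)
    has_neighbors = deg[:, 0] > 0
    if not has_neighbors.any():
        return 0.0
    neigh_mean = l2_normalize(adj @ z / np.maximum(deg, 1.0))
    cos = np.sum(z * neigh_mean, axis=1)
    return float(np.mean(1.0 - cos[has_neighbors]))

def mean_field_uniformity(z):
    """Assumed repulsive term: ||mean_i z_i||^2 = (1/N^2) * sum_{i,j} z_i . z_j.

    Costs O(N d) and uses no sampled pairs; it is 1 when all unit vectors
    coincide and 0 when they cancel out.
    """
    return float(np.sum(z.mean(axis=0) ** 2))

def free_energy(h, adj, alpha):
    """Assumed combined objective: alignment + alpha * uniformity on normalized h."""
    z = l2_normalize(h)
    return neighbor_mean_alignment(z, adj) + alpha * mean_field_uniformity(z)

# Usage on a 4-node path graph with random encoder outputs (synthetic data).
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
h = rng.normal(size=(4, 8))
print(free_energy(h, adj, alpha=0.5))
```

A fully collapsed embedding makes the alignment term vanish but drives the uniformity term to its maximum of 1, which is the tension the objective is meant to exploit.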

If this is right

  • Node embeddings become more discriminative while reducing over-smoothing on classification, clustering, and link-prediction tasks.
  • Representations stay uniformly dispersed on the hypersphere without requiring negative samples.
  • The adaptive thermostat automatically trades off local alignment against global uniformity during training.
  • The same energy-minimization view applies across diverse benchmark graphs without dataset-specific redesign.
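
The thermostat described in the third bullet can be sketched as a simple feedback rule. This is an illustration under stated assumptions rather than the authors' update: the entropy proxy (Shannon entropy of the eigenvalue spectrum of the embedding second-moment matrix), the multiplicative update, and the names `spectral_entropy`, `thermostat_step`, `rate`, `alpha_min`, and `alpha_max` are all hypothetical. Only the existence of a target entropy (H_target in the Figure 3 caption) and an adaptive α (Figure 2 caption) comes from the page.

```python
import numpy as np

def spectral_entropy(z, eps=1e-12):
    """Entropy proxy: Shannon entropy of the normalized eigenvalues of Z^T Z / N.

    For unit-norm rows this is 0 when all embeddings coincide and log(d)
    when variance is spread evenly over d orthogonal directions.
    """
    second_moment = z.T @ z / z.shape[0]
    eig = np.clip(np.linalg.eigvalsh(second_moment), 0.0, None)
    p = eig / max(eig.sum(), eps)
    p = p[p > eps]
    return float(-np.sum(p * np.log(p)))

def thermostat_step(alpha, entropy, h_target, rate=0.1, alpha_min=1e-3, alpha_max=10.0):
    """Raise alpha when entropy is below target, lower it when above, then clip."""
    new_alpha = alpha * np.exp(rate * (h_target - entropy))
    return float(np.clip(new_alpha, alpha_min, alpha_max))

# Usage: a collapsed batch has zero entropy, so the repulsion weight grows.
z_collapsed = np.tile(np.array([[1.0, 0.0, 0.0]]), (5, 1))
alpha = thermostat_step(1.0, spectral_entropy(z_collapsed), h_target=np.log(3))
print(alpha)
```

The multiplicative form keeps α positive without a separate constraint; the clipping bounds are a guard against runaway values when mini-batch entropy estimates are noisy.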

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The thermodynamic framing could be tested on temporal or heterogeneous graphs to see whether the same binding-plus-repulsion balance prevents collapse in those settings.
  • If the sampling-free uniformity term scales well, it may replace contrastive losses in other embedding domains such as knowledge graphs or molecular structures.
  • The approach suggests that explicit energy constraints could be added to existing message-passing layers rather than replacing them entirely.

Load-bearing premise

The two competing energy terms together with the adaptive thermostat will reliably drive the node system to a metastable equilibrium that balances cohesion and dispersion without creating new instabilities or demanding heavy hyperparameter search.

What would settle it

If HyperGRL produces feature collapse, measurable over-smoothing, or lower accuracy than standard GNN baselines on widely used citation datasets such as Cora, CiteSeer, or PubMed, the claim that the energy balance yields robust representations would be refuted.
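
The test above depends on how collapse and over-smoothing are measured. The sketch below shows two commonly used diagnostics, mean pairwise cosine similarity and an edge-averaged Dirichlet energy; whether the paper reports these particular metrics is not stated on this page, and the function names are hypothetical. The adjacency matrix is assumed to be symmetric.

```python
import numpy as np

def mean_pairwise_cosine(z, eps=1e-12):
    """Average cosine similarity over distinct node pairs; values near 1 suggest collapse."""
    zn = z / np.maximum(np.linalg.norm(z, axis=1, keepdims=True), eps)
    n = zn.shape[0]
    all_pairs = np.sum(zn.sum(axis=0) ** 2)   # sum over every (i, j), including i == j
    self_pairs = np.sum(zn * zn)              # the i == j contributions
    return float((all_pairs - self_pairs) / (n * (n - 1)))

def dirichlet_energy(z, adj):
    """Mean of ||z_i - z_j||^2 over undirected edges; values near 0 suggest over-smoothing."""
    rows, cols = np.nonzero(np.triu(adj, k=1))
    if rows.size == 0:
        return 0.0
    diffs = z[rows] - z[cols]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

# Usage on a tiny synthetic example: two orthogonal unit embeddings joined by one edge.
z = np.array([[1.0, 0.0], [0.0, 1.0]])
adj = np.array([[0.0, 1.0], [1.0, 0.0]])
print(mean_pairwise_cosine(z), dirichlet_energy(z, adj))
```

Tracking both quantities per layer or per epoch would show whether embeddings drift toward a single point (cosine rising toward 1, Dirichlet energy falling toward 0).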

Figures

Figures reproduced from arXiv: 2512.24062 by Hongbin Wang, Junjun Guo, Rui Chen, Yantuan Xian, Yan Xiang, Zhengtao Yu.

Figure 1: Overview of HyperGRL. Given a graph G(A, X), a graph augmentation T produces an augmented graph G(A′, X′). This augmented graph is then encoded by a GNN f_θ to generate node representations H. These representations are subsequently normalized onto a hyperspherical space to yield Z, where training is driven by two complementary objectives: the Neighbor-Mean Alignment loss L_align, which pulls each node to…
Figure 2: Effect of adaptive α.
Figure 3: Impact of the target entropy H_target.
Figure 4: Impact of the neighbor-mean order k.
Figure 5: Performance on node classification (Accuracy %) with di…
Figure 6: Performance on node classification (Accuracy %) with di…
Figure 7: t-SNE embeddings of nodes in the Cora dataset. Each color represents a distinct class.
Original abstract

Graph Representation Learning (GRL) can be fundamentally modeled as a physical process of seeking an energy equilibrium state for a node system on a latent manifold. However, existing Graph Neural Networks (GNNs) often suffer from uncontrolled energy dissipation during message passing, driving the system towards a state of Thermal Death--manifested as feature collapse or over-smoothing--due to the absence of explicit thermodynamic constraints. To address this, we propose HyperGRL, a thermodynamics-driven framework that embeds nodes on a unit hypersphere by minimizing a Helmholtz free energy objective composed of two competing potentials. First, we introduce Structural Binding Energy (via Neighbor-Mean Alignment), which functions as a local binding force to strengthen structural cohesion, encouraging structurally related nodes to form compact local clusters. Second, to counteract representation collapse, we impose a Mean-Field Repulsive Potential (via Sampling-Free Uniformity), which acts as a global entropic force to maximize representation dispersion without the need for negative sampling. Crucially, to govern the trade-off between local alignment and global uniformity, we devise an Adaptive Thermostat. This entropy-guided strategy dynamically regulates the system's "temperature" during training, guiding the representation towards a robust metastable state that balances local cohesion with global discriminability. Extensive experiments on node classification, node clustering, and link prediction show that HyperGRL consistently achieves strong performance across diverse benchmark datasets, yielding more discriminative and robust representations while alleviating over-smoothing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes HyperGRL, a thermodynamics-driven framework for graph representation learning that embeds nodes on the unit hypersphere by minimizing a Helmholtz free energy objective. This objective combines Structural Binding Energy (via Neighbor-Mean Alignment) to promote local structural cohesion with a Mean-Field Repulsive Potential (via Sampling-Free Uniformity) to enforce global dispersion, with the trade-off governed by an entropy-guided Adaptive Thermostat that dynamically regulates temperature to reach a metastable state. The approach is motivated as a remedy for over-smoothing and feature collapse in GNNs, and extensive experiments on node classification, node clustering, and link prediction are reported to show consistent improvements across benchmark datasets.

Significance. If the central claims hold, the work supplies a physically motivated objective for controlling the balance between local alignment and global uniformity in hyperspherical GRL, which could offer a scalable, sampling-free alternative to contrastive methods and a dynamic mechanism for mitigating over-smoothing. The adaptive thermostat and sampling-free uniformity are practical strengths that may generalize beyond the evaluated tasks.

major comments (2)
  1. [Method section on Adaptive Thermostat] The central claim that the Adaptive Thermostat drives the system to a reliable metastable balance between cohesion and discriminability rests on unproven dynamics; no Lyapunov-style argument, fixed-point characterization, or discrete-time convergence bound is supplied for the combined Helmholtz objective under mini-batch entropy estimates (see the description of the Adaptive Thermostat and the free-energy formulation).
  2. [Experiments] Performance claims of 'strong performance' and alleviation of over-smoothing are asserted without reported effect sizes, statistical significance, or ablation results isolating the contributions of the binding energy, repulsive potential, and thermostat; this leaves the empirical support for the central claim difficult to evaluate (see Experiments section).
minor comments (2)
  1. [Method] Explicit equations for the Helmholtz free energy, the Structural Binding Energy, and the Mean-Field Repulsive Potential should be provided early in the method section to clarify the objective and the role of the thermostat scaling factor.
  2. [Related Work] The manuscript would benefit from additional citations to prior hyperspherical embedding work and thermodynamic analogies in representation learning to better situate the novelty.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, indicating planned revisions where appropriate to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Method section on Adaptive Thermostat] The central claim that the Adaptive Thermostat drives the system to a reliable metastable balance between cohesion and discriminability rests on unproven dynamics; no Lyapunov-style argument, fixed-point characterization, or discrete-time convergence bound is supplied for the combined Helmholtz objective under mini-batch entropy estimates (see the description of the Adaptive Thermostat and the free-energy formulation).

    Authors: We acknowledge that the manuscript does not provide a formal Lyapunov-style argument, fixed-point characterization, or discrete-time convergence bound for the Adaptive Thermostat under mini-batch entropy estimates. The thermostat is designed as an entropy-guided mechanism to dynamically balance the Structural Binding Energy and Mean-Field Repulsive Potential toward a metastable state, with its behavior supported by the overall Helmholtz free-energy formulation. In the revision, we will add an empirical analysis of the training dynamics, including plots tracking the evolution of free-energy components, entropy estimates, and representation metrics over epochs across multiple runs, along with a discussion of observed fixed-point behavior in practice. A complete theoretical convergence proof under general conditions is beyond the scope of this work and would require additional assumptions on graph properties; we will note this limitation explicitly while emphasizing the practical stability observed in experiments. revision: partial

  2. Referee: [Experiments] Performance claims of 'strong performance' and alleviation of over-smoothing are asserted without reported effect sizes, statistical significance, or ablation results isolating the contributions of the binding energy, repulsive potential, and thermostat; this leaves the empirical support for the central claim difficult to evaluate (see Experiments section).

    Authors: We agree that the current experimental reporting lacks effect sizes, statistical significance tests, and component-wise ablations, which weakens the evaluation of the central claims. In the revised manuscript, we will include quantitative effect sizes (e.g., mean improvements with standard deviations over 10 random seeds), paired statistical significance tests (e.g., t-tests with p-values) comparing against baselines, and dedicated ablation studies that isolate the Structural Binding Energy, Mean-Field Repulsive Potential, and Adaptive Thermostat. We will also add visualizations and metrics specifically demonstrating alleviation of over-smoothing (e.g., node embedding variance and homophily preservation across layers). These changes will provide clearer, more rigorous empirical support. revision: yes
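
The reporting promised in this response (per-seed effect sizes and paired tests) can be sketched as follows. The accuracy values are synthetic placeholders for illustration only, not results from the paper, and the helper name `paired_summary` is hypothetical. The sketch computes the paired t statistic and a paired effect size (Cohen's d_z) by hand; a p-value could then be obtained from a t distribution with the returned degrees of freedom, for example via `scipy.stats.ttest_rel`.

```python
import math
import numpy as np

def paired_summary(acc_method, acc_baseline):
    """Paired comparison of per-seed accuracies for two methods on the same seeds."""
    a = np.asarray(acc_method, dtype=float)
    b = np.asarray(acc_baseline, dtype=float)
    if a.ndim != 1 or a.shape != b.shape or a.size < 2:
        raise ValueError("need two equal-length 1-D arrays with at least 2 seeds")
    diff = a - b
    sd = diff.std(ddof=1)  # sample standard deviation of the paired differences
    if sd == 0.0:
        raise ValueError("paired differences have zero variance")
    mean_diff = diff.mean()
    return {
        "mean_diff": float(mean_diff),
        "sd_diff": float(sd),
        "t_stat": float(mean_diff / (sd / math.sqrt(diff.size))),
        "dof": int(diff.size - 1),
        "cohen_dz": float(mean_diff / sd),
    }

# Synthetic placeholder accuracies over 5 seeds (NOT taken from the paper).
method = [82.1, 83.4, 82.8, 83.0, 82.5]
baseline = [81.0, 82.9, 81.7, 82.2, 81.9]
print(paired_summary(method, baseline))
```

Pairing by seed matters here because split and initialization noise is shared between methods; an unpaired test would usually understate the evidence for a consistent but small gain.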

Circularity Check

1 step flagged

Helmholtz free energy objective and Adaptive Thermostat defined to enforce the claimed metastable balance by construction

specific steps
  1. self-definitional [Abstract]
    "we propose HyperGRL, a thermodynamics-driven framework that embeds nodes on a unit hypersphere by minimizing a Helmholtz free energy objective composed of two competing potentials. First, we introduce Structural Binding Energy (via Neighbor-Mean Alignment)... Second, to counteract representation collapse, we impose a Mean-Field Repulsive Potential (via Sampling-Free Uniformity)... Crucially, to govern the trade-off between local alignment and global uniformity, we devise an Adaptive Thermostat. This entropy-guided strategy dynamically regulates the system's 'temperature' during training, guid["

    The metastable state balancing local cohesion with global discriminability is not derived; it is the explicit target of the objective that is defined as the sum of the two potentials whose relative weighting is controlled by the thermostat the authors introduce for that purpose. Minimizing the constructed free energy therefore produces the claimed balance by definition.

full rationale

The paper defines the core objective as minimization of a Helmholtz free energy composed of the newly introduced Structural Binding Energy and Mean-Field Repulsive Potential, with the Adaptive Thermostat explicitly devised to regulate their trade-off and drive the system to the desired metastable state. This makes the reported balance of cohesion and discriminability (and alleviation of over-smoothing) a direct consequence of the model construction rather than an independent prediction or emergent result. No external stability analysis or fixed-point derivation is provided to break the self-definition. Experiments then evaluate performance on the same constructed dynamics, yielding partial circularity (score 6).

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 3 invented entities

The framework rests on the domain assumption that graph structure can be faithfully captured by local mean alignment and global uniformity on a hypersphere, plus the ad-hoc introduction of three new energy terms whose functional forms and interaction are defined within the paper.

free parameters (1)
  • thermostat entropy scaling factor
    Controls the dynamic temperature adjustment; its value is not derived from first principles and must be chosen or adapted during training.
axioms (2)
  • domain assumption: Node representations lie on the unit hypersphere
    Stated as the embedding space without derivation from graph properties.
  • ad hoc to paper: Helmholtz free energy is an appropriate objective for GRL equilibrium
    Invoked to justify the binding-plus-repulsion decomposition.
invented entities (3)
  • Structural Binding Energy (no independent evidence)
    purpose: Local force encouraging structurally related nodes to cluster
    Newly defined via Neighbor-Mean Alignment; no independent physical justification supplied.
  • Mean-Field Repulsive Potential (no independent evidence)
    purpose: Global force maximizing dispersion without negative samples
    Newly defined via Sampling-Free Uniformity; introduced to counteract collapse.
  • Adaptive Thermostat (no independent evidence)
    purpose: Dynamic regulator of system temperature based on entropy
    New control strategy to balance the two potentials during optimization.

pith-pipeline@v0.9.0 · 5570 in / 1559 out tokens · 55312 ms · 2026-05-16T19:17:45.432525+00:00 · methodology

