Energy-Balanced Hyperspherical Graph Representation Learning via Structural Binding and Entropic Dispersion

Hongbin Wang; Junjun Guo; Rui Chen; Yantuan Xian; Yan Xiang; Zhengtao Yu

arXiv: 2512.24062 · v2 · submitted 2025-12-30 · cs.LG

Pith reviewed 2026-05-16 19:17 UTC · model grok-4.3

classification: cs.LG

keywords: graph representation learning · hyperspherical embeddings · energy-based models · over-smoothing · node classification · thermodynamic constraints · uniformity regularization

The pith

Graph nodes reach a balanced energy state on a unit hypersphere when local structural binding competes against global repulsive dispersion under an adaptive thermostat.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models graph representation learning as a search for energy equilibrium on a latent manifold. Standard message passing dissipates energy without bound, pushing representations toward collapse and over-smoothing. HyperGRL counters this by minimizing a Helmholtz free energy that adds a neighbor-mean alignment term for local cohesion and a sampling-free uniformity term for global spread. An entropy-guided adaptive thermostat modulates temperature to keep the system in a metastable balance. Experiments across node classification, clustering, and link prediction show the resulting embeddings remain discriminative on multiple benchmarks.

Core claim

Graph Representation Learning is cast as a physical process of seeking an energy equilibrium state for a node system on a latent manifold. By minimizing a Helmholtz free energy objective that combines Structural Binding Energy (via Neighbor-Mean Alignment) as a local binding force with Mean-Field Repulsive Potential (via Sampling-Free Uniformity) as a global entropic force, and governing their trade-off through an entropy-guided Adaptive Thermostat, nodes are embedded on a unit hypersphere in a robust metastable state that preserves both structural cohesion and representation discriminability.
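
For orientation, the textbook Helmholtz free energy can be set beside one plausible reading of the objective described above. The symbols E_bind, E_rep, and the identification of α with temperature are assumptions introduced here for illustration; the paper's own equations are not reproduced on this page.

```latex
% Textbook definition: internal energy U, temperature T, entropy S
F = U - T S

% Assumed correspondence (not the paper's stated equations):
%   U  <->  E_bind (Structural Binding Energy)
%   T  <->  \alpha (weight set by the Adaptive Thermostat)
%  -S  <->  E_rep  (Mean-Field Repulsive Potential)
\mathcal{L}(Z) = E_{\mathrm{bind}}(Z) + \alpha \, E_{\mathrm{rep}}(Z),
\qquad \lVert z_i \rVert_2 = 1 \ \text{for every node } i
```

Under this reading, lowering E_rep plays the role of raising entropy, so a larger α favors dispersion over cohesion.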

What carries the argument

The Helmholtz free energy objective on the unit hypersphere, formed by Structural Binding Energy via Neighbor-Mean Alignment (local cohesion) competing with Mean-Field Repulsive Potential via Sampling-Free Uniformity (global dispersion), dynamically regulated by an entropy-guided Adaptive Thermostat that adjusts the system's temperature.
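
A minimal sketch of how two such competing terms could be computed, assuming unit-norm embeddings stored as Python lists and an adjacency dictionary. The function names and the exact form of both terms are hypothetical stand-ins, not the paper's definitions. The dispersion term uses the squared norm of the centroid, which equals the average inner product over all ordered node pairs and therefore needs no negative sampling.

```python
import math

def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0.0 else [x / norm for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def binding_energy(z, neighbors):
    """Assumed alignment term: average of 1 - cos(z_i, mean of i's neighbors)."""
    costs = []
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue  # isolated nodes have no neighbor mean
        dim = len(z[i])
        mean = [sum(z[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        costs.append(1.0 - dot(z[i], normalize(mean)))
    return sum(costs) / len(costs) if costs else 0.0

def repulsion_energy(z):
    """Assumed sampling-free dispersion term: squared norm of the centroid.

    Equals the mean of z_i . z_j over all ordered pairs (i, j), computed
    in O(N * d) without enumerating or sampling pairs.
    """
    n, dim = len(z), len(z[0])
    centroid = [sum(v[d] for v in z) / n for d in range(dim)]
    return dot(centroid, centroid)

def free_energy(z, neighbors, alpha):
    """Weighted sum of the two terms; alpha stands in for the temperature."""
    return binding_energy(z, neighbors) + alpha * repulsion_energy(z)
```

With fully collapsed embeddings the binding term is zero and the repulsion term is at its maximum of 1; with embeddings spread evenly around the sphere the repulsion term vanishes while the binding term grows, which is the tension the objective is meant to balance.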

If this is right

Node embeddings become more discriminative, and over-smoothing is reduced, on classification, clustering, and link-prediction tasks.
Representations stay uniformly dispersed on the hypersphere without requiring negative samples.
The adaptive thermostat automatically trades off local alignment against global uniformity during training.
The same energy-minimization view applies across diverse benchmark graphs without dataset-specific redesign.
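
The automatic trade-off in the third point can be pictured as a simple feedback controller. The sketch below is an assumption, not the paper's algorithm: it nudges a weight `alpha` up when a measured entropy falls below a target and down when it exceeds it, working in log space so that `alpha` stays positive. The names `update_temperature`, `lr`, `alpha_min`, and `alpha_max` are hypothetical, and how the entropy itself is estimated is left open.

```python
import math

def update_temperature(alpha, entropy, target_entropy, lr=0.1,
                       alpha_min=1e-3, alpha_max=1e3):
    """One step of an assumed entropy-guided controller for alpha.

    Entropy below target (embeddings too concentrated) raises alpha so the
    dispersion term gains weight; entropy above target lowers it.
    """
    log_alpha = math.log(alpha) + lr * (target_entropy - entropy)
    # Clamp in log space to keep alpha bounded and avoid overflow in exp().
    log_alpha = min(math.log(alpha_max), max(math.log(alpha_min), log_alpha))
    return math.exp(log_alpha)
```

Calling this once per training step with the current entropy estimate would leave `alpha` unchanged only when the estimate matches the target, which is the fixed point a stability analysis would need to characterize.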

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The thermodynamic framing could be tested on temporal or heterogeneous graphs to see whether the same binding-plus-repulsion balance prevents collapse in those settings.
If the sampling-free uniformity term scales well, it may replace contrastive losses in other embedding domains such as knowledge graphs or molecular structures.
The approach suggests that explicit energy constraints could be added to existing message-passing layers rather than replacing them entirely.

Load-bearing premise

The two competing energy terms together with the adaptive thermostat will reliably drive the node system to a metastable equilibrium that balances cohesion and dispersion without creating new instabilities or demanding heavy hyperparameter search.

What would settle it

If HyperGRL produces feature collapse, measurable over-smoothing, or lower accuracy than standard GNN baselines on widely used citation datasets such as Cora, CiteSeer, or PubMed, the claim that the energy balance yields robust representations would be refuted.
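
One way to make "feature collapse" measurable is the mean pairwise cosine similarity of the learned embeddings, where values near 1 indicate that all nodes point in nearly the same direction. This is a common diagnostic offered here as an illustration; the function name is hypothetical and the paper may use a different metric.

```python
import math
from itertools import combinations

def mean_pairwise_cosine(embeddings):
    """Average cosine similarity over all unordered pairs of embeddings.

    Assumes at least two non-zero vectors. Cost is O(N^2 * d), so this is
    meant for small graphs or a subsample of nodes.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two embeddings")
    unit = []
    for v in embeddings:
        norm = math.sqrt(sum(x * x for x in v))
        unit.append([x / norm for x in v])
    pairs = list(combinations(unit, 2))
    total = sum(sum(a * b for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)
```

Tracking this value across layers or training epochs, for HyperGRL and for a baseline GNN, would give a direct check on the over-smoothing claim.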

Figures

Figures reproduced from arXiv: 2512.24062 by Hongbin Wang, Junjun Guo, Rui Chen, Yantuan Xian, Yan Xiang, Zhengtao Yu.

**Figure 1.** Overview of HyperGRL. Given a graph G(A, X), a graph augmentation T produces an augmented graph G(A′, X′). This augmented graph is then encoded by a GNN f_θ to generate node representations H. These representations are subsequently normalized onto a hyperspherical space to yield Z, where training is driven by two complementary objectives: the Neighbor-Mean Alignment loss L_align, which pulls each node to…

**Figure 2.** Effect of adaptive α.

**Figure 3.** Impact of the target entropy H_target.

**Figure 4.** Impact of the neighbor-mean order k (varied from 1 to 3 with all other hyperparameters fixed).

**Figure 5.** Performance on node classification (Accuracy %) with di…

**Figure 6.** Performance on node classification (Accuracy %) with di…

**Figure 7.** t-SNE embeddings of nodes in the Cora dataset. Each color represents a distinct class.

Original abstract

Graph Representation Learning (GRL) can be fundamentally modeled as a physical process of seeking an energy equilibrium state for a node system on a latent manifold. However, existing Graph Neural Networks (GNNs) often suffer from uncontrolled energy dissipation during message passing, driving the system towards a state of Thermal Death--manifested as feature collapse or over-smoothing--due to the absence of explicit thermodynamic constraints. To address this, we propose HyperGRL, a thermodynamics-driven framework that embeds nodes on a unit hypersphere by minimizing a Helmholtz free energy objective composed of two competing potentials. First, we introduce Structural Binding Energy (via Neighbor-Mean Alignment), which functions as a local binding force to strengthen structural cohesion, encouraging structurally related nodes to form compact local clusters. Second, to counteract representation collapse, we impose a Mean-Field Repulsive Potential (via Sampling-Free Uniformity), which acts as a global entropic force to maximize representation dispersion without the need for negative sampling. Crucially, to govern the trade-off between local alignment and global uniformity, we devise an Adaptive Thermostat. This entropy-guided strategy dynamically regulates the system's "temperature" during training, guiding the representation towards a robust metastable state that balances local cohesion with global discriminability. Extensive experiments on node classification, node clustering, and link prediction show that HyperGRL consistently achieves strong performance across diverse benchmark datasets, yielding more discriminative and robust representations while alleviating over-smoothing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The thermodynamic framing for hyperspherical GRL is novel but unsupported by derivations or evidence in the current version.

This paper offers a novel thermodynamic framing of hyperspherical graph representation learning as a remedy for over-smoothing, but the current version lacks the mathematical and empirical backing to make a strong case.

What stands out as new is a Helmholtz free energy objective that combines a Structural Binding Energy based on neighbor-mean alignment (for local cohesion) with a Mean-Field Repulsive Potential (for global uniformity on the unit hypersphere), regulated by an entropy-guided Adaptive Thermostat. This specific combination does not appear in the prior work mentioned. The paper does a good job of motivating the problem as energy dissipation leading to feature collapse, and of proposing competing potentials that reach a balanced state without relying on negative samples.

Where it falls short is in the details. There are no derivations showing how the potentials are formulated, how the thermostat is implemented, or why it leads to stable dynamics. The stress-test note correctly points out the missing formal stability analysis, which leaves open the possibility of oscillations or collapse. Additionally, while the abstract mentions strong performance on node classification, clustering, and link prediction, no numbers, tables, or comparisons are provided, making it impossible to judge the actual gains or whether the method works as described. The citation pattern looks typical, but without the full text it is difficult to see whether key related work is addressed.

This is the kind of paper that could interest researchers in graph representation learning who are exploring new regularization strategies or physics analogies. A serious reader might find the framing useful for thinking about their own models, but the missing pieces mean it does not yet deliver a complete contribution.

My recommendation is to send it for peer review after the authors include the full mathematical derivations, the thermostat algorithm, convergence arguments, and the detailed experimental results with ablations. The core idea is solid enough that referees could help refine it into something more robust.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes HyperGRL, a thermodynamics-driven framework for graph representation learning that embeds nodes on the unit hypersphere by minimizing a Helmholtz free energy objective. This objective combines Structural Binding Energy (via Neighbor-Mean Alignment) to promote local structural cohesion with a Mean-Field Repulsive Potential (via Sampling-Free Uniformity) to enforce global dispersion, with the trade-off governed by an entropy-guided Adaptive Thermostat that dynamically regulates temperature to reach a metastable state. The approach is motivated as a remedy for over-smoothing and feature collapse in GNNs, and extensive experiments on node classification, node clustering, and link prediction are reported to show consistent improvements across benchmark datasets.

Significance. If the central claims hold, the work supplies a physically motivated objective for controlling the balance between local alignment and global uniformity in hyperspherical GRL, which could offer a scalable, sampling-free alternative to contrastive methods and a dynamic mechanism for mitigating over-smoothing. The adaptive thermostat and sampling-free uniformity are practical strengths that may generalize beyond the evaluated tasks.

major comments (2)

[Method section on Adaptive Thermostat] The central claim that the Adaptive Thermostat drives the system to a reliable metastable balance between cohesion and discriminability rests on unproven dynamics; no Lyapunov-style argument, fixed-point characterization, or discrete-time convergence bound is supplied for the combined Helmholtz objective under mini-batch entropy estimates (see the description of the Adaptive Thermostat and the free-energy formulation).
[Experiments] Performance claims of 'strong performance' and alleviation of over-smoothing are asserted without reported effect sizes, statistical significance, or ablation results isolating the contributions of the binding energy, repulsive potential, and thermostat; this leaves the empirical support for the central claim difficult to evaluate (see Experiments section).

minor comments (2)

[Method] Explicit equations for the Helmholtz free energy, the Structural Binding Energy, and the Mean-Field Repulsive Potential should be provided early in the method section to clarify the objective and the role of the thermostat scaling factor.
[Related Work] The manuscript would benefit from additional citations to prior hyperspherical embedding work and thermodynamic analogies in representation learning to better situate the novelty.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, indicating planned revisions where appropriate to strengthen the manuscript.

Referee: [Method section on Adaptive Thermostat] The central claim that the Adaptive Thermostat drives the system to a reliable metastable balance between cohesion and discriminability rests on unproven dynamics; no Lyapunov-style argument, fixed-point characterization, or discrete-time convergence bound is supplied for the combined Helmholtz objective under mini-batch entropy estimates (see the description of the Adaptive Thermostat and the free-energy formulation).

Authors: We acknowledge that the manuscript does not provide a formal Lyapunov-style argument, fixed-point characterization, or discrete-time convergence bound for the Adaptive Thermostat under mini-batch entropy estimates. The thermostat is designed as an entropy-guided mechanism to dynamically balance the Structural Binding Energy and Mean-Field Repulsive Potential toward a metastable state, with its behavior supported by the overall Helmholtz free-energy formulation. In the revision, we will add an empirical analysis of the training dynamics, including plots tracking the evolution of free-energy components, entropy estimates, and representation metrics over epochs across multiple runs, along with a discussion of observed fixed-point behavior in practice. A complete theoretical convergence proof under general conditions is beyond the scope of this work and would require additional assumptions on graph properties; we will note this limitation explicitly while emphasizing the practical stability observed in experiments.

revision: partial

Referee: [Experiments] Performance claims of 'strong performance' and alleviation of over-smoothing are asserted without reported effect sizes, statistical significance, or ablation results isolating the contributions of the binding energy, repulsive potential, and thermostat; this leaves the empirical support for the central claim difficult to evaluate (see Experiments section).

Authors: We agree that the current experimental reporting lacks effect sizes, statistical significance tests, and component-wise ablations, which weakens the evaluation of the central claims. In the revised manuscript, we will include quantitative effect sizes (e.g., mean improvements with standard deviations over 10 random seeds), paired statistical significance tests (e.g., t-tests with p-values) comparing against baselines, and dedicated ablation studies that isolate the Structural Binding Energy, Mean-Field Repulsive Potential, and Adaptive Thermostat. We will also add visualizations and metrics specifically demonstrating alleviation of over-smoothing (e.g., node embedding variance and homophily preservation across layers). These changes will provide clearer, more rigorous empirical support.

revision: yes
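
The promised effect sizes and paired tests reduce to a short calculation over per-seed scores. The sketch below assumes two equal-length lists of accuracies, one per method, obtained with the same seeds; the function name is hypothetical and no numbers from the paper are used. It returns the mean improvement, its sample standard deviation, the paired t statistic, and the degrees of freedom.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for per-seed scores of two methods.

    Returns (mean difference, sample std of differences, t, degrees of freedom).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least two seeds")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0.0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd / math.sqrt(n))
    return mean_diff, sd, t, n - 1
```

Turning the statistic into a p-value needs the t distribution, which the standard library does not provide; `scipy.stats.ttest_rel` computes both in one call if SciPy is available.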

Circularity Check

1 step flagged

Helmholtz free energy objective and Adaptive Thermostat defined to enforce the claimed metastable balance by construction

specific steps

self definitional [Abstract]
"we propose HyperGRL, a thermodynamics-driven framework that embeds nodes on a unit hypersphere by minimizing a Helmholtz free energy objective composed of two competing potentials. First, we introduce Structural Binding Energy (via Neighbor-Mean Alignment)... Second, to counteract representation collapse, we impose a Mean-Field Repulsive Potential (via Sampling-Free Uniformity)... Crucially, to govern the trade-off between local alignment and global uniformity, we devise an Adaptive Thermostat. This entropy-guided strategy dynamically regulates the system's 'temperature' during training, guid["

The metastable state balancing local cohesion with global discriminability is not derived; it is the explicit target of the objective that is defined as the sum of the two potentials whose relative weighting is controlled by the thermostat the authors introduce for that purpose. Minimizing the constructed free energy therefore produces the claimed balance by definition.

full rationale

The paper defines the core objective as minimization of a Helmholtz free energy composed of the newly introduced Structural Binding Energy and Mean-Field Repulsive Potential, with the Adaptive Thermostat explicitly devised to regulate their trade-off and drive the system to the desired metastable state. This makes the reported balance of cohesion and discriminability (and alleviation of over-smoothing) a direct consequence of the model construction rather than an independent prediction or emergent result. No external stability analysis or fixed-point derivation is provided to break the self-definition. Experiments then evaluate performance on the same constructed dynamics, yielding partial circularity (score 6).

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 3 invented entities

The framework rests on the domain assumption that graph structure can be faithfully captured by local mean alignment and global uniformity on a hypersphere, plus the ad-hoc introduction of three new energy terms whose functional forms and interaction are defined within the paper.

free parameters (1)

thermostat entropy scaling factor
Controls the dynamic temperature adjustment; its value is not derived from first principles and must be chosen or adapted during training.

axioms (2)

[domain assumption] Node representations lie on the unit hypersphere
Stated as the embedding space without derivation from graph properties.
[ad hoc to paper] Helmholtz free energy is an appropriate objective for GRL equilibrium
Invoked to justify the binding-plus-repulsion decomposition.

invented entities (3)

Structural Binding Energy [no independent evidence]
purpose: Local force encouraging structurally related nodes to cluster
Newly defined via Neighbor-Mean Alignment; no independent physical justification supplied.
Mean-Field Repulsive Potential [no independent evidence]
purpose: Global force maximizing dispersion without negative samples
Newly defined via Sampling-Free Uniformity; introduced to counteract collapse.
Adaptive Thermostat [no independent evidence]
purpose: Dynamic regulator of system temperature based on entropy
New control strategy to balance the two potentials during optimization.

[1] [1]

Y . Zhu, Y . Xu, F. Yu, Q. Liu, S. Wu, L. Wang, Deep graph contrastive representation learning, arXiv preprint arXiv:2006.04131 (2020). 25

work page arXiv 2006

[2] [2]

Veliˇckovi´c, W

P. Veliˇckovi´c, W. Fedus, W. L. Hamilton, P. Liò, Y . Bengio, R. D. Hjelm, Deep graph infomax, in: International Conference on Learning Representations, 2019

work page 2019

[3] [3]

Thakoor, C

S. Thakoor, C. Tallec, M. G. Azar, R. Munos, P. Veli ˇckovi´c, M. Valko, Boot- strapped representation learning on graphs, in: ICLR 2021 Workshop on Geomet- rical and Topological Representation Learning, 2021

work page 2021

[4] [4]

A. v. d. Oord, Y . Li, O. Vinyals, Representation learning with contrastive predictive coding, arXiv preprint arXiv:1807.03748 (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018

[5] [5]

Hassani, A

K. Hassani, A. H. Khasahmadi, Contrastive multi-view representation learning on graphs, in: International Conference on Machine Learning, V ol. 119, PMLR, 2020, pp. 4116–4126

work page 2020

[6] [6]

Grill, F

J.-B. Grill, F. Strub, F. Altché, C. Tallec, P. Richemond, E. Buchatskaya, C. Do- ersch, B. Avila Pires, Z. Guo, M. Gheshlaghi Azar, et al., Bootstrap your own latent-a new approach to self-supervised learning, Adv. Neural Inf. Process. Syst. 33 (2020) 21271–21284

work page 2020

[7] [7]

T. Wang, P. Isola, Understanding contrastive representation learning through alignment and uniformity on the hypersphere, in: International Conference on Machine Learning, PMLR, 2020, pp. 9929–9939

work page 2020

[8] [8]

Y . Zhu, Y . Xu, F. Yu, Q. Liu, S. Wu, L. Wang, Graph contrastive learning with adaptive augmentation, in: Proceedings of the Web Conference 2021, 2021, pp. 2069–2080

work page 2021

[9] [9]

J. Xia, L. Wu, G. Wang, J. Chen, S. Z. Li, Progcl: Rethinking hard negative mining in graph contrastive learning, in: International Conference on Machine Learning, V ol. 162, PMLR, 2022, pp. 24332–24346

work page 2022

[10] [10]

Zheng, S

Y . Zheng, S. Pan, V . Lee, Y . Zheng, P. S. Yu, Rethinking and scaling up graph contrastive learning: An extremely efficient approach with group discrimination, Adv. Neural Inf. Process. Syst. 35 (2022) 10809–10820. 26

work page 2022

[11] [11]

N. Lee, J. Lee, C. Park, Augmentation-free self-supervised learning on graphs, in: Proceedings of the AAAI Conference on Artificial Intelligence, V ol. 36, 2022, pp. 7372–7380

work page 2022

[12] [12]

W. Sun, J. Li, L. Chen, B. Wu, Y . Bian, Z. Zheng, Rethinking and simplifying boot- strapped graph latents, in: Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024, pp. 665–673

work page 2024

[13] [13]

Y . You, T. Chen, Y . Sui, T. Chen, Z. Wang, Y . Shen, Graph contrastive learning with augmentations, Adv. Neural Inf. Process. Syst. 33 (2020) 5812–5823

work page 2020

[14] [14]

Thakoor, C

S. Thakoor, C. Tallec, M. G. Azar, M. Azabou, E. L. Dyer, R. Munos, P. Veliˇckovi´c, M. Valko, Large-scale representation learning on graphs via bootstrapping, in: International Conference on Learning Representations, 2022

[15]

Y. Tian, C. Sun, B. Poole, D. Krishnan, C. Schmid, P. Isola, What makes for good views for contrastive learning?, Adv. Neural Inf. Process. Syst. 33 (2020) 6827–6839

[16]

T. N. Kipf, M. Welling, Variational graph auto-encoders, arXiv preprint arXiv:1611.07308 (2016)

[17]

Z. Hou, X. Liu, Y. Cen, Y. Dong, H. Yang, C. Wang, J. Tang, Graphmae: Self-supervised masked graph autoencoders, in: Proceedings of the 28th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2022, pp. 594–604

[18]

F.-Y. Sun, J. Hoffman, V. Verma, J. Tang, Infograph: Unsupervised and semi-supervised graph-level representation learning via mutual information maximization, in: International Conference on Learning Representations, 2020

[19]

X. Wang, L. Peng, R. Hu, P. Hu, X. Zhu, Unsupervised multiplex graph representation learning via maximizing coding rate reduction, Pattern Recognit. 165 (2025) 111557

[20]

Z. Luo, Y. Dong, Q. Zheng, H. Liu, M. Luo, Dual-channel graph contrastive learning for self-supervised graph-level representation learning, Pattern Recognit. 139 (2023) 109448

[21]

J. Fang, S. Liang, Z. Meng, M. De Rijke, Hyperspherical variational co-embedding for attributed networks, ACM Trans. Inf. Syst. 40 (3) (2021) 1–36

[22]

P. Wang, D. Wu, C. Chen, K. Liu, Y. Fu, J. Huang, Y. Zhou, J. Zhan, X. Hua, Deep adaptive graph clustering via von Mises-Fisher distributions, ACM Trans. Web 18 (2) (2024) 1–21

[23]

J. Lu, D. Wu, F. Nie, R. Wang, X. Li, Hyperspherical prototype node clustering, Trans. Mach. Learn. Res. (2024)

[24]

D. He, L. Shan, J. Zhao, H. Zhang, Z. Wang, W. Zhang, Exploitation of a latent mechanism in graph contrastive learning: Representation scattering, Adv. Neural Inf. Process. Syst. 37 (2024) 115351–115376

[25]

S. Yun, M. Jeong, R. Kim, J. Kang, H. J. Kim, Graph transformer networks, Adv. Neural Inf. Process. Syst. 32 (2019)

[26]

T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: International Conference on Learning Representations, 2017

[27]

P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks, in: International Conference on Learning Representations, 2018

[28]

W. Hamilton, Z. Ying, J. Leskovec, Inductive representation learning on large graphs, Adv. Neural Inf. Process. Syst. 30 (2017)

[29]

P. Mernyei, C. Cangea, Wiki-cs: A wikipedia-based benchmark for graph neural networks, arXiv preprint arXiv:2007.02901 (2020)

[30]

O. Shchur, M. Mumme, A. Bojchevski, S. Günnemann, Pitfalls of graph neural network evaluation, arXiv preprint arXiv:1811.05868 (2018)

[31]

B. Perozzi, R. Al-Rfou, S. Skiena, Deepwalk: Online learning of social representations, in: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 701–710

[32]

A. Grover, J. Leskovec, Node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864

[33]

Z. Peng, W. Huang, M. Luo, Q. Zheng, Y. Rong, T. Xu, J. Huang, Graph representation learning via graphical mutual information maximization, in: Proceedings of the Web Conference 2020, 2020, pp. 259–270

[34]

H. Zhang, Q. Wu, J. Yan, D. Wipf, P. S. Yu, From canonical correlation analysis to self-supervised graph neural networks, Adv. Neural Inf. Process. Syst. 34 (2021) 76–89

[35]

Y. Mo, L. Peng, J. Xu, X. Shi, X. Zhu, Simple unsupervised graph representation learning, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36, 2022, pp. 7797–7805

[36]

A. K. Menon, C. Elkan, Link prediction via matrix factorization, in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 2011, pp. 437–452

[37]

M. Zhang, Y. Chen, Link prediction based on graph neural networks, Adv. Neural Inf. Process. Syst. 31 (2018)

[38]

B. P. Chamberlain, S. Shirobokov, E. Rossi, F. Frasca, T. Markovich, N. Y. Hammerla, M. M. Bronstein, M. Hansmire, Graph neural networks for link prediction with subgraph sketching, in: International Conference on Learning Representations, 2023

[39]

Z. Zhu, Z. Zhang, L.-P. Xhonneux, J. Tang, Neural bellman-ford networks: A general graph neural network framework for link prediction, Adv. Neural Inf. Process. Syst. 34 (2021) 29476–29490

[40]

S. Yun, S. Kim, J. Lee, J. Kang, H. J. Kim, Neo-gnns: Neighborhood overlap-aware graph neural networks for link prediction, Adv. Neural Inf. Process. Syst. 34 (2021) 13683–13694

[41]

H. Wang, H. Yin, M. Zhang, P. Li, Equivariant and stable positional encoding for more powerful graph neural networks, in: International Conference on Learning Representations, 2022

[42]

X. Wang, H. Yang, M. Zhang, Neural common neighbor with completion for link prediction, in: International Conference on Learning Representations, 2024

[43]

J. Li, H. Shomer, H. Mao, S. Zeng, Y. Ma, N. Shah, J. Tang, D. Yin, Evaluating graph neural networks for link prediction: Current pitfalls and new benchmarking, Adv. Neural Inf. Process. Syst. 36 (2023) 3853–3866

[44]

L. v. d. Maaten, G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res. 9 (2008) 2579–2605
