Impacts of Data Splitting Strategies on Parameterized Link Prediction Algorithms

Tao Zhou; Xinshan Jiao; Yilin Bi; Yuxin Luo

arxiv: 2511.05834 · v2 · submitted 2025-11-08 · 📊 stat.OT

Impacts of Data Splitting Strategies on Parameterized Link Prediction Algorithms

Xinshan Jiao , Yuxin Luo , Yilin Bi , Tao Zhou This is my paper

Pith reviewed 2026-05-18 00:31 UTC · model grok-4.3

classification 📊 stat.OT

keywords link predictioninformation leakagedata splittinghyperparameter tuningevaluation metricnetwork scienceperformance overestimation

0 comments

The pith

Using the test set during hyperparameter tuning inflates link prediction performance by an average 3.6%.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how data splitting strategies affect the evaluation of parameterized link prediction algorithms. It identifies that incorporating the test set into hyperparameter tuning creates information leakage that artificially boosts reported performance. To measure this effect, a new metric termed Loss Ratio is introduced. Experiments involving 60 real-world networks from six domains reveal an average performance overestimation of 3.6%, with some algorithms experiencing over 15% bias. Heuristic and random-walk methods demonstrate more resilience to this issue. Overall, the findings advocate for standardized data splitting to promote fair comparisons and reproducibility in the field.

Core claim

The authors show that the common practice of using the test set during hyperparameter tuning in link prediction leads to human-induced information leakage, resulting in an average overestimation of model performance by about 3.6% across 60 networks, with the bias exceeding 15% for specific algorithms. The Loss Ratio metric is proposed to quantify this overestimation, and the analysis reveals that heuristic and random-walk-based methods are more robust and stable.

What carries the argument

Loss Ratio, a metric that quantifies the extent of performance overestimation caused by information leakage from using the test set in hyperparameter tuning.

If this is right

Standardized data splitting strategies are required for fair and reproducible benchmarking of link prediction models.
Parameterized models should be evaluated without access to test data during tuning to avoid inflated results.
Heuristic and random-walk-based methods provide more reliable performance estimates under proper evaluation protocols.
The overestimation issue is widespread across networks in multiple domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Many existing published results on link prediction may have overstated performance and could benefit from re-assessment using the Loss Ratio.
This evaluation issue likely extends to other network analysis tasks involving parameterized models, such as community detection or graph classification.
Researchers should develop automated tools to detect and correct for such leakage in benchmarking pipelines.

Load-bearing premise

The Loss Ratio metric accurately and specifically isolates the performance overestimation caused by using the test set during hyperparameter tuning.

What would settle it

Re-running the experiments on the same 60 networks but with hyperparameters tuned strictly on training and validation data only, then checking whether the observed performance drop matches the Loss Ratio values reported for each algorithm.

Figures

Figures reproduced from arXiv: 2511.05834 by Tao Zhou, Xinshan Jiao, Yilin Bi, Yuxin Luo.

**Figure 2.** Figure 2: Performance of different link prediction algorithms under varying [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: AUC under different training ratios ρ. The left and right panels correspond to the two-set split and three-set split settings, respectively. Star markers denote algorithms that achieve the best performance under a specific ρ in only one of the two split settings, while enlarged circular markers indicate algorithms that are optimal under both two-set split and three-set split for the same ρ. The grey vertic… view at source ↗

read the original abstract

Link prediction is a fundamental problem in network science, aiming to infer potential or missing links based on observed network structures. With the increasing adoption of parameterized models, the rigor of evaluation protocols has become critically important. However, a previously common practice of using the test set during hyperparameter tuning has led to human-induced information leakage, thereby inflating the reported model performance. To address this issue, this study introduces a novel evaluation metric, Loss Ratio, which quantitatively measures the extent of performance overestimation. We conduct large-scale experiments on 60 real-world networks across six domains. The results demonstrate that the information leakage leads to an average overestimation of about 3.6%, with the bias reaching over 15% for specific algorithms. Meanwhile, heuristic and random-walk-based methods exhibit greater robustness and stability. The analysis uncovers a pervasive information leakage issue in link prediction evaluation and underscores the necessity of adopting standardized data splitting strategies to enable fair and reproducible benchmarking of link prediction models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives concrete numbers on how test-set leakage in hyperparameter tuning inflates link prediction scores by 3.6% on average across 60 networks, but the Loss Ratio metric likely mixes leakage with differences in tuning effort.

read the letter

The key point here is that using the test edges during hyperparameter tuning produces an average 3.6% overestimation in reported performance, with some algorithms hitting over 15%. The authors back this with experiments on 60 real networks and introduce Loss Ratio as a way to quantify the gap. Heuristic and random-walk methods come out more stable under the two splitting regimes, which is a useful practical observation for people who actually run these models.

Referee Report

3 major / 2 minor

Summary. The manuscript claims that using the test set during hyperparameter tuning for parameterized link prediction models introduces information leakage that inflates reported performance. It introduces a new metric called Loss Ratio to quantify the resulting overestimation and reports results from experiments on 60 real-world networks across six domains, finding an average overestimation of about 3.6% (with bias exceeding 15% for some algorithms). Heuristic and random-walk methods are described as more robust, and the work recommends standardized data splitting strategies for fair benchmarking.

Significance. If the central empirical findings hold after clarification of controls, the paper would usefully document a methodological issue that affects reproducibility in link prediction research. The scale of the study (60 networks) supplies broad evidence that the problem is pervasive rather than isolated. The proposal of Loss Ratio as a diagnostic offers a concrete, if preliminary, tool for quantifying evaluation bias in graph-based models.

major comments (3)

[§3.2] §3.2 (Loss Ratio definition): the metric is presented as a normalized performance gap between tuning regimes that include versus exclude test edges, but the text does not specify whether hyperparameter search budget, number of trials, early-stopping criteria, or random seeds are held identical across regimes; without these controls the observed gap cannot be attributed solely to information leakage rather than differences in model selection effort.
[Results section] Results section and associated tables (e.g., aggregate statistics over 60 networks): the reported average 3.6% overestimation and per-algorithm maxima above 15% are given without error bars, confidence intervals, or explicit exclusion criteria for networks, preventing assessment of whether the quantitative claims are statistically robust or sensitive to a few outliers.
[§5.3] §5.3 (robustness claims): the statement that heuristic and random-walk methods exhibit greater stability is not accompanied by a controlled comparison (e.g., fixed search space size or parameter count) that would rule out the alternative explanation that these methods simply require less tuning and therefore show smaller apparent gaps.

minor comments (2)

[Abstract] Abstract: the six domains of the 60 networks are mentioned but not enumerated; adding the domain list would improve context for readers.
[Figures] Figure captions: several figures comparing splitting strategies would benefit from explicit labels indicating which curves correspond to 'test-included' versus 'test-excluded' tuning.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We have addressed each major comment point by point below, making revisions to improve clarity, statistical presentation, and controlled comparisons where appropriate.

read point-by-point responses

Referee: [§3.2] §3.2 (Loss Ratio definition): the metric is presented as a normalized performance gap between tuning regimes that include versus exclude test edges, but the text does not specify whether hyperparameter search budget, number of trials, early-stopping criteria, or random seeds are held identical across regimes; without these controls the observed gap cannot be attributed solely to information leakage rather than differences in model selection effort.

Authors: We agree that explicit controls are essential to attribute the gap to information leakage. In the revised manuscript, §3.2 now states that both regimes used an identical search budget of 100 trials, the same early-stopping rule based on validation performance, and fixed random seeds. This ensures the observed difference arises from test-edge inclusion rather than unequal optimization effort. We have added a brief description of the protocol and pseudocode in the appendix. revision: yes
Referee: [Results section] Results section and associated tables (e.g., aggregate statistics over 60 networks): the reported average 3.6% overestimation and per-algorithm maxima above 15% are given without error bars, confidence intervals, or explicit exclusion criteria for networks, preventing assessment of whether the quantitative claims are statistically robust or sensitive to a few outliers.

Authors: We acknowledge the value of statistical robustness indicators. The revised Results section now reports the mean overestimation with standard deviation (4.2%) and 95% confidence intervals across all 60 networks. We explicitly note that no networks were excluded and include a short sensitivity check confirming the average remains stable (3.5%) after removing the top 5% of outliers. revision: yes
Referee: [§5.3] §5.3 (robustness claims): the statement that heuristic and random-walk methods exhibit greater stability is not accompanied by a controlled comparison (e.g., fixed search space size or parameter count) that would rule out the alternative explanation that these methods simply require less tuning and therefore show smaller apparent gaps.

Authors: This point is well taken. We have added a controlled experiment in the revised §5.3 that normalizes the hyperparameter search to an equal number of configurations (50 trials) for every method. Under this equalized tuning budget, heuristic and random-walk methods continue to exhibit smaller gaps than parameterized models. A new supplementary table summarizes the controlled results. revision: yes

Circularity Check

0 steps flagged

Empirical evaluation with introduced metric exhibits no circularity

full rationale

The paper performs large-scale experiments across 60 real-world networks and introduces the Loss Ratio metric to quantify performance overestimation from test-set inclusion in hyperparameter tuning. No equations, derivations, or self-citations reduce the reported 3.6% average overestimation (or per-algorithm biases) to fitted parameters defined by the same experiments or to prior self-work by construction. Central claims rest on direct experimental comparisons between splitting strategies, rendering the analysis self-contained against external benchmarks with no load-bearing reductions to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the validity of the Loss Ratio as a leakage measure and the assumption that observed differences stem primarily from splitting strategy rather than other experimental variables; no free parameters or invented physical entities are introduced.

axioms (1)

domain assumption Standard train-test splits for link prediction should keep the test set completely separate from hyperparameter tuning and model selection.
This premise underpins the identification of information leakage as the source of overestimation.

invented entities (1)

Loss Ratio no independent evidence
purpose: To quantitatively measure the extent of performance overestimation due to information leakage in evaluation protocols.
Newly proposed metric whose definition and calibration are internal to the study.

pith-pipeline@v0.9.0 · 5466 in / 1333 out tokens · 50911 ms · 2026-05-18T00:31:44.757299+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Loss Ratio L = |AUC* − AUC'| / AUC* … nested partitioning with consistent test set … two-set vs three-set split
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

heuristic and random-walk-based methods exhibit greater robustness … deep learning … higher L

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages

[1]

L. Lü, T. Zhou, Link prediction in complex networks: A survey, Physica A 390 (2011) 1150–1170. https://doi.org/10.1016/j.physa.2010.11.027

work page doi:10.1016/j.physa.2010.11.027 2011
[2]

P. Wang, B. Xu, Y. Wu, X. Zhou, Link prediction in social networks: the state-of-the-art, Sci. China Inf. Sci. 58 (2015) 1–38. https://doi.org/10.1007/s11432-015-5399-y

work page doi:10.1007/s11432-015-5399-y 2015
[3]

Martínez, F

V. Martínez, F. Berzal, J.C. Cubero, A survey of link prediction in complex networks, ACM Comput. Surv. 49 (2016) 69. https://doi.org/10.1145/3012704

work page doi:10.1145/3012704 2016
[4]

Kumar, S.S

A. Kumar, S.S. Singh, K. Singh, B. Biswas, Link prediction techniques, applications, and performance: a survey, Physica A 553 (2020) 124289. https://doi.org/10.1016/j.physa.2020.124289

work page doi:10.1016/j.physa.2020.124289 2020
[5]

Divakaran, A

A. Divakaran, A. Mohan, Temporal link prediction: a survey, New Gener. Comput. 38 (2020) 213–258. https://doi.org/10.1007/s00354-019-00070-4

work page doi:10.1007/s00354-019-00070-4 2020
[6]

Zhou, Progresses and challenges in link prediction, iScience 24 (2021) 103217

T. Zhou, Progresses and challenges in link prediction, iScience 24 (2021) 103217. https://doi.org/10.1016/j.isci.2021.103217

work page doi:10.1016/j.isci.2021.103217 2021
[7]

Arrar, N

D. Arrar, N. Kamel, A. Lakhfif, A comprehensive survey of link prediction methods, J. Supercomput. 80 (2024) 3902–3942. https://doi.org/10.1007/s11227-023-05823-4

work page doi:10.1007/s11227-023-05823-4 2024
[8]

Adamic, E

L.A. Adamic, E. Adar, Friends and neighbors on the Web, Soc. Netw. 25 (2003) 211–230. https://doi.org/10.1016/S0378-8733(03)00009-1

work page doi:10.1016/s0378-8733(03)00009-1 2003
[9]

Kovács, K

I.A. Kovács, K. Luck, K. Spirohn, Y. Wang, C. Pollis, S. Schlabach, W. Bian, D.-K. Kim, N. Kishore, T. Hao, M.A. Calderwood, M. Vidal, A.-L. Barabási, Network-based prediction of protein–protein interactions, Nat. Commun. 10 (2019) 1240. https://doi.org/10.1038/s41467-019-09177-y

work page doi:10.1038/s41467-019-09177-y 2019
[10]

L. Lü, M. Medo, C.H. Yeung, Y.-C. Zhang, Z.-K. Zhang, T. Zhou, Recommender systems, Phys. Rep. 519 (2012) 1–49. https://doi.org/10.1016/j.physrep.2012.02.006

work page doi:10.1016/j.physrep.2012.02.006 2012
[11]

A new status index derived from sociometric analysis

L. Katz, A new status index derived from sociometric analysis, Psychometrika 18 (1953) 39–43. https://doi.org/10.1007/BF02289026

work page doi:10.1007/bf02289026 1953
[12]

E. A. Leicht, P. Holme, M. E. J. Newman, Vertex similarity in networks, Phys. Rev. E 73 (2006) 026120. https://doi.org/10.1103/PhysRevE.73.026120

work page doi:10.1103/physreve.73.026120 2006
[13]

The link-prediction problem for social networks

D. Liben-Nowell, J. Kleinberg, The link-prediction problem for social networks, Journal of the American SocietyforInformationScienceandTechnology58(2007)1019–1031.https://doi.org/10.1002/asi.20591. 11

work page doi:10.1002/asi.20591 2007
[14]

Fouss, A

F. Fouss, A. Pirotte, J.-M. Renders, M. Saerens, Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation, IEEE Trans. Knowl. Data Eng. 19 (2007) 355–369. https://doi.org/10.1109/TKDE.2007.46

work page doi:10.1109/tkde.2007.46 2007
[15]

T. Zhou, L. Lü, Y.-C. Zhang, Predicting missing links via local information, Eur. Phys. J. B 71 (2009) 623–630. https://doi.org/10.1140/epjb/e2009-00335-8

work page doi:10.1140/epjb/e2009-00335-8 2009
[16]

W. Liu, L. Lü, Link prediction based on local random walk, Europhys. Lett. 89 (2010) 58007. https://doi.org/10.1209/0295-5075/89/58007

work page doi:10.1209/0295-5075/89/58007 2010
[17]

Cannistraci, G

C.V. Cannistraci, G. Alanis-Lobato, T. Ravasi, From link-prediction in brain connectomes and pro- tein interactomes to the local-community-paradigm in complex networks, Sci. Rep. 3 (2013) 1613. https://doi.org/10.1038/srep01613

work page doi:10.1038/srep01613 2013
[18]

Clauset, C

A. Clauset, C. Moore, M.E.J. Newman, Hierarchical structure and the prediction of missing links, Nature 453 (2008) 98–101. https://doi.org/10.1038/nature06830

work page doi:10.1038/nature06830 2008
[19]

Guimerà, M

R. Guimerà, M. Sales-Pardo, Missing and spurious interactions and the reconstruction of complex networks, Proc. Natl. Acad. Sci. U.S.A. 106 (2009) 22073–22078

work page 2009
[20]

Soundarajan, J

S. Soundarajan, J. Hopcroft, Using community information to improve the precision of link prediction methods, in: Proceedings of the 21st International Conference on World Wide Web, 2012, pp. 607–608. https://doi.org/10.1145/2187980.2188150

work page doi:10.1145/2187980.2188150 2012
[21]

L. Lü, L. Pan, T. Zhou, Y. Zhang, H.E. Stanley, Toward link predictability of complex networks, Proc. Natl. Acad. Sci. U.S.A. 112 (2015) 2325–2330. https://doi.org/10.1073/pnas.1424644112

work page doi:10.1073/pnas.1424644112 2015
[22]

Kunegis, A

J. Kunegis, A. Lommatzsch, Learning spectral graph transformations for link prediction, in: Pro- ceedings of the 26th Annual International Conference on Machine Learning, 2009, pp. 561–568. https://dl.acm.org/doi/10.1145/1553374.1553447

work page doi:10.1145/1553374.1553447 2009
[23]

Bordes, N

A. Bordes, N. Usunier, A. Garcia-Durán, J. Weston, O. Yakhnenko, Translating embeddings for model- ing multi-relational data, in: Advances in Neural Information Processing Systems, 2013, pp. 2787–2795

work page 2013
[24]

Deepwalk: Online learning of social representations,

B. Perozzi, R. Al-Rfou, S. Skiena, DeepWalk: Online learning of social representations, in: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 701–710. https://doi.org/10.1145/2623330.2623732

work page doi:10.1145/2623330.2623732 2014
[25]

Yang, W.-T

B. Yang, W.-T. Yih, X. He, J. Gao, L. Deng, Embedding entities and relations for learning and inference in knowledge bases, in: Proceedings of the 3rd International Conference on Learning Representations, 2015

work page 2015
[26]

Trouillon, J

T. Trouillon, J. Welbl, S. Riedel, É. Gaussier, G. Bouchard, Complex embeddings for simple link prediction, in: Proceedings of The 33rd International Conference on Machine Learning, 2016, pp. 2071– 2080

work page 2016
[27]

Node2vec: Scalable feature learning for networks,

A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864. https://doi.org/10.1145/2939672.2939754

work page doi:10.1145/2939672.2939754 2016
[28]

T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: Proceed- ings of the 5th International Conference on Learning Representations, 2017

work page 2017
[29]

Hamilton, Z

W.L. Hamilton, Z. Ying, J. Leskovec, Inductive representation learning on large graphs, Adv. Neural Inf. Process. Syst. 30 (2017). https://dl.acm.org/doi/10.5555/3294771.3294869

work page doi:10.5555/3294771.3294869 2017
[30]

Veličković, G

P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, in: Proceedings of the International Conference on Learning Representations, 2018. https://openreview.net/forum?id=rJXMpikCZ

work page 2018
[31]

Zhang, Y.-C

M. Zhang, Y.-C. Chen, Link prediction based on graph neural networks, in: Advances in Neural Infor- mation Processing Systems, 2018, pp. 5171–5181. 12

work page 2018
[32]

K. Xu, W. Hu, J. Leskovec, S. Jegelka, How powerful are graph neural networks?, in: Proceedings of the International Conference on Learning Representations, 2019. https://openreview.net/forum?id=ryGs6iA5Km

work page 2019
[33]

S.J. Ahn, M. Kim, Variational graph normalized autoencoders, in: Proceedings of the 30th ACM International Conference on Information and Knowledge Management, 2021, pp. 2827–2836. https://doi.org/10.1145/3459637.3482149

work page doi:10.1145/3459637.3482149 2021
[34]

Chamberlain, S

B.P. Chamberlain, S. Shirobokov, E. Rossi, F. Frasca, T. Markovich, N.Y. Hammerla, M.M. Bronstein, M. Hansmire, Graph neural networks for link prediction with subgraph sketching, in: Proceedings of the Eleventh International Conference on Learning Representations, 2022. https://openreview.net/forum?id=m1oqEOAozQU

work page 2022
[35]

R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, in: Proceedings of the 14th international joint conference on Artificial intelligence, 1995, pp. 1137–1145

work page 1995
[36]

Arlot, A

S. Arlot, A. Celisse, A survey of cross-validation procedures for model selection, Stat. Surv. 4 (2010) 40–79. https://doi.org/10.1214/09-SS054

work page doi:10.1214/09-ss054 2010
[37]

Leak- age in data mining: Formulation, detection, and avoidance

S. Kaufman, S. Rosset, C. Perlich, O. Stitelman, Leakage in data mining: Formula- tion, detection, and avoidance, ACM Trans. Knowl. Discov. Data 6 (2012) 1556–4681. https://doi.org/10.1145/2382577.2382579

work page doi:10.1145/2382577.2382579 2012
[38]

Neural Computation , author =

T.G. Dietterich, Approximate statistical tests for comparing supervised classification learning algo- rithms, Neural Comput. 10 (1998) 1895–1923. https://doi.org/10.1162/089976698300017197

work page doi:10.1162/089976698300017197 1998
[39]

Cawley, N.L.C

G.C. Cawley, N.L.C. Talbot, On over-fitting in model selection and subsequent selection bias in perfor- mance evaluation, J. Mach. Learn. Res. 11 (2010) 2079–2107

work page 2010
[40]

Varma, R

S. Varma, R. Simon, Bias in error estimation when using cross-validation for model selection, BMC Bioinform. 7 (2006) 91. https://doi.org/10.1186/1471-2105-7-91

work page doi:10.1186/1471-2105-7-91 2006
[41]

Forman, M

G. Forman, M. Scholz, Apples-to-apples in cross-validation studies: Pitfalls in clas- sifier performance measurement, ACM SIGKDD Explor. Newsl. 12 (2010) 49–57. https://doi.org/10.1145/1882471.1882479

work page doi:10.1145/1882471.1882479 2010
[42]

Roberts, V

D.R. Roberts, V. Bahn, S. Ciuti, M.S. Boyce, J. Elith, G. Guillera-Arroita, S. Hauenstein, C. Lahoz- Monfort, B. Schröder, W. Thuiller, D.I. Warton, B.A. Wintle, F. Hartig, C. Dormann, Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure, Ecography 40 (2017) 913–929. https://doi.org/10.1111/ecog.02881

work page doi:10.1111/ecog.02881 2017
[43]

Breit, S

A. Breit, S. Ott, A. Agibetov, M. Samwald, OpenBioLink: a benchmarking frame- work for large-scale biomedical link prediction, Bioinformatics 36 (2020) 4097–4098. https://doi.org/10.1093/bioinformatics/btaa274

work page doi:10.1093/bioinformatics/btaa274 2020
[44]

Varoquaux, V

G. Varoquaux, V. Cheplygina, Machine learning for medical imaging: methodological failures and recommendations for the future, NPJ Digit. Med. 5 (2022) 48. https://doi.org/10.1038/s41746-022- 00592-y

work page doi:10.1038/s41746-022- 2022
[45]

D.Sun, T.Zhou, J.G.Liu, R.Stanley, Y.-C.Zhang, Informationfilteringbasedontransferringsimilarity, Phys. Rev. E 80 (2009) 017101. https://doi.org/10.1103/PhysRevE.80.017101

work page doi:10.1103/physreve.80.017101 2009
[46]

Zhou, Discriminating abilities of threshold-free evaluation metrics in link prediction, Physica A 615 (2023) 128529

T. Zhou, Discriminating abilities of threshold-free evaluation metrics in link prediction, Physica A 615 (2023) 128529. https://doi.org/10.1016/j.physa.2023.128529

work page doi:10.1016/j.physa.2023.128529 2023
[47]

X. Jiao, S. Wan, Q. Liu, Y. Bi, Y.-L. Lee, E. Xu, D. Hao, T. Zhou, Comparing discriminating abilities of evaluation metrics in link prediction, J. Phys. Complex. 5 (2024) 025014. https://doi.org/10.1088/2632- 072X/ad46be

work page doi:10.1088/2632- 2024
[48]

Y. Bi, X. Jiao, Y.-L. Lee, T. Zhou, Inconsistency among evaluation metrics in link prediction, PNAS Nexus 3 (2024) pgae498. https://doi.org/10.1093/pnasnexus/pgae498. 13

work page doi:10.1093/pnasnexus/pgae498 2024
[49]

S. Wan, Y. Bi, X. Jiao, T. Zhou, Quantifying discriminability of evaluation metrics in link prediction for real networks, Chaos Solitons Fractals 199 (2025) 116864. https://doi.org/10.1016/j.chaos.2025.116864

work page doi:10.1016/j.chaos.2025.116864 2025
[50]

Internationale Kommunikation

J. Kunegis, KONECT: the Koblenz network collection, in: Proceedings of the 22nd International Con- ference on World Wide Web, 2013, pp. 1343–1350. https://doi.org/10.1145/2487788.2488173

work page doi:10.1145/2487788.2488173 2013
[51]

Rossi, N.K

R.A. Rossi, N.K. Ahmed, The Network Data Repository with Interactive Graph Analytics and Visualization, in: Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2015. https://doi.org/10.1609/aaai.v29i1.9277. 14

work page doi:10.1609/aaai.v29i1.9277 2015

[1] [1]

L. Lü, T. Zhou, Link prediction in complex networks: A survey, Physica A 390 (2011) 1150–1170. https://doi.org/10.1016/j.physa.2010.11.027

work page doi:10.1016/j.physa.2010.11.027 2011

[2] [2]

P. Wang, B. Xu, Y. Wu, X. Zhou, Link prediction in social networks: the state-of-the-art, Sci. China Inf. Sci. 58 (2015) 1–38. https://doi.org/10.1007/s11432-015-5399-y

work page doi:10.1007/s11432-015-5399-y 2015

[3] [3]

Martínez, F

V. Martínez, F. Berzal, J.C. Cubero, A survey of link prediction in complex networks, ACM Comput. Surv. 49 (2016) 69. https://doi.org/10.1145/3012704

work page doi:10.1145/3012704 2016

[4] [4]

Kumar, S.S

A. Kumar, S.S. Singh, K. Singh, B. Biswas, Link prediction techniques, applications, and performance: a survey, Physica A 553 (2020) 124289. https://doi.org/10.1016/j.physa.2020.124289

work page doi:10.1016/j.physa.2020.124289 2020

[5] [5]

Divakaran, A

A. Divakaran, A. Mohan, Temporal link prediction: a survey, New Gener. Comput. 38 (2020) 213–258. https://doi.org/10.1007/s00354-019-00070-4

work page doi:10.1007/s00354-019-00070-4 2020

[6] [6]

Zhou, Progresses and challenges in link prediction, iScience 24 (2021) 103217

T. Zhou, Progresses and challenges in link prediction, iScience 24 (2021) 103217. https://doi.org/10.1016/j.isci.2021.103217

work page doi:10.1016/j.isci.2021.103217 2021

[7] [7]

Arrar, N

D. Arrar, N. Kamel, A. Lakhfif, A comprehensive survey of link prediction methods, J. Supercomput. 80 (2024) 3902–3942. https://doi.org/10.1007/s11227-023-05823-4

work page doi:10.1007/s11227-023-05823-4 2024

[8] [8]

Adamic, E

L.A. Adamic, E. Adar, Friends and neighbors on the Web, Soc. Netw. 25 (2003) 211–230. https://doi.org/10.1016/S0378-8733(03)00009-1

work page doi:10.1016/s0378-8733(03)00009-1 2003

[9] [9]

Kovács, K

I.A. Kovács, K. Luck, K. Spirohn, Y. Wang, C. Pollis, S. Schlabach, W. Bian, D.-K. Kim, N. Kishore, T. Hao, M.A. Calderwood, M. Vidal, A.-L. Barabási, Network-based prediction of protein–protein interactions, Nat. Commun. 10 (2019) 1240. https://doi.org/10.1038/s41467-019-09177-y

work page doi:10.1038/s41467-019-09177-y 2019

[10] [10]

L. Lü, M. Medo, C.H. Yeung, Y.-C. Zhang, Z.-K. Zhang, T. Zhou, Recommender systems, Phys. Rep. 519 (2012) 1–49. https://doi.org/10.1016/j.physrep.2012.02.006

work page doi:10.1016/j.physrep.2012.02.006 2012

[11] [11]

A new status index derived from sociometric analysis

L. Katz, A new status index derived from sociometric analysis, Psychometrika 18 (1953) 39–43. https://doi.org/10.1007/BF02289026

work page doi:10.1007/bf02289026 1953

[12] [12]

E. A. Leicht, P. Holme, M. E. J. Newman, Vertex similarity in networks, Phys. Rev. E 73 (2006) 026120. https://doi.org/10.1103/PhysRevE.73.026120

work page doi:10.1103/physreve.73.026120 2006

[13] [13]

The link-prediction problem for social networks

D. Liben-Nowell, J. Kleinberg, The link-prediction problem for social networks, Journal of the American SocietyforInformationScienceandTechnology58(2007)1019–1031.https://doi.org/10.1002/asi.20591. 11

work page doi:10.1002/asi.20591 2007

[14] [14]

Fouss, A

F. Fouss, A. Pirotte, J.-M. Renders, M. Saerens, Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation, IEEE Trans. Knowl. Data Eng. 19 (2007) 355–369. https://doi.org/10.1109/TKDE.2007.46

work page doi:10.1109/tkde.2007.46 2007

[15] [15]

T. Zhou, L. Lü, Y.-C. Zhang, Predicting missing links via local information, Eur. Phys. J. B 71 (2009) 623–630. https://doi.org/10.1140/epjb/e2009-00335-8

work page doi:10.1140/epjb/e2009-00335-8 2009

[16] [16]

W. Liu, L. Lü, Link prediction based on local random walk, Europhys. Lett. 89 (2010) 58007. https://doi.org/10.1209/0295-5075/89/58007

work page doi:10.1209/0295-5075/89/58007 2010

[17] [17]

Cannistraci, G

C.V. Cannistraci, G. Alanis-Lobato, T. Ravasi, From link-prediction in brain connectomes and pro- tein interactomes to the local-community-paradigm in complex networks, Sci. Rep. 3 (2013) 1613. https://doi.org/10.1038/srep01613

work page doi:10.1038/srep01613 2013

[18] [18]

Clauset, C

A. Clauset, C. Moore, M.E.J. Newman, Hierarchical structure and the prediction of missing links, Nature 453 (2008) 98–101. https://doi.org/10.1038/nature06830

work page doi:10.1038/nature06830 2008

[19] [19]

Guimerà, M

R. Guimerà, M. Sales-Pardo, Missing and spurious interactions and the reconstruction of complex networks, Proc. Natl. Acad. Sci. U.S.A. 106 (2009) 22073–22078

work page 2009

[20] [20]

Soundarajan, J

S. Soundarajan, J. Hopcroft, Using community information to improve the precision of link prediction methods, in: Proceedings of the 21st International Conference on World Wide Web, 2012, pp. 607–608. https://doi.org/10.1145/2187980.2188150

work page doi:10.1145/2187980.2188150 2012

[21] [21]

L. Lü, L. Pan, T. Zhou, Y. Zhang, H.E. Stanley, Toward link predictability of complex networks, Proc. Natl. Acad. Sci. U.S.A. 112 (2015) 2325–2330. https://doi.org/10.1073/pnas.1424644112

work page doi:10.1073/pnas.1424644112 2015

[22] [22]

Kunegis, A

J. Kunegis, A. Lommatzsch, Learning spectral graph transformations for link prediction, in: Pro- ceedings of the 26th Annual International Conference on Machine Learning, 2009, pp. 561–568. https://dl.acm.org/doi/10.1145/1553374.1553447

work page doi:10.1145/1553374.1553447 2009

[23] [23]

Bordes, N

A. Bordes, N. Usunier, A. Garcia-Durán, J. Weston, O. Yakhnenko, Translating embeddings for model- ing multi-relational data, in: Advances in Neural Information Processing Systems, 2013, pp. 2787–2795

work page 2013

[24] [24]

Deepwalk: Online learning of social representations,

B. Perozzi, R. Al-Rfou, S. Skiena, DeepWalk: Online learning of social representations, in: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 701–710. https://doi.org/10.1145/2623330.2623732

work page doi:10.1145/2623330.2623732 2014

[25] [25]

Yang, W.-T

B. Yang, W.-T. Yih, X. He, J. Gao, L. Deng, Embedding entities and relations for learning and inference in knowledge bases, in: Proceedings of the 3rd International Conference on Learning Representations, 2015

work page 2015

[26] [26]

Trouillon, J

T. Trouillon, J. Welbl, S. Riedel, É. Gaussier, G. Bouchard, Complex embeddings for simple link prediction, in: Proceedings of The 33rd International Conference on Machine Learning, 2016, pp. 2071– 2080

work page 2016

[27] [27]

Node2vec: Scalable feature learning for networks,

A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864. https://doi.org/10.1145/2939672.2939754

work page doi:10.1145/2939672.2939754 2016

[28] [28]

T.N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: Proceed- ings of the 5th International Conference on Learning Representations, 2017

work page 2017

[29] [29]

Hamilton, Z

W.L. Hamilton, Z. Ying, J. Leskovec, Inductive representation learning on large graphs, Adv. Neural Inf. Process. Syst. 30 (2017). https://dl.acm.org/doi/10.5555/3294771.3294869

work page doi:10.5555/3294771.3294869 2017

[30] [30]

Veličković, G

P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, in: Proceedings of the International Conference on Learning Representations, 2018. https://openreview.net/forum?id=rJXMpikCZ

work page 2018

[31] [31]

Zhang, Y.-C

M. Zhang, Y.-C. Chen, Link prediction based on graph neural networks, in: Advances in Neural Infor- mation Processing Systems, 2018, pp. 5171–5181. 12

work page 2018

[32] [32]

K. Xu, W. Hu, J. Leskovec, S. Jegelka, How powerful are graph neural networks?, in: Proceedings of the International Conference on Learning Representations, 2019. https://openreview.net/forum?id=ryGs6iA5Km

work page 2019

[33] [33]

S.J. Ahn, M. Kim, Variational graph normalized autoencoders, in: Proceedings of the 30th ACM International Conference on Information and Knowledge Management, 2021, pp. 2827–2836. https://doi.org/10.1145/3459637.3482149

work page doi:10.1145/3459637.3482149 2021

[34] [34]

Chamberlain, S

B.P. Chamberlain, S. Shirobokov, E. Rossi, F. Frasca, T. Markovich, N.Y. Hammerla, M.M. Bronstein, M. Hansmire, Graph neural networks for link prediction with subgraph sketching, in: Proceedings of the Eleventh International Conference on Learning Representations, 2022. https://openreview.net/forum?id=m1oqEOAozQU

work page 2022

[35] [35]

R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, in: Proceedings of the 14th international joint conference on Artificial intelligence, 1995, pp. 1137–1145

work page 1995

[36] [36]

Arlot, A

S. Arlot, A. Celisse, A survey of cross-validation procedures for model selection, Stat. Surv. 4 (2010) 40–79. https://doi.org/10.1214/09-SS054

work page doi:10.1214/09-ss054 2010

[37] [37]

Leak- age in data mining: Formulation, detection, and avoidance

S. Kaufman, S. Rosset, C. Perlich, O. Stitelman, Leakage in data mining: Formula- tion, detection, and avoidance, ACM Trans. Knowl. Discov. Data 6 (2012) 1556–4681. https://doi.org/10.1145/2382577.2382579

work page doi:10.1145/2382577.2382579 2012

[38] [38]

Neural Computation , author =

T.G. Dietterich, Approximate statistical tests for comparing supervised classification learning algo- rithms, Neural Comput. 10 (1998) 1895–1923. https://doi.org/10.1162/089976698300017197

work page doi:10.1162/089976698300017197 1998

[39] [39]

Cawley, N.L.C

G.C. Cawley, N.L.C. Talbot, On over-fitting in model selection and subsequent selection bias in perfor- mance evaluation, J. Mach. Learn. Res. 11 (2010) 2079–2107

work page 2010

[40] [40]

Varma, R

S. Varma, R. Simon, Bias in error estimation when using cross-validation for model selection, BMC Bioinform. 7 (2006) 91. https://doi.org/10.1186/1471-2105-7-91

work page doi:10.1186/1471-2105-7-91 2006

[41] [41]

Forman, M

G. Forman, M. Scholz, Apples-to-apples in cross-validation studies: Pitfalls in clas- sifier performance measurement, ACM SIGKDD Explor. Newsl. 12 (2010) 49–57. https://doi.org/10.1145/1882471.1882479

work page doi:10.1145/1882471.1882479 2010

[42] [42]

Roberts, V

D.R. Roberts, V. Bahn, S. Ciuti, M.S. Boyce, J. Elith, G. Guillera-Arroita, S. Hauenstein, C. Lahoz- Monfort, B. Schröder, W. Thuiller, D.I. Warton, B.A. Wintle, F. Hartig, C. Dormann, Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure, Ecography 40 (2017) 913–929. https://doi.org/10.1111/ecog.02881

work page doi:10.1111/ecog.02881 2017

[43] [43]

Breit, S

A. Breit, S. Ott, A. Agibetov, M. Samwald, OpenBioLink: a benchmarking frame- work for large-scale biomedical link prediction, Bioinformatics 36 (2020) 4097–4098. https://doi.org/10.1093/bioinformatics/btaa274

work page doi:10.1093/bioinformatics/btaa274 2020

[44] [44]

Varoquaux, V

G. Varoquaux, V. Cheplygina, Machine learning for medical imaging: methodological failures and recommendations for the future, NPJ Digit. Med. 5 (2022) 48. https://doi.org/10.1038/s41746-022- 00592-y

work page doi:10.1038/s41746-022- 2022

[45] [45]

D.Sun, T.Zhou, J.G.Liu, R.Stanley, Y.-C.Zhang, Informationfilteringbasedontransferringsimilarity, Phys. Rev. E 80 (2009) 017101. https://doi.org/10.1103/PhysRevE.80.017101

work page doi:10.1103/physreve.80.017101 2009

[46] [46]

Zhou, Discriminating abilities of threshold-free evaluation metrics in link prediction, Physica A 615 (2023) 128529

T. Zhou, Discriminating abilities of threshold-free evaluation metrics in link prediction, Physica A 615 (2023) 128529. https://doi.org/10.1016/j.physa.2023.128529

work page doi:10.1016/j.physa.2023.128529 2023

[47] [47]

X. Jiao, S. Wan, Q. Liu, Y. Bi, Y.-L. Lee, E. Xu, D. Hao, T. Zhou, Comparing discriminating abilities of evaluation metrics in link prediction, J. Phys. Complex. 5 (2024) 025014. https://doi.org/10.1088/2632- 072X/ad46be

work page doi:10.1088/2632- 2024

[48] [48]

Y. Bi, X. Jiao, Y.-L. Lee, T. Zhou, Inconsistency among evaluation metrics in link prediction, PNAS Nexus 3 (2024) pgae498. https://doi.org/10.1093/pnasnexus/pgae498. 13

work page doi:10.1093/pnasnexus/pgae498 2024

[49] [49]

S. Wan, Y. Bi, X. Jiao, T. Zhou, Quantifying discriminability of evaluation metrics in link prediction for real networks, Chaos Solitons Fractals 199 (2025) 116864. https://doi.org/10.1016/j.chaos.2025.116864

work page doi:10.1016/j.chaos.2025.116864 2025

[50] [50]

Internationale Kommunikation

J. Kunegis, KONECT: the Koblenz network collection, in: Proceedings of the 22nd International Con- ference on World Wide Web, 2013, pp. 1343–1350. https://doi.org/10.1145/2487788.2488173

work page doi:10.1145/2487788.2488173 2013

[51] [51]

Rossi, N.K

R.A. Rossi, N.K. Ahmed, The Network Data Repository with Interactive Graph Analytics and Visualization, in: Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2015. https://doi.org/10.1609/aaai.v29i1.9277. 14

work page doi:10.1609/aaai.v29i1.9277 2015