arxiv: 2605.15511 · v1 · pith:UXKJKG3Gnew · submitted 2026-05-15 · 💻 cs.LG

OgBench: A Framework for Evaluating Graph Neural Networks on Omics Data

Pith reviewed 2026-05-19 15:40 UTC · model grok-4.3

classification 💻 cs.LG
keywords: Graph Neural Networks, Omics Data, Benchmarking, Graph-Level Prediction, Biological Networks, Machine Learning Baselines, Low-Sample High-Dimensional Data

The pith

Graph neural networks often fail to outperform simple MLPs on omics data tasks with few samples and many nodes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a standardized benchmark for evaluating graph neural networks in the omics domain, where the number of patient samples is small compared to the number of genes or proteins. It constructs families of graphs from raw biological data and tests classical GNNs, specialized variants, MLPs, and other baselines on graph-level prediction tasks. Results indicate that standard GNNs frequently match or fall short of non-graph methods, which questions whether the added graph structure provides meaningful gains in this setting. A sympathetic reader would care because many current approaches in bioinformatics assume graphs help, and this finding points toward the need for methods better matched to high-dimensional, low-sample biological data.

Core claim

OgBench supplies an end-to-end pipeline that turns raw omics measurements into varied featured graphs, then measures the performance of GNNs against MLPs and classical baselines in the n ≪ p regime (far fewer samples than nodes per graph). The central finding is that widely used GNNs do not reliably surpass simpler models, which challenges the idea that biological graph structure inherently improves predictive accuracy on such data.

What carries the argument

OgBench, a modular benchmarking platform that generates families of featured graphs from raw omics data and runs standardized graph-level prediction experiments.
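
As a rough illustration of the kind of construction described here, the sketch below builds one shared co-expression graph from an expression matrix by thresholding absolute Pearson correlation, then treats each sample as a copy of that graph with its own expression values as node features. The function names, the 0.8 threshold, and the toy matrix are assumptions made for illustration; they are not taken from the OgBench code.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def coexpression_edges(expr, threshold=0.8):
    """expr: list of n samples, each a list of p expression values.
    Returns undirected edges (i, j) with i < j where |corr(gene i, gene j)| >= threshold."""
    p = len(expr[0])
    columns = [[sample[g] for sample in expr] for g in range(p)]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pearson(columns[i], columns[j])) >= threshold]

def sample_graphs(expr, edges):
    """One graph per sample: shared edge list, per-sample node features."""
    return [{"x": list(sample), "edges": edges} for sample in expr]

# Toy matrix (hypothetical values): n = 4 samples, p = 3 genes.
expr = [
    [1.0, 2.0, 5.0],
    [2.0, 4.1, 3.0],
    [3.0, 6.2, 4.0],
    [4.0, 7.9, 4.5],
]
edges = coexpression_edges(expr, threshold=0.8)
graphs = sample_graphs(expr, edges)
```

With only a handful of samples, each pairwise correlation is estimated from very few points, which is one reason the choice of threshold can matter a great deal in the n ≪ p setting.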

If this is right

  • Simpler non-graph models should be included as strong baselines when applying machine learning to omics graphs.
  • New architectures for biological data must explicitly address the low-sample high-node regime rather than borrowing from dense-graph settings.
  • The value of incorporating graph structure from omics measurements requires fresh validation rather than being taken as given.
  • Development of omics-specific GNN variants can now be guided by the standardized evaluation setup provided.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If feature signals dominate over topology in these tasks, then methods that learn adaptive graph construction or edge weighting from data may prove more useful than fixed biological networks.
  • Extending the benchmark to additional omics modalities or multi-task settings could reveal whether the observed pattern holds beyond the current collection of datasets.
  • The results suggest that practitioners might first try classical feature-based models before investing in graph-based pipelines for similar biological prediction problems.

Load-bearing premise

The graphs derived from raw omics data encode biologically meaningful relationships that matter for the downstream prediction tasks.

What would settle it

A concrete test would be to run the benchmark on its provided datasets and observe whether any GNN architecture achieves statistically higher accuracy or AUC than the MLP baseline across repeated trials with fixed hyperparameters.
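
One way to make this test concrete is a paired comparison across seeds. The sketch below runs an exact sign-flip permutation test on per-seed score differences between a GNN and an MLP. The choice of test, the function name, and the scores are illustrative assumptions; the numbers are made up and are not results from the paper.

```python
from itertools import product

def paired_sign_flip_pvalue(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired scores.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so all 2^k sign assignments are enumerated and we count how often the
    absolute mean difference is at least as large as the observed one."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    k = len(diffs)
    observed = abs(sum(diffs)) / k
    hits = 0
    for signs in product((1, -1), repeat=k):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / k
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return hits / 2 ** k

# Hypothetical test-F1 scores over 5 seeds (made-up numbers).
gnn_f1 = [0.71, 0.68, 0.74, 0.70, 0.69]
mlp_f1 = [0.72, 0.70, 0.73, 0.71, 0.70]
p_value = paired_sign_flip_pvalue(gnn_f1, mlp_f1)
```

A design note: with k paired runs, the smallest two-sided p-value this test can return is 2/2^k, so five seeds can never reach the conventional 0.05 level. Any claim of a statistically significant difference therefore needs more seeds or a different test.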

Figures

Figures reproduced from arXiv: 2605.15511 by Guillermo Bernárdez, Johan Mathe, Louisa Cornelis, Louis Van Langendonck, Nina Miolane.

Figure 1: Existing graph benchmarks operate in the n ≫ p regime, where the number of graphs n far exceeds the average number of nodes per graph p. Bar plot of n/p for benchmark graph classification datasets from [39, 25, 55, 18, 17]. On the other hand, existing GNN inductive benchmarks, ranging from the recent GraphBench [55] to established ones like OGB [25], TUDataset [39], and LRGB [18], predominantly operate in t…
Figure 2: Overview of OgBench: First GNN benchmark platform for omics graph datasets. Left: Transcriptomics or proteomics expression data across p genes/proteins and n samples. Middle: 1) Co-expression or PPI graphs are constructed using classical omics approaches; each sample becomes a graph with nodes representing genes/proteins and normalized expression values as node features. 2) A model is trained on a graph-l…
Figure 3: Best model performance per dataset (selected by validation F1, error bars = std across 3 …
Figure 4: Best test F1 by node selection method and sampling ratio. For each model family, the configuration with the highest validation F1 is selected per method-ratio combination and evaluated on the test set. … confidence intervals. On Parkinsons and BRCA, linear baselines remain competitive or superior. Clearly, more complex models (GPS, ChebNet, SAGN, MLA-GNN) do not guarantee better performance, with rankings d…
Figure 5: Test F1 by readout type for each GNN backbone, sweeping over node selection method, …
Figure 6: Test F1 by edge construction method for each GNN backbone, sweeping over node selection …
Figure 7: Test F1-macro for validation-best (K=1) vs. top-K ensembles (K=3, 5, 10) across model …
Figure 8: Comparison of model performance under (a) traditional single-best-validation selection …
Figure 9: Validation rank vs. test F1 for all hyperparameter configurations (pooled: MLP + GNNs).
Figure 10: Validation rank vs. test F1 (top 100 validation ranks only). All models show a wide (about …
Figure 11: Linear regression of node sample ratio vs. Test F1 Macro score. Each subplot shows fitted regression lines for all models within a specific dataset and graph construction method. Solid lines indicate p < 0.05; dashed lines indicate p ≥ 0.05.
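
Figures 7 and 8 compare single validation-best selection with top-K ensembles. A minimal sketch of that comparison follows. The captions do not say how ensemble members are combined, so the majority vote used here is an assumption, as are the function names and the toy runs.

```python
from collections import Counter

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def top_k_ensemble(runs, k):
    """Majority vote over the test predictions of the k runs with the best validation F1.
    Ties go to the label seen first among the selected runs."""
    best = sorted(runs, key=lambda r: r["val_f1"], reverse=True)[:k]
    columns = zip(*(r["test_pred"] for r in best))
    return [Counter(col).most_common(1)[0][0] for col in columns]

# Hypothetical hyperparameter runs (made-up values).
y_test = [0, 0, 1, 1]
runs = [
    {"val_f1": 0.80, "test_pred": [0, 1, 1, 1]},
    {"val_f1": 0.78, "test_pred": [0, 0, 1, 0]},
    {"val_f1": 0.75, "test_pred": [0, 0, 0, 1]},
    {"val_f1": 0.60, "test_pred": [1, 1, 0, 0]},
]
single = macro_f1(y_test, top_k_ensemble(runs, 1))    # K = 1: validation-best only
ensemble = macro_f1(y_test, top_k_ensemble(runs, 3))  # K = 3: vote over top three
```

Selection uses validation scores only; the test labels enter solely when scoring the chosen predictions, which keeps the comparison between K values free of test-set leakage.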
Abstract

Graph Neural Networks (GNNs) have become the dominant framework for inductive graph-level learning. Yet most benchmarks focus on the regime n ≫ p, where the number of graphs n greatly exceeds the number of nodes per graph p. This overlooks biological domains such as omics, which operate in the opposite n ≪ p regime, characterized by large graphs of genes, transcripts, or proteins across few patient samples. This raises the question: how do GNNs perform in this low-sample, high-node omics setting? We introduce OgBench (Omics-Graph Bench), the first benchmarking platform for graph-level prediction in the n ≪ p regime characteristic of omics data. We provide a standardized, end-to-end modular infrastructure from raw omics data to families of featured graphs with varied structural properties. We benchmark classical GNNs, as well as GNNs designed for large graphs and omics applications, alongside MLPs and machine learning baselines to establish reference performances. Our results show that widely used GNNs often do not outperform simple MLPs and classical baselines. These findings challenge the prevailing assumption that graph structure inherently adds value in this domain, fostering a critical reassessment of current learning paradigms. Ultimately, by exposing these limitations, OgBench provides the open-source ecosystem necessary for the community to develop and validate novel architectures explicitly tailored for biological graphs. The code is available at https://github.com/geometric-intelligence/ogbench.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces OgBench, a benchmarking framework for graph-level prediction with GNNs on omics data in the n ≪ p regime. It supplies a modular pipeline that converts raw omics data into families of featured graphs with controlled structural properties, then evaluates classical GNNs, large-graph GNNs, omics-specific models, MLPs, and classical ML baselines. The central empirical finding is that widely used GNNs frequently fail to outperform simple MLPs and non-graph baselines, which the authors interpret as evidence that graph structure does not inherently add value in this domain.

Significance. If the constructed graphs can be shown to encode biologically meaningful relationships, the result would be significant: it would supply the first standardized benchmark exposing limitations of current GNN architectures on high-dimensional biological graphs and would motivate the development of new inductive biases tailored to the omics setting. The open-source modular infrastructure is a concrete contribution that could accelerate such work. The current evidence, however, rests on unvalidated graph constructions, which weakens the force of the claim that graph structure itself is unhelpful.

major comments (2)
  1. §3 (Graph Construction): The manuscript states that families of graphs are built from raw omics data with varied structural properties, yet supplies no external validation (such as overlap with curated pathway databases, gene-set enrichment statistics, or expert review) showing that the retained edges capture biologically relevant interactions rather than statistical artifacts or arbitrary thresholds. Because the central claim (that GNNs add no value over MLPs) presupposes that the graphs encode task-relevant structure, this omission is load-bearing.
  2. §5 (Experimental Results): The reported comparisons lack details on statistical testing (e.g., paired t-tests or Wilcoxon tests across random seeds), exact sample sizes per dataset, and the precise graph-construction hyperparameters (thresholds, feature-selection criteria). Without these, it is impossible to judge whether the observed parity or underperformance of GNNs is robust or an artifact of particular dataset realizations.
minor comments (2)
  1. [Abstract] The abstract claims the code is available at the cited GitHub link, but the manuscript should include a permanent archive link (e.g., Zenodo DOI) to satisfy reproducibility standards.
  2. [Tables in §5] Notation for the n ≪ p regime is introduced in the abstract but not consistently reused in the experimental tables; adding a column or row label that explicitly flags this regime would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating where we agree that revisions are warranted to strengthen the work.

  1. Referee: §3 (Graph Construction): The manuscript states that families of graphs are built from raw omics data with varied structural properties, yet supplies no external validation (such as overlap with curated pathway databases, gene-set enrichment statistics, or expert review) showing that the retained edges capture biologically relevant interactions rather than statistical artifacts or arbitrary thresholds. Because the central claim (that GNNs add no value over MLPs) presupposes that the graphs encode task-relevant structure, this omission is load-bearing.

    Authors: We thank the referee for this important observation. OgBench is constructed to provide modular control over graph families with differing structural properties (e.g., via varying correlation thresholds and feature-selection criteria) precisely so that the community can test the value of graph structure under different assumptions in the n ≪ p regime. We acknowledge that the absence of external validation against pathway databases or enrichment statistics makes it harder to interpret whether the reported parity between GNNs and MLPs reflects the limited utility of graph structure or the limitations of existing GNN inductive biases. In the revised manuscript we will add quantitative validation: overlap statistics with KEGG and Reactome pathways, as well as gene-set enrichment results for the retained edges across the graph families. These additions will be presented in a new subsection of §3 together with a discussion of how the validation affects the strength of the central claim. revision: yes

  2. Referee: §5 (Experimental Results): The reported comparisons lack details on statistical testing (e.g., paired t-tests or Wilcoxon tests across random seeds), exact sample sizes per dataset, and the precise graph-construction hyperparameters (thresholds, feature-selection criteria). Without these, it is impossible to judge whether the observed parity or underperformance of GNNs is robust or an artifact of particular dataset realizations.

    Authors: We agree that these experimental details are necessary for reproducibility and for readers to assess robustness. The current version reports mean performance but does not include formal statistical comparisons or the exact construction hyperparameters. In the revised manuscript we will expand §5 (and the supplementary material) to report: (i) paired t-tests and Wilcoxon signed-rank tests across at least five random seeds for all model comparisons, (ii) the precise values of n (number of graphs) and p (number of nodes) for every dataset, and (iii) the full list of graph-construction hyperparameters, including correlation thresholds, p-value cutoffs, and feature-selection procedures. These additions will allow direct evaluation of whether the observed results are stable across realizations. revision: yes
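
The overlap validation proposed in the first response could take roughly the following form. This sketch compares a constructed edge set with a curated reference edge set using the Jaccard index and a hypergeometric tail probability. The function names and the toy edge lists are hypothetical; a real analysis would load the reference edges from a pathway resource such as KEGG or Reactome.

```python
from math import comb

def normalize(edges):
    """Treat edges as undirected: store each as a sorted tuple in a set."""
    return {tuple(sorted(e)) for e in edges}

def jaccard(edges_a, edges_b):
    """Size of the intersection divided by size of the union of two edge sets."""
    a, b = normalize(edges_a), normalize(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hypergeom_tail(overlap, n_constructed, n_reference, n_possible):
    """P(X >= overlap) when n_constructed edges are drawn without replacement
    from n_possible node pairs, of which n_reference are curated."""
    upper = min(n_constructed, n_reference)
    total = comb(n_possible, n_constructed)
    favourable = sum(
        comb(n_reference, k) * comb(n_possible - n_reference, n_constructed - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total

# Toy example (hypothetical): 5 genes give 10 possible undirected pairs.
constructed = [(0, 1), (1, 2), (3, 4)]
reference = [(1, 0), (2, 1), (0, 4), (2, 3)]
n_genes = 5
n_possible = n_genes * (n_genes - 1) // 2
shared = len(normalize(constructed) & normalize(reference))
j = jaccard(constructed, reference)
p = hypergeom_tail(shared, len(constructed), len(reference), n_possible)
```

The hypergeometric null treats every node pair as equally likely to be an edge, which ignores degree structure; a degree-preserving randomization would be a stricter alternative.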

Circularity Check

0 steps flagged

Empirical benchmark with no load-bearing derivations or self-referential reductions

The manuscript introduces OgBench as an empirical evaluation framework for GNNs on omics data in the n ≪ p regime. It constructs graph families from raw data, runs standard GNN and MLP baselines on public datasets, and reports comparative performance numbers. No equations, uniqueness theorems, fitted parameters renamed as predictions, or self-citation chains appear in the derivation of the central claim. The results are direct statistical comparisons against external open-source implementations and classical baselines; the claim that GNNs often fail to outperform MLPs follows from those measurements rather than from any internal redefinition or tautological reduction. The paper is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper contributes an empirical benchmark and infrastructure rather than new theoretical derivations; it relies on standard machine-learning evaluation practices and domain conventions for turning omics tables into graphs.

axioms (1)
  • Domain assumption: Omics measurements can be converted into graphs whose nodes are genes or proteins and whose edges reflect known or inferred biological relationships.
    This conversion step is required to produce the featured graphs used in all experiments.
invented entities (1)
  • OgBench framework (no independent evidence)
    Purpose: Modular end-to-end infrastructure that converts raw omics data into families of graphs and runs standardized GNN and baseline evaluations.
    New platform introduced to address the lack of benchmarks in the n ≪ p regime.

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages · 1 internal anchor

  [1] Francis E. Agamah, Jumamurat R. Bayjanov, Anna Niehues, Kelechi F. Njoku, Michelle Skelton, Gaston K. Mazandu, Thomas H. A. Ederveen, Nicola Mulder, Emile R. Chimusa, and Peter A. C. ’t Hoen. Computational approaches for network-based integrative multi-omics analysis. Frontiers in Molecular Biosciences, 9:967205, November 2022.

  [2] Fadi Alharbi, Aleksandar Vakanski, Boyu Zhang, Murtada K. Elbashir, and Mohanad Mohammed. Comparative analysis of multi-omics integration using graph neural networks for cancer classification. IEEE Access, 13:37724–37736, 2025.

  [3] Albert-Laszlo Barabasi and Zoltan N Oltvai. Network biology: understanding the cell’s functional organization. Nature reviews genetics, 5(2):101–113, 2004.

  [4] Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca, Luis Müller, Jan Tönshoff, Antoine Siraudin, Viktor Zaverkin, Michael M. Bronstein, Mathias Niepert, Bryan Perozzi, Mikhail Galkin, and Christopher Morris. Position: Graph learning will lose relevance due to poor benchmarks, 2025.

  [5] Andrea Bommert, Thomas Welchowski, Matthias Schmid, and Jörg Rahnenführer. Benchmark of filter methods for feature selection in high-dimensional gene expression survival data. Briefings in Bioinformatics, 23(1):bbab354, 09 2021.

  [6] Shaked Brody, Uri Alon, and Eran Yahav. How attentive are graph attention networks? In International Conference on Learning Representations, 2022.

  [7] Céline Brouard, Raphaël Mourad, and Nathalie Vialaneix. Should we really use graph neural networks for transcriptomic prediction? Briefings in Bioinformatics, 25(2):bbae027, 02 2024.

  [8] Hryhorii Chereda, Annalen Bleckmann, Kerstin Menck, Júlia Perera-Bel, Philip Stegmaier, Florian Auer, Frank Kramer, Andreas Leha, and Tim Beißbarth. Explaining decisions of graph convolutional neural networks: patient-specific molecular subnetworks responsible for metastasis prediction in breast cancer. Genome medicine, 13(1):42, 2021.

  [9] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine learning, 20(3):273–297, 1995.

  [10] Chad J. Creighton, Margaret Morgan, Preethi H. Gunaratne, David A. Wheeler, Richard A. Gibbs, A. Gordon Robertson, Andy Chu, Rameen Beroukhim, Kristian Cibulskis, Sabina Signoretti, Fabio Vandin Hsin-Ta Wu, Benjamin J. Raphael, Roel G. W. Verhaak, Pheroze Tamboli, Wandaliz Torres-Garcia, Rehan Akbani, John N. Weinstein, Victor Reuter, James J. Hsieh, A....

  [11] Xiaofeng Dai and Li Shen. Advances and trends in omics technology development. Frontiers in Medicine, Volume 9 - 2022, 2022.

  [12] JC Dalrymple-Alford, MR MacAskill, CT Nakas, L Livingston, C Graham, GP Crucian, TR Melzer, J Kirwan, R Keenan, S Wells, et al. The moca: well-suited screen for cognitive impairment in parkinson disease. Neurology, 75(19):1717–1725, 2010.

  [13] Géraud Dautzenberg, Jeroen Lijmer, and Aartjan Beekman. Clinical value of the montreal cognitive assessment (moca) in patients suspected of cognitive impairment in old age psychiatry. using the moca for triaging to a memory clinic. Cognitive Neuropsychiatry, 26(1):1–17, 2021.

  [14] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS’16, page 3844–3852, Red Hook, NY, USA, 2016. Curran Associates Inc.

  [15] Paul D. Dobson and Andrew J. Doig. Distinguishing enzyme structures from non-enzymes without alignments. Journal of Molecular Biology, 330(4):771–783, July 2003.

  [16] Keyu Duan, Zirui Liu, Peihao Wang, Wenqing Zheng, Kaixiong Zhou, Tianlong Chen, Xia Hu, and Zhangyang Wang. A comprehensive study on large-scale graph training: Benchmarking and rethinking. In Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2022.

  [17] Vijay Prakash Dwivedi, Chaitanya K Joshi, Thomas Laurent, Yoshua Bengio, and Xavier Bresson. Benchmarking graph neural networks. In International Conference on Learning Representations (ICLR), 2020.

  [18] Vijay Prakash Dwivedi, Ladislav Rampášek, Mikhail Galkin, Ali Parviz, Guy Wolf, Anh Tuan Luu, and Dominique Beaini. Long range graph benchmark. In Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2022.

  [19] David Easley and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, Cambridge, UK, 2010.

  [20] Matthias Fey and Jan Eric Lenssen. Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428, 2019.

  [21] Dyani Gaudilliere and Brice Gaudilliere. Harnessing the n+1 dimensions of single-cell omics data for the prediction and prevention of human diseases. Seminars in immunopathology, 45:1–2, 02 2023.

  [22] Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML’17, page 1263–1272. JMLR.org, 2017.

  [23] William L. Hamilton, Rex Ying, and Jure Leskovec. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, page 1025–1035, Red Hook, NY, USA, 2017. Curran Associates Inc.

  [24] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2nd edition, 2010.

  [25] Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. Open graph benchmark: Datasets for machine learning on graphs. In Advances in Neural Information Processing Systems (NeurIPS), volume 33, pages 22118–22133, 2020.

  [26] Youssef A Ismail, Huda A Auf, Shahd A Sadik, Nada M Ahmed, and Yasmeen Ali. Sensitivity and specificity of the montreal cognitive assessment using us national alzheimer coordinating centre uniform data set: a retrospective analysis of 16,309 participants. BMC neurology, 25(1):381, 2025.

  [27] Wei Jiang, Weicai Ye, Xiaoming Tan, and Yun-Juan Bao. Network-based multi-omics integrative analysis methods in drug discovery: a systematic review. BioData Mining, 18(1):27, 2025.

  [28] Lee W Jones, Neil D Eves, Bercedis L Peterson, Jennifer Garst, Jeffrey Crawford, Miranda J West, Stephanie Mabe, David Harpole, William E Kraus, and Pamela S Douglas. Safety and feasibility of aerobic training on cardiopulmonary function and quality of life in postsurgical nonsmall cell lung cancer patients: a pilot study. Cancer, 113(12):3430–3439, 2008.

  [29] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR), 2015.

  [30] Thomas N. Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations, 2017.

  [31] Bernhard Kuster, Johanna Tüshaus, and Florian P. Bayer. A new mass analyzer shakes up the proteomics field. Nature Biotechnology, February 2024.

  [32] Justine Labory, Evariste Njomgue-Fotso, and Silvia Bottini. Benchmarking feature selection and feature extraction methods to improve the performances of machine-learning algorithms for patient classification using metabolomics biomedical data. Computational and Structural Biotechnology Journal, 23:1274–1287, 2024.

  [33] Sijie Li, Heyang Hua, and Shengquan Chen. Graph neural networks for single-cell omics data: a review of approaches and applications. Briefings in Bioinformatics, 26(2):bbaf109, 03 2025.

  [34] Yingxia Li, Ulrich Mansmann, Shangming Du, and Roman Hornung. Benchmark study of feature selection strategies for multi-omics data. BMC bioinformatics, 23(1):412, 2022.

  [35] Jian Liu, Yichen Pan, Zhihan Ruan, and Jun Guo. Scdd: a novel single-cell rna-seq imputation method with diffusion and denoising. Briefings in Bioinformatics, 23(5):bbac398, 09 2022.

  [36] Simon Lovestone, Paul Francis, Iwona Kloszewska, Patrizia Mecocci, Andrew Simmons, Hilkka Soininen, Christian Spenger, Magda Tsolaki, Bruno Vellas, Lars-Olof Wahlund, Malcolm Ward, and on behalf of the AddNeuroMed Consortium. Addneuromed - the european collaboration for the discovery of novel biomarkers for alzheimer’s disease. Annals of the New York Academy of Sciences, 1180(1):36–46, 2009.

  [37] Nenad S Mitić, Saša N Malkov, Mirjana M Maljković Ružičić, Aleksandar N Veljković, Ivan Lj Čukić, Xin Lin, Minjie Lyu, and Vladimir Brusić. Correlation-based feature selection of single cell transcriptomics data from multiple sources. Journal of Big Data, 12(1):4, 2025.

  [38] Getnet Molla Desta and Alemayehu Godana Birhanu. Advancements in single-cell rna sequencing and spatial transcriptomics: transforming biomedical research. Acta Biochimica Polonica, 72:13922, 2025.

  [39] Christopher Morris, Nils M. Kriege, Franka Bause, Kristian Kersting, Petra Mutzel, and Marion Neumann. Tudataset: A collection of benchmark datasets for learning with graphs. In ICML 2020 Workshop on Graph Representation Learning and Beyond (GRL+ 2020), 2020.

  [40] Ziad S Nasreddine, Natalie A Phillips, Valérie Bédirian, Simon Charbonneau, Victor Whitehead, Isabelle Collin, Jeffrey L Cummings, and Howard Chertkow. The montreal cognitive assessment, moca: a brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society, 53(4):695–699, 2005.

  [41] Ashwini Patil and Haruki Nakamura. HINT: a database of annotated protein-protein interactions and their homologs. Biophysics, 1:21–24, 2005.

  [42] Yasset Perez-Riverol, Moritz Kuhn, Juan Antonio Vizcaíno, Martin P. Hitz, and Edward Audain. Accurate and fast feature selection workflow for high-dimensional omics data. PLOS ONE, 12(12):e0189875, 2017.

  [43] Ricardo Ramirez, Yu-Chiao Chiu, Allen Hererra, Milad Mostavi, Joshua Ramirez, Yidong Chen, Yufei Huang, and Yu-Fang Jin. Classification of cancer types using graph convolutional neural networks. Frontiers in Physics, 8:203, 06 2020.

  [44] Ladislav Rampášek, Michael Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, and Dominique Beaini. Recipe for a general, powerful, scalable graph transformer. Advances in Neural Information Processing Systems, 35:14501–14515, 2022.

  [45] Jiahua Rao, Xiang Zhou, Yutong Lu, Huiying Zhao, and Yuedong Yang. Imputing single-cell rna-seq data by combining graph convolution and autoencoder neural networks. iScience, 24(5):102393, 2021.

  [46] Sungmin Rhee, Seokjun Seo, and Sun Kim. Hybrid approach of relation network and localized graph convolutional filtering for breast cancer subtype classification. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pages 3527–3534. International Joint Conferences on Artificial Intelligence Organization, 7 2018.

  [47] Jeremy M Robbins, Bennet Peterson, Daniela Schranner, Usman A Tahir, Theresa Rienmüller, Shuliang Deng, Michelle J Keyes, Daniel H Katz, Pierre M Jean Beltran, Jacob L Barber, et al. Human plasma proteomic profiles indicative of cardiorespiratory fitness. Nature metabolism, 3(6):786–797, 2021.

  48. [48]

    Proteome-wide prediction of the mode of inheritance and molecular mechanisms underlying genetic diseases using structural interactomics. iScience, 28(7):112812, 2025

    Ali Saadat and Jacques Fellay. Proteome-wide prediction of the mode of inheritance and molecular mechanisms underlying genetic diseases using structural interactomics. iScience, 28(7):112812, 2025

  49. [49]

    Next-generation sequencing technology: current trends and advancements. Biology, 12(7):997, 2023

    Heena Satam, Kandarp Joshi, Upasana Mangrolia, Sanober Waghoo, Gulnaz Zaidi, Shravani Rawool, Ritesh P Thakare, Shahid Banday, Alok K Mishra, Gautam Das, et al. Next-generation sequencing technology: current trends and advancements. Biology, 12(7):997, 2023

  50. [50]

    The graph neural network model. IEEE Transactions on Neural Networks, 20(1):61–80, 2009

    Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The graph neural network model. IEEE Transactions on Neural Networks, 20(1):61–80, 2009

  51. [51]

    Brenda, the enzyme database: updates and major new developments

    Ida Schomburg, Antje Chang, Christian Ebeling, Marion Gremse, Christian Heldt, Gregor Huhn, and Dietmar Schomburg. Brenda, the enzyme database: updates and major new developments. Nucleic acids research, 32(suppl_1):D431–D433, 2004

  52. [52]

    Analysis of blood-based gene expression in idiopathic parkinson disease. Neurology, 89(16):1676–1683, 2017

    Roded Shamir, Christine Klein, David Amar, Eva J Vollstedt, et al. Analysis of blood-based gene expression in idiopathic parkinson disease. Neurology, 89(16):1676–1683, 2017

  53. [53]

    High-throughput sequencing for biology and medicine. Molecular Systems Biology, 9(1):640, 2013

    Wendy Weijia Soon, Manoj Hariharan, and Michael P Snyder. High-throughput sequencing for biology and medicine. Molecular Systems Biology, 9(1):640, 2013

  54. [54]

    Jonathan M. Stokes, Kevin Yang, Kyle Swanson, Wengong Jin, Andres Cubillos-Ruiz, Nina M. Donghia, Craig R. MacNair, Shawn French, Lindsey A. Carfrae, Zohar Bloom-Ackermann, Victoria M. Tran, Anush Chiappino-Pepe, Ahmed H. Badran, Ian W. Andrews, Emma J. Chory, George M. Church, Eric D. Brown, Tommi S. Jaakkola, Regina Barzilay, and James J. Collins. A dee...

  55. [55]

    Graphbench: Next-generation graph learning benchmarking, 2025

    Timo Stoll, Chendi Qian, Ben Finkelshtein, Ali Parviz, Darius Weber, Fabrizio Frasca, Hadar Shavit, Antoine Siraudin, Arman Mielke, Marie Anastacio, Erik Müller, Maya Bechler-Speicher, Michael Bronstein, Mikhail Galkin, Holger Hoos, Mathias Niepert, Bryan Perozzi, Jan Tönshoff, and Christopher Morris. Graphbench: Next-generation graph learning benchmarking, 2025

  56. [56]

    Scalable and adaptive graph neural networks with self-label-enhanced training, 2021

    Chuxiong Sun, Hongming Gu, and Jie Hu. Scalable and adaptive graph neural networks with self-label-enhanced training, 2021

  57. [57]

    Measuring and testing dependence by correlation of distances. The Annals of Statistics, pages 2769–2794, 2007

    Gábor J Székely, Maria L Rizzo, and Nail K Bakirov. Measuring and testing dependence by correlation of distances. The Annals of Statistics, pages 2769–2794, 2007

  58. [58]

    The string database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest

    Damian Szklarczyk, Rebecca Kirsch, Mikaela Koutrouli, Katerina Nastou, Farrokh Mehryary, Radja Hachilif, Annika L Gable, Tao Fang, Nadezhda T Doncheva, Sampo Pyysalo, Peer Bork, Lars J Jensen, and Christian von Mering. The string database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest....

  59. [59]

    Amogel: a multi-omics classification framework using associative graph neural networks with prior knowledge for biomarker identification. BMC bioinformatics, 26(1):1–27, 2025

    Chia Yan Tan, Huey Fang Ong, Chern Hong Lim, Mei Sze Tan, Ean Hin Ooi, and KokSheik Wong. Amogel: a multi-omics classification framework using associative graph neural networks with prior knowledge for biomarker identification. BMC bioinformatics, 26(1):1–27, 2025

  60. [60]

    Lev Telyatnikov, Guillermo Bernardez, Marco Montagna, Mustafa Hajij, Martin Carrasco, Pavlo Vasylenko, Mathilde Papillon, Ghada Zamzmi, Michael T Schaub, Jonas Verhellen, Pavel Snopov, Bertran Miquel-Oliver, Manel Gil-Sorribes, Alexis Molina, Victor Guallar, Theodore Long, Julian Suk, Patryk Rygiel, Alexander V Nikitin, Giordan Escalona, Michael Banf, D...

  61. [61]

    Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1):267–288, 1996

    Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1):267–288, 1996

  62. [62]

    Gene co-expression analysis for functional classification and gene–disease predictions. Briefings in Bioinformatics, 19(4):575–592, 01 2017

    Sipko van Dam, Urmo Võsa, Adriaan van der Graaf, Lude Franke, and João Pedro de Magalhães. Gene co-expression analysis for functional classification and gene–disease predictions. Briefings in Bioinformatics, 19(4):575–592, 01 2017

  63. [63]

    A cancer survival prediction method based on graph convolutional network. IEEE transactions on nanobioscience, 19(1):117–126, 2019

    Chunyu Wang, Junling Guo, Ning Zhao, Yang Liu, Xiaoyan Liu, Guojun Liu, and Maozu Guo. A cancer survival prediction method based on graph convolutional network. IEEE transactions on nanobioscience, 19(1):117–126, 2019

  64. [64]

    scgnn is a novel graph neural network framework for single-cell rna-seq analyses. Nature communications, 12(1):1882, 2021

    Juexin Wang, Anjun Ma, Yuzhou Chang, Jianting Gong, Yuexu Jiang, Ren Qi, Cankun Wang, Hongjun Fu, Qin Ma, and Dong Xu. scgnn is a novel graph neural network framework for single-cell rna-seq analyses. Nature communications, 12(1):1882, 2021

  65. [65]

    More: a multi-omics data-driven hypergraph integration network for biomedical data classification and biomarker identification. Briefings in Bioinformatics, 26(1):bbae658, 12 2024

    Yuhan Wang, Zhikang Wang, Xuan Yu, Xiaoyu Wang, Jiangning Song, Dong-Jun Yu, and Fang Ge. More: a multi-omics data-driven hypergraph integration network for biomedical data classification and biomarker identification. Briefings in Bioinformatics, 26(1):bbae658, 12 2024

  66. [66]

    Health benefits of physical activity: the evidence. Cmaj, 174(6):801–809, 2006

    Darren ER Warburton, Crystal Whitney Nicol, and Shannon SD Bredin. Health benefits of physical activity: the evidence. Cmaj, 174(6):801–809, 2006

  67. [67]

    Prescribing exercise as preventive therapy. Cmaj, 174(7):961–974, 2006

    Darren ER Warburton, Crystal Whitney Nicol, and Shannon SD Bredin. Prescribing exercise as preventive therapy. Cmaj, 174(7):961–974, 2006

  68. [68]

    Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis. Bioinformatics, 38, 02 2022

    Xiaohan Xing, Fan Yang, Hang Li, Jun Zhang, Yu Zhao, Mingxuan Gao, Junzhou Huang, and Jianhua Yao. Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis. Bioinformatics, 38, 02 2022

  69. [69]

    How powerful are graph neural networks? In International Conference on Learning Representations, 2019

    Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In International Conference on Learning Representations, 2019

  70. [70]

    Deep representation learning of protein-protein interaction networks for enhanced pattern discovery. Science Advances, 10(51):eadq4324, 2024

    Rui Yan, Md Tauhidul Islam, and Lei Xing. Deep representation learning of protein-protein interaction networks for enhanced pattern discovery. Science Advances, 10(51):eadq4324, 2024

  71. [71]

    Motgnn: interpretable graph neural networks for multi-omics disease classification. arXiv preprint arXiv:2508.07465, 2025

    Tiantian Yang and Zhiqian Chen. Motgnn: interpretable graph neural networks for multi-omics disease classification. arXiv preprint arXiv:2508.07465, 2025

  72. [72]

    Ziwei Yang, Rikuto Kotoge, Xihao Piao, Zheng Chen, Lingwei Zhu, Peng Gao, Yasuko Matsubara, Yasushi Sakurai, and J. Sun. Mlomics: Cancer multi-omics database for machine learning. Scientific Data, 12, 05 2025

  73. [73]

    Assessing and mitigating batch effects in large-scale omics studies. Genome biology, 25(1):254, 2024

    Ying Yu, Yuanbang Mai, Yuanting Zheng, and Leming Shi. Assessing and mitigating batch effects in large-scale omics studies. Genome biology, 25(1):254, 2024

  74. [74]

    GraphSAINT: Graph sampling based inductive learning method

    Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, and Viktor Prasanna. GraphSAINT: Graph sampling based inductive learning method. In International Conference on Learning Representations, 2020

  75. [75]

    Graph neural networks and their current applications in bioinformatics. Frontiers in Genetics, Volume 12 - 2021, 2021

    Xiao-Meng Zhang, Li Liang, Lin Liu, and Ming-Jing Tang. Graph neural networks and their current applications in bioinformatics. Frontiers in Genetics, Volume 12 - 2021, 2021

  76. [76]

    Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(2):301–320, 2005

    Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(2):301–320, 2005

A Dataset Preprocessing

In-depth preprocessing steps for each dataset are included below.

A.1 Heritage

We downloaded the MoTrPAC HERITAGE SomaLogic proteomics matrix and an...

  1. Rank all hyperparameter configurations by mean validation F1 across 3 random seeds.

  2. Select the top-K configurations (K ∈ {1, 3, 5, 10}).

  3. For each seed independently:
     • Load checkpoints for the top-K configs (all trained with that seed)
     • Obtain class probability predictions on the test set from each checkpoint
     • Compute the ensemble prediction via soft voting: $\hat{y}_{\mathrm{ens}} = \arg\max_{y} \frac{1}{K} \sum_{k=1}^{K} p_k(y \mid x)$
     • Compute test F1-macro for the ensemble

  4. Report mean ± std of ensemble test F1 across the 3 seeds.

Note that seeds remain independent: we ensemble within each seed's checkpoints and average performance across seeds, preserving valid uncertainty quantification.

D.3 Results

Figure 7 compares single-best-validation selection (K=1, black bars) against ensembles of increasing size. In Figure 8 we show K...
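The top-K ensembling procedure described in this appendix (rank configurations by mean validation F1, keep the top K, soft-vote within each seed, then summarise across seeds) can be illustrated with a minimal NumPy sketch. This is not the benchmark's own code: the function names `macro_f1` and `topk_soft_vote`, the array layout, the use of the population standard deviation, and averaging macro-F1 over all classes (including absent ones, which score 0) are assumptions made for illustration only.

```python
import numpy as np


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1; a class with no support and no predictions scores 0."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))


def topk_soft_vote(val_f1, test_probs, y_test, k):
    """
    Hypothetical helper (layout is an assumption):
      val_f1:     (n_configs, n_seeds) validation F1 per configuration and seed
      test_probs: (n_configs, n_seeds, n_samples, n_classes) test-set class probabilities
      y_test:     (n_samples,) integer labels
    Returns (mean, std) of the ensemble test macro-F1 across seeds.
    """
    n_classes = test_probs.shape[-1]
    # Rank configurations by mean validation F1 over seeds and keep the top k.
    top = np.argsort(-val_f1.mean(axis=1))[:k]
    per_seed = []
    for s in range(val_f1.shape[1]):
        # Soft voting within one seed: average the k probability arrays, then argmax.
        avg = test_probs[top, s].mean(axis=0)
        y_ens = avg.argmax(axis=1)
        per_seed.append(macro_f1(y_test, y_ens, n_classes))
    # Summarise across seeds (population std, ddof=0, is an assumption here).
    return float(np.mean(per_seed)), float(np.std(per_seed))
```

Calling `topk_soft_vote(val_f1, test_probs, y_test, k)` once per K in {1, 3, 5, 10} would give the points compared against the K=1 baseline. Because the ranking uses validation scores only and ensembling happens inside each seed, the per-seed test scores stay independent, which is the property the note on uncertainty quantification relies on.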