OgBench: A Framework for Evaluating Graph Neural Networks on Omics Data
Pith reviewed 2026-05-19 15:40 UTC · model grok-4.3
The pith
Graph neural networks often fail to outperform simple MLPs on omics data tasks with few samples and many nodes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
OgBench supplies an end-to-end pipeline that turns raw omics measurements into varied featured graphs, then measures the performance of GNNs against MLPs and classical baselines in the n ≪ p regime, where the number of graphs n is far smaller than the number of nodes per graph p. The central finding is that widely used GNNs do not reliably surpass simpler models, which challenges the idea that biological graph structure inherently improves predictive accuracy on such data.
What carries the argument
OgBench, a modular benchmarking platform that generates families of featured graphs from raw omics data and runs standardized graph-level prediction experiments.
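As a rough illustration of what generating a family of graphs from one omics matrix could look like, the sketch below thresholds absolute Pearson correlation between features at several cut-offs. This is an assumption made for illustration: the rebuttal further down mentions correlation thresholds, but the actual OgBench construction rules are not given on this page, and the names `correlation_graph`, `graph_family`, and the toy matrix are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def correlation_graph(samples, threshold):
    """Return edges (i, j), i < j, whose |Pearson r| across samples is >= threshold.

    samples: list of n rows (one per patient), each holding p feature values.
    """
    p = len(samples[0])
    columns = [[row[j] for row in samples] for j in range(p)]
    edges = set()
    for i in range(p):
        for j in range(i + 1, p):
            if abs(pearson(columns[i], columns[j])) >= threshold:
                edges.add((i, j))
    return edges

def graph_family(samples, thresholds):
    """One edge set per threshold; a higher threshold can only remove edges."""
    return {t: correlation_graph(samples, t) for t in thresholds}

# Hypothetical toy matrix: n = 4 samples, p = 3 features (made-up values).
toy = [
    [1.0, 2.0, 5.0],
    [2.0, 4.1, 3.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.2, 1.0],
]
family = graph_family(toy, thresholds=[0.5, 0.95])
```

In this toy case the looser threshold keeps all three feature pairs while the stricter one keeps only the nearly collinear pair, which is the kind of structural variation a graph family is meant to expose.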
If this is right
- Simpler non-graph models should be included as strong baselines when applying machine learning to omics graphs.
- New architectures for biological data must explicitly address the low-sample high-node regime rather than borrowing from dense-graph settings.
- The value of incorporating graph structure from omics measurements requires fresh validation rather than being taken as given.
- Development of omics-specific GNN variants can now be guided by the standardized evaluation setup provided.
Where Pith is reading between the lines
- If feature signals dominate over topology in these tasks, then methods that learn adaptive graph construction or edge weighting from data may prove more useful than fixed biological networks.
- Extending the benchmark to additional omics modalities or multi-task settings could reveal whether the observed pattern holds beyond the current collection of datasets.
- The results suggest that practitioners might first try classical feature-based models before investing in graph-based pipelines for similar biological prediction problems.
Load-bearing premise
The graphs derived from raw omics data encode biologically meaningful relationships that matter for the downstream prediction tasks.
What would settle it
A concrete test would be to run the benchmark on its provided datasets and check whether any GNN architecture achieves significantly higher accuracy or AUC than the MLP baseline under a paired statistical test across repeated trials with fixed hyperparameters.
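One way such a check could be set up is a paired t statistic on per-seed score differences. The sketch below is a minimal version under stated assumptions: the scores are made-up numbers, the function names are hypothetical, and the decision rule compares the statistic against standard two-sided 5% critical values of Student's t rather than computing a p-value.

```python
import math
from statistics import mean, stdev

# Two-sided 5% critical values of Student's t for a few degrees of freedom.
T_CRIT_05 = {2: 4.303, 3: 3.182, 4: 2.776, 9: 2.262}

def paired_t(gnn_scores, mlp_scores):
    """Return (t statistic, degrees of freedom) for paired per-seed scores."""
    diffs = [g - m for g, m in zip(gnn_scores, mlp_scores)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1

def gnn_beats_mlp(gnn_scores, mlp_scores):
    """True if the GNN is higher and the difference clears the 5% critical value."""
    t, df = paired_t(gnn_scores, mlp_scores)
    return t > T_CRIT_05[df]

# Made-up AUC values for five seeds, for illustration only.
near_parity = gnn_beats_mlp([0.71, 0.69, 0.73, 0.70, 0.72],
                            [0.70, 0.70, 0.72, 0.71, 0.71])
clear_win = gnn_beats_mlp([0.80, 0.82, 0.81, 0.83, 0.80],
                          [0.70, 0.71, 0.70, 0.72, 0.70])
```

Pairing by seed matters here because both models see the same splits, so the per-seed difference removes split-to-split variance that an unpaired comparison would count as noise.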
Original abstract
Graph Neural Networks (GNNs) have become the dominant framework for inductive graph-level learning. Yet most benchmarks focus on the regime $n \gg p$, where the number of graphs $n$ greatly exceeds the number of nodes per graph $p$. This overlooks biological domains such as omics, which operate in the opposite $n \ll p$ regime, characterized by large graphs of genes, transcripts, or proteins across few patient samples. This raises the question: \textit{how do GNNs perform in this low-sample, high-node omics setting?} We introduce \texttt{OgBench} (Omics-Graph Bench), the first benchmarking platform for graph-level prediction in the $n \ll p$ regime characteristic of omics data. We provide a standardized, end-to-end modular infrastructure from raw omics data to families of featured graphs with varied structural properties. We benchmark classical GNNs, as well as GNNs designed for large graphs and omics applications, alongside MLPs and machine learning baselines to establish reference performances. Our results show that widely used GNNs often do not outperform simple MLPs and classical baselines. These findings challenge the prevailing assumption that graph structure inherently adds value in this domain, fostering a critical reassessment of current learning paradigms. Ultimately, by exposing these limitations, OgBench provides the open-source ecosystem necessary for the community to develop and validate novel architectures explicitly tailored for biological graphs. The code is available at https://github.com/geometric-intelligence/ogbench.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces OgBench, a benchmarking framework for graph-level prediction with GNNs on omics data in the n ≪ p regime. It supplies a modular pipeline that converts raw omics data into families of featured graphs with controlled structural properties, then evaluates classical GNNs, large-graph GNNs, omics-specific models, MLPs, and classical ML baselines. The central empirical finding is that widely used GNNs frequently fail to outperform simple MLPs and non-graph baselines, which the authors interpret as evidence that graph structure does not inherently add value in this domain.
Significance. If the constructed graphs can be shown to encode biologically meaningful relationships, the result would be significant: it would supply the first standardized benchmark exposing limitations of current GNN architectures on high-dimensional biological graphs and would motivate the development of new inductive biases tailored to the omics setting. The open-source modular infrastructure is a concrete contribution that could accelerate such work. The current evidence, however, rests on unvalidated graph constructions, which weakens the force of the claim that graph structure itself is unhelpful.
major comments (2)
- [§3] §3 (Graph Construction): The manuscript states that families of graphs are built from raw omics data with varied structural properties, yet supplies no external validation—such as overlap with curated pathway databases, gene-set enrichment statistics, or expert review—that the retained edges capture biologically relevant interactions rather than statistical artifacts or arbitrary thresholds. Because the central claim (that GNNs add no value over MLPs) presupposes that the graphs encode task-relevant structure, this omission is load-bearing.
- [§5] §5 (Experimental Results): The reported comparisons lack details on statistical testing (e.g., paired t-tests or Wilcoxon tests across random seeds), exact sample sizes per dataset, and the precise graph-construction hyperparameters (thresholds, feature-selection criteria). Without these, it is impossible to judge whether the observed parity or underperformance of GNNs is robust or an artifact of particular dataset realizations.
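The external validation asked for in the first major comment could start from a simple overlap statistic between constructed edges and a curated edge set. The sketch below is illustrative only: the gene identifiers are placeholders, the curated set stands in for whatever pathway database is chosen, and `edge_overlap` is a hypothetical helper rather than part of OgBench.

```python
def normalize(edges):
    """Treat edges as undirected pairs and drop self-loops."""
    return {frozenset(e) for e in edges if len(set(e)) == 2}

def edge_overlap(constructed, curated):
    """Overlap between constructed edges and a curated reference edge set."""
    a, b = normalize(constructed), normalize(curated)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        # Fraction of constructed edges that also appear in the curated set.
        "precision": len(shared) / len(a) if a else 0.0,
        # Shared edges over all edges present in either set.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }

# Placeholder identifiers, not real gene pairs.
constructed = [("G1", "G2"), ("G2", "G3"), ("G4", "G1")]
curated = [("G2", "G1"), ("G3", "G5")]
stats = edge_overlap(constructed, curated)
```

A low precision on its own would not show that the constructed graph is uninformative, since curated databases are incomplete, but it would help readers judge how much biological prior the graphs actually carry.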
minor comments (2)
- [Abstract] The abstract claims the code is available at the cited GitHub link, but the manuscript should include a permanent archive link (e.g., Zenodo DOI) to satisfy reproducibility standards.
- [Tables in §5] Notation for the n ≪ p regime is introduced in the abstract but not consistently reused in the experimental tables; adding a column or row label that explicitly flags this regime would improve clarity.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating where we agree that revisions are warranted to strengthen the work.
- Referee: [§3] §3 (Graph Construction): The manuscript states that families of graphs are built from raw omics data with varied structural properties, yet supplies no external validation (such as overlap with curated pathway databases, gene-set enrichment statistics, or expert review) that the retained edges capture biologically relevant interactions rather than statistical artifacts or arbitrary thresholds. Because the central claim (that GNNs add no value over MLPs) presupposes that the graphs encode task-relevant structure, this omission is load-bearing.
  Authors: We thank the referee for this important observation. OgBench is constructed to provide modular control over graph families with differing structural properties (e.g., via varying correlation thresholds and feature-selection criteria) precisely so that the community can test the value of graph structure under different assumptions in the n ≪ p regime. We acknowledge that the absence of external validation against pathway databases or enrichment statistics makes it harder to interpret whether the reported parity between GNNs and MLPs reflects the limited utility of graph structure or the limitations of existing GNN inductive biases. In the revised manuscript we will add quantitative validation: overlap statistics with KEGG and Reactome pathways, as well as gene-set enrichment results for the retained edges across the graph families. These additions will be presented in a new subsection of §3 together with a discussion of how the validation affects the strength of the central claim.
  Revision: yes
- Referee: [§5] §5 (Experimental Results): The reported comparisons lack details on statistical testing (e.g., paired t-tests or Wilcoxon tests across random seeds), exact sample sizes per dataset, and the precise graph-construction hyperparameters (thresholds, feature-selection criteria). Without these, it is impossible to judge whether the observed parity or underperformance of GNNs is robust or an artifact of particular dataset realizations.
  Authors: We agree that these experimental details are necessary for reproducibility and for readers to assess robustness. The current version reports mean performance but does not include formal statistical comparisons or the exact construction hyperparameters. In the revised manuscript we will expand §5 (and the supplementary material) to report: (i) paired t-tests and Wilcoxon signed-rank tests across at least five random seeds for all model comparisons, (ii) the precise values of n (number of graphs) and p (number of nodes) for every dataset, and (iii) the full list of graph-construction hyperparameters, including correlation thresholds, p-value cutoffs, and feature-selection procedures. These additions will allow direct evaluation of whether the observed results are stable across realizations.
  Revision: yes
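The Wilcoxon signed-rank comparison promised in the rebuttal can be sketched with an exact null distribution, which is feasible for a handful of seeds. The code below is a minimal sketch under stated assumptions: the scores are made up, differences are rounded to six decimals so that ties are detected reliably (a choice made here, not taken from the paper), and zero differences are dropped before ranking.

```python
from itertools import product

def signed_ranks(diffs):
    """Drop zero differences and rank the rest by |d|, averaging tied ranks."""
    nonzero = [d for d in diffs if d != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return nonzero, ranks

def wilcoxon_exact(gnn_scores, mlp_scores):
    """Return (W+, exact two-sided p-value) for paired per-seed scores."""
    diffs = [round(g - m, 6) for g, m in zip(gnn_scores, mlp_scores)]
    nonzero, ranks = signed_ranks(diffs)
    w_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    expected = sum(ranks) / 2
    observed = abs(w_plus - expected)
    # Under the null every sign pattern is equally likely: enumerate all 2^m.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for s, r in zip(signs, ranks) if s)
        if abs(w - expected) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)

# Made-up scores where the GNN wins on all five seeds.
w_plus, p_value = wilcoxon_exact([0.80, 0.82, 0.81, 0.83, 0.85],
                                 [0.70, 0.71, 0.69, 0.70, 0.71])
```

Even when one model wins on every seed, five seeds give only 32 sign patterns, so the smallest exact two-sided p-value is 2/32 = 0.0625. A revision that relies on this test at the 5% level would therefore need more than five seeds.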
Circularity Check
Empirical benchmark with no load-bearing derivations or self-referential reductions
The manuscript introduces OgBench as an empirical evaluation framework for GNNs on omics data in the n ≪ p regime. It constructs graph families from raw data, runs standard GNN and MLP baselines on public datasets, and reports comparative performance numbers. No equations, uniqueness theorems, fitted parameters renamed as predictions, or self-citation chains appear in the derivation of the central claim. The results are direct statistical comparisons against external open-source implementations and classical baselines; the claim that GNNs often fail to outperform MLPs follows from those measurements rather than from any internal redefinition or tautological reduction. The paper is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Omics measurements can be converted into graphs whose nodes are genes or proteins and whose edges reflect known or inferred biological relationships.
invented entities (1)
- OgBench framework: no independent evidence
A Dataset Preprocessing
In-depth preprocessing steps for each dataset are included below.
A.1 Heritage
We downloaded the MoTrPAC HERITAGE SomaLogic proteomics matrix and an...
1. Rank all hyperparameter configurations by mean validation F1 across 3 random seeds.
2. Select the top-K configurations ($K \in \{1, 3, 5, 10\}$).
3. For each seed independently:
   - Load checkpoints for the top-K configs (all trained with that seed).
   - Obtain class probability predictions on the test set from each checkpoint.
   - Compute the ensemble prediction via soft voting: $\hat{y}_{\mathrm{ens}} = \arg\max_{y} \frac{1}{K} \sum_{k=1}^{K} p_k(y \mid x)$.
   - Compute test F1-macro for the ensemble.
4. Report mean ± std of ensemble test F1 across the 3 seeds.
Note that seeds remain independent: we ensemble within each seed's checkpoints and average performance across seeds, preserving valid uncertainty quantification.
D.3 Results
Figure 7 compares single-best-validation selection (K=1, black bars) against ensembles of increasing size. In Figure 8 we show K...
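The soft-voting and macro-F1 steps of the ensemble procedure can be sketched in a few lines. This is an illustrative version, not the authors' code: the probabilities and labels are made up, and `soft_vote` and `macro_f1` are hypothetical helper names.

```python
def soft_vote(prob_sets):
    """Average class probabilities over K models and take the argmax per sample.

    prob_sets[k][i][c] is the probability model k assigns to class c on sample i.
    """
    k = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    preds = []
    for i in range(n_samples):
        avg = [sum(model[i][c] for model in prob_sets) / k for c in range(n_classes)]
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores (0.0 for a class with no support or predictions)."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes

# Made-up predictions from K = 2 checkpoints on 3 test samples, 2 classes.
model_a = [[0.9, 0.1], [0.4, 0.6], [0.55, 0.45]]
model_b = [[0.8, 0.2], [0.3, 0.7], [0.2, 0.8]]
ensemble_pred = soft_vote([model_a, model_b])
ensemble_f1 = macro_f1([0, 1, 0], ensemble_pred, n_classes=2)
```

On the third sample the two checkpoints disagree, and averaging lets the more confident one decide, which is the practical difference between soft voting and a majority vote over hard labels.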