How Powerful are Graph Neural Networks?
Mixed citation behavior. Most common role is background (56%).
abstract
Graph Neural Networks (GNNs) are an effective framework for representation learning of graphs. GNNs follow a neighborhood aggregation scheme, where the representation vector of a node is computed by recursively aggregating and transforming representation vectors of its neighboring nodes. Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite GNNs revolutionizing graph representation learning, there is limited understanding of their representational properties and limitations. Here, we present a theoretical framework for analyzing the expressive power of GNNs to capture different graph structures. Our results characterize the discriminative power of popular GNN variants, such as Graph Convolutional Networks and GraphSAGE, and show that they cannot learn to distinguish certain simple graph structures. We then develop a simple architecture that is provably the most expressive among the class of GNNs and is as powerful as the Weisfeiler-Lehman graph isomorphism test. We empirically validate our theoretical findings on a number of graph classification benchmarks, and demonstrate that our model achieves state-of-the-art performance.
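The neighborhood aggregation scheme and the Weisfeiler-Lehman (WL) test mentioned in the abstract can be illustrated with a minimal sketch of 1-WL color refinement. This is not the paper's code: the function name `wl_histogram`, the adjacency-dict input format, and the example graphs are assumptions chosen for illustration. Each round, a node's color is replaced by the pair (its own color, the sorted multiset of its neighbors' colors), which mirrors recursive aggregation over neighbors. Two graphs whose final color histograms differ are certainly non-isomorphic; equal histograms are inconclusive.

```python
from collections import Counter


def wl_histogram(adj, rounds=3):
    """Run 1-WL color refinement and return the histogram of final colors.

    adj: dict mapping each node to a list of its neighbors (hypothetical
    input format chosen for this sketch). Colors are kept as nested tuples
    so that histograms from different graphs can be compared directly.
    """
    colors = {v: () for v in adj}  # every node starts with the same color
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


# A 6-cycle and two disjoint triangles: both are 2-regular on 6 nodes,
# so 1-WL assigns every node the same color in both graphs.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}

# A 4-node path and a 4-node star: their degree multisets differ,
# so the first refinement round already separates them.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

same = wl_histogram(cycle6) == wl_histogram(two_triangles)
different = wl_histogram(path4) != wl_histogram(star4)
```

The nested-tuple colors grow quickly with the number of rounds, so this form only suits small graphs; practical implementations usually relabel or hash the colors after each round.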
citing papers explorer
-
Any-Dimensional Invariant Universality
A systematic approach maps any-dimensional invariant functions to a unique function on an infinite-dimensional limit space admitting a topology with compact sets where universality holds, with examples of non-universal architectures and fixes.
-
Hyper-Dimensional Fingerprints as Molecular Representations
Hyperdimensional fingerprints use algebraic operations on high-dimensional vectors to create training-free molecular representations that preserve similarity better than Morgan fingerprints at low dimensions and improve downstream tasks like property prediction and Bayesian optimization.
-
Gauge-Equivariant Graph Neural Networks for Lattice Gauge Theories
Gauge-equivariant graph neural networks embed non-Abelian local symmetries directly into message passing for lattice gauge theories, enabling learning of nonlocal observables from local operations.
-
BadImplant: Injection-based Multi-Targeted Graph Backdoor Attack
BadImplant is the first multi-targeted backdoor attack on GNN graph classification that uses subgraph injection to achieve high success rates on multiple target labels with minimal clean accuracy loss.
-
Weisfeiler-Leman Is Incomplete on Simple Spectrum Graphs, so Canonicalize Them
k-WL is incomplete on simple spectrum graphs; PRiSM is the first provably complete canonicalization for their eigendecompositions.
-
ConTact: Contact-First Antibody CDR Design via Explicit Interface Reasoning
ConTact decomposes CDR design into surface fingerprint learning, contact prediction, and contact-gated sequence generation using distance-biased attention and weighted loss, reporting 7% RMSD and 10% F1 gains on CHIMERA-Bench.
-
1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job?
Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.
-
Matrix-Space Reinforcement Learning for Reusing Local Transition Geometry
MSRL represents trajectory segments as PSD matrices to prove additive composition properties and bootstrap value functions for better transfer, reaching 0.73 AUC versus 0.57-0.65 baselines.
-
Topology-Preserving Neural Operator Learning via Hodge Decomposition
Hodge Spectral Duality provides a topology-preserving neural operator by isolating unlearnable topological components via Hodge orthogonality and operator splitting.
-
scShapeBench: Discovering geometry from high dimensional scRNAseq data
scShapeBench supplies synthetic and real annotated single-cell datasets across four shape categories, with scReebTower outperforming PAGA and Mapper on topology-aware metrics.
-
TopoU-Net: a U-Net architecture for topological domains
TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.
-
CTQWformer: A CTQW-based Transformer for Graph Classification
CTQWformer fuses continuous-time quantum walks into a graph transformer and recurrent module to outperform standard GNNs and graph kernels on classification benchmarks.
-
Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning
SoftBlobGIN combines ESM-2 representations with protein contact graphs via a lightweight GNN and differentiable substructure pooling to achieve 92.8% accuracy on enzyme classification, raise binding-site AUROC to 0.983, and generate auditable structural explanations without retraining the language model.
-
Graph Neural Networks in the Wilson Loop Representation of Abelian Lattice Gauge Theories
A gauge-invariant GNN using Wilson loops as inputs accurately predicts observables and simulates dynamics in Z2 and U(1) lattice gauge models.
-
LUMINA: A Grid Foundation Model for Benchmarking AC Optimal Power Flow Surrogate Learning
LUMINA-Bench is a standardized evaluation framework for ACOPF surrogate models that tests generalization across multiple grid topologies using accuracy and physics-constraint metrics.
-
PiGGO: Physics-Guided Learnable Graph Kalman Filters for Virtual Sensing of Nonlinear Dynamic Structures under Uncertainty
PiGGO integrates a learned graph neural ODE as the continuous-time dynamics model within an extended Kalman filter to enable online virtual sensing and uncertainty-aware state estimation for nonlinear dynamic systems with unknown model form and sparse sensing.
-
Break the Optimization Barrier of LLM-Enhanced Recommenders: A Theoretical Analysis and Practical Framework
TF-LLMER resolves optimization barriers in LLM-enhanced recommenders through embedding normalization and Rec-PCA that aligns semantic representations with collaborative co-occurrence graphs.
-
Concept Graph Convolutions: Message Passing in the Concept Space
Concept Graph Convolutions perform message passing on node concepts to increase interpretability of graph neural networks without losing task performance.
-
Modeling Higher-Order Brain Interactions via a Multi-View Information Bottleneck Framework for fMRI-based Psychiatric Diagnosis
A tri-view information-bottleneck model that fuses pairwise, triadic and tetradic O-information outperforms eleven baselines on four fMRI psychiatric datasets while revealing region-level synergy-redundancy patterns.
-
Beyond Nodes vs. Edges: A Multi-View Fusion Framework for Provenance-Based Intrusion Detection
PROVFUSION fuses three complementary views of provenance data with lightweight schemes and voting to achieve higher detection accuracy and lower false positives than node- or edge-only baselines on nine benchmarks.
-
R2G: A Multi-View Circuit Graph Benchmark Suite from RTL to GDSII
R2G is a multi-view circuit graph benchmark showing that representation choice affects GNN accuracy more than model architecture, with node-centric views and deeper decoders performing best.
-
Graph Topology Information Enhanced Heterogeneous Graph Representation Learning
ToGRL learns high-quality graph structures from raw heterogeneous graphs via a two-stage topology extraction process and prompt tuning, outperforming prior methods on five datasets.
-
DSBD: Dual-Aligned Structural Basis Distillation for Graph Domain Adaptation
DSBD distills a dual-aligned structural basis to adapt GNNs across graphs with structural distribution shifts, outperforming prior methods on benchmarks.
-
Complex-Valued GNNs for Distributed Basis-Invariant Control of Planar Systems
Complex-valued GNNs using phase-equivariant activations achieve global basis invariance for distributed planar control, outperforming real-valued baselines in data efficiency, tracking, and generalization on flocking.
-
GraphScout: Empowering Large Language Models with Intrinsic Exploration Ability for Agentic Graph Reasoning
GraphScout trains LLMs to autonomously synthesize structured training data from knowledge graphs via flexible exploration tools, enabling a 4B model to outperform larger LLMs by 16.7% on average with fewer inference tokens and strong cross-domain transfer.
-
Graph Property Inference in Small Language Models: Effects of Representation and Reasoning Strategy
Small instruction-tuned language models cannot reliably estimate graph-theoretic properties from textual encodings, though adjacency-list formats and multi-branch reasoning reduce errors relative to edge lists and single-path inference.
-
DisRFM: Polar Riemannian Flow Matching for Structure-Preserving Graph Domain Adaptation
DisRFM uses polar Riemannian flow matching on constant-curvature manifolds to align graph domains while preserving label-relevant topology via radial Wasserstein and angular confidence matching.
-
Cross-Paradigm Graph Backdoor Attacks with Promptable Subgraph Triggers
CP-GBA distills a queryable repository of promptable subgraph triggers via graph prompt learning to achieve transferable backdoor attacks on GNNs with state-of-the-art success rates across paradigms and defenses.
-
AgForce Enables Antigen-conditioned Generative Antibody Design
AgForce improves antigen-conditioned antibody design by using framework dropout, gated bottlenecks, hyperbolic cross attention, MDN sequence head with Potts-like coupling, annealed MCL, and antigen cycle consistency to achieve 8% better amino acid recovery and superior binding metrics on CHIMERA-BEN
-
EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation
EvoStruct integrates evolutionary priors from a protein language model with structural priors from an E(3)-equivariant GNN to raise amino acid recovery by 16% and diversity by 2.3x on CHIMERA-Bench while cutting perplexity 43%.
-
Multi-level Self-supervised Pretraining on Compositional Hierarchical Graph for Molecular Property Prediction
MolCHG uses a multi-level compositional hierarchical graph with atom-bond cross-view contrastive learning, functional group prediction, and structure tasks to achieve top results on seven of nine MoleculeNet benchmarks.
-
Rethinking Efficient Graph Coarsening via a Non-Selfishness Principle
NOPE coarsens graphs via neighborhood interference rather than selfish pairwise matching, reaching linear memory and near-linear time; the NOPE* variant delivers 1.8-10x speedups and comparable or better learning results than full graphs or LLM reasoning.
-
Learning the Interaction Prior for Protein-Protein Interaction Prediction: A Model-Agnostic Approach
L3-PPI reformulates PPI pair classification as graph classification over a prompt graph with controlled virtual L3 paths to inject the biological interaction prior and boost performance on existing models.
-
Fairness of Explanations in Artificial Intelligence (AI): A Unifying Framework, Axioms, and Future Direction toward Responsible AI
A conditional invariance framework defines explanation fairness as explanations being statistically independent of protected attributes given task-relevant features, unifying existing metrics and enabling procedural bias audits.
-
Quantum Injection Pathways for Implicit Graph Neural Networks
Independent quantum signal injection into graph DEQs yields higher test accuracy and fewer solver iterations than state-dependent or backbone-dependent injection and classical equilibrium models on NCI1, PROTEINS, and MUTAG benchmarks.
-
GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model
GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.
-
H3: A Healthcare Three-Hop Index for Physician Referral Network Prediction
H3 is a new three-hop index that predicts physician referrals using normalized indirect pathways and outperforms heuristics and neural nets on Medicare shared-patient data in both within-period and cross-period settings.
-
TransXion: A High-Fidelity Graph Benchmark for Realistic Anti-Money Laundering
TransXion supplies a 3-million-transaction graph benchmark with profile-aware normal activity and stochastic illicit subgraphs that produces lower detection scores than prior AML datasets.
-
GraphDC: A Divide-and-Conquer Multi-Agent System for Scalable Graph Algorithm Reasoning
GraphDC applies divide-and-conquer multi-agent LLM reasoning to graph algorithms by decomposing graphs into subgraphs for local agents and integrating via a master agent, outperforming direct methods especially on large scales.
-
UniDetect: LLM-Driven Universal Fraud Detection across Heterogeneous Blockchains
UniDetect is an LLM-based system that generates universal transaction summary texts and uses two-stage multimodal training on text plus graphs to detect fraudulent accounts across heterogeneous blockchains, outperforming baselines by 5.57-7.58% KS and achieving over 94.58% zero-shot cross-chain and
-
NOSE: Neural Olfactory-Semantic Embedding with Tri-Modal Orthogonal Contrastive Learning
NOSE aligns molecular, receptor, and linguistic modalities in a shared embedding space via tri-modal orthogonal contrastive learning and weak positive samples, achieving SOTA performance and zero-shot generalization on olfactory tasks.
-
ML for the hKLM at the 2nd Detector
Graph neural networks trained on simulated hits outperform classical methods for energy resolution, timing, and particle identification in an iron-scintillator sampling calorimeter, with an integrated multi-objective optimization framework for design tradeoffs.
-
BiScale-GTR: Fragment-Aware Graph Transformers for Multi-Scale Molecular Representation Learning
BiScale-GTR achieves claimed state-of-the-art results on MoleculeNet, PharmaBench and LRGB by combining improved fragment tokenization with a parallel GNN-Transformer architecture that operates at both atom and fragment scales.
-
MMP-Refer: Multimodal Path Retrieval-augmented LLMs For Explainable Recommendation
MMP-Refer augments LLMs with multimodal retrieval paths and a trainable collaborative adapter to produce more accurate and explainable recommendations.
-
FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics
FlexMS is a new flexible benchmarking framework that lets researchers dynamically combine deep learning architectures and evaluate their mass spectrum prediction performance on public metabolomics datasets using multiple metrics and retrieval tasks.
-
Fed-Listing: Federated Label Distribution Inference in Graph Neural Networks
Fed-Listing infers client label proportions in FedGNNs from final-layer gradients, outperforming baselines on four datasets and three architectures even in non-i.i.d. settings.
-
How Wide and How Deep? Mitigating Over-Squashing of GNNs via Channel Capacity Constrained Estimation
C3E estimates hidden dimensions and depths for GNNs by treating them as communication channels to reduce over-squashing and improve representation learning.
-
Adaptive Canonicalization with Application to Invariant Anisotropic Geometric Networks
Adaptive canonicalization selects input canonical forms by maximizing network predictive confidence to yield continuous symmetry-preserving models with universal approximation for equivariant geometric networks.
-
Feature Augmentation of GNNs for ILPs: Local Uniqueness Suffices
Local d-hop uniqueness in GNN node features matches global UID expressiveness for ILP solving while providing stronger generalization.
-
Pretraining a Foundation Model for Small-Molecule Natural Products
NaFM is a pretrained foundation model for natural products using scaffold-focused contrastive learning and masked graph objectives that achieves SOTA on taxonomy classification, gene/microbial analysis, and virtual screening tasks.