Operator-level factorial benchmark of 84 MPNN configurations finds message-seed initialization and node-edge fusion drive performance on MoleculeNet tasks more than node updates.
Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks
25 Pith papers cite this work. Polarity classification is still indexing.
abstract
Advancing research in the emerging field of deep graph learning requires new tools to support tensor computation over graphs. In this paper, we present the design principles and implementation of Deep Graph Library (DGL). DGL distills the computational patterns of GNNs into a few generalized sparse tensor operations suitable for extensive parallelization. By advocating graph as the central programming abstraction, DGL can perform optimizations transparently. By cautiously adopting a framework-neutral design, DGL allows users to easily port and leverage the existing components across multiple deep learning frameworks. Our evaluation shows that DGL significantly outperforms other popular GNN-oriented frameworks in both speed and memory consumption over a variety of benchmarks and has little overhead for small scale workloads.
citation-role summary
citation-polarity summary
roles
background 2
polarities
background 2
representative citing papers
Ocean uses HyperLogLog estimators to skip the costly symbolic phase of GPU SpGEMM, pairs it with dynamic workflow choice and a shared-plus-global hash accumulator, and reports 1.4-2.8x speedups over prior GPU implementations.
AsyncSparse presents BCSR and WCSR kernels that use TMA and warp specialization to accelerate SpMM, outperforming prior libraries by 1.47-6.24x on SuiteSparse and achieving 2.66x end-to-end speedup on Qwen2.5-7B at 90% block sparsity.
Oblivious MPGNNs cannot simulate WL color refinement with shallow depth and small messages without randomness; bounded-error randomness enables logarithmic resources for large color sets, while small color sets force layer-message trade-offs.
GAT uses static attention where neighbor rankings ignore the query node and thus cannot express some graph problems; GATv2 enables dynamic attention and outperforms GAT on 11 OGB and other benchmarks with equal parameters.
IBP is a new lossless bit-packing algorithm with GPU-optimized decompression that speeds up GNN training by 74%, DLRM lookups by 180%, and LLM inference by 24% by reducing CPU-GPU data movement.
GRE-MC retrieves relevant subgraphs and uses a graph transformer plus sparse codebook to complete missing modalities, outperforming prior methods on recommendation benchmarks.
LogosKG delivers a novel hardware-aligned system for efficient multi-hop retrieval on billion-edge knowledge graphs without sacrificing fidelity, demonstrated via biomedical KG-LLM applications.
A new distributed framework for graph transformer training auto-selects parallel strategies and optimizes sparse operations to deliver up to 6x speedup on 8 GPUs and 78% memory reduction.
ModernSASST is the first simplicial complex-based spatiotemporal model that combines random walks on high-dimensional complexes with parallelizable temporal convolutional networks for efficient high-order topology capture.
Cluster attention uses off-the-shelf community detection to define attention scopes within graph clusters, augmenting MPNNs and Graph Transformers to achieve larger receptive fields with preserved structural inductive biases and improved performance on diverse graph datasets.
ScaleGNN uses communication-free sampling and 4D parallelism to scale mini-batch GNN training to 2048 GPUs, achieving 3.5x speedup over prior state-of-the-art on ogbn-products.
FlexMS is a new flexible benchmarking framework that lets researchers dynamically combine deep learning architectures and evaluate their mass spectrum prediction performance on public metabolomics datasets using multiple metrics and retrieval tasks.
SHIRO achieves geometric mean speedups of 221.5x to 8.8x over four baselines in distributed SpMM on up to 128 GPUs by exploiting sparsity patterns and two-tier network topologies.
A new open-source library standardizes 20 hierarchical graph pooling operations under one SRCL interface with uniform outputs and batch handling for PyTorch Geometric.
Introduces FraudSquad, a hybrid model using language model embeddings and a gated graph transformer that outperforms baselines on newly created LLM-generated spam review datasets.
A physics-informed GNN-transformer model performs unsupervised modal decomposition and identification for populations of structures from sparse dynamic measurements.
A GNN pretrained on 120M simulated HEP events generalizes to unseen processes and ATLAS data; fine-tuning boosts accuracy especially with small datasets, with CKA showing preserved encoders but altered intermediate layers.
SPAN is a hierarchical attention framework that constructs multi-scale pyramid representations from single-scale patch inputs for WSI classification and segmentation while preserving spatial relationships.
Transductive Sharpening adds an entropy-minimization term on unlabeled-node predictions to the training objective for graph node classification.
GreenDyGNN applies Double-DQN to adapt cache management in distributed GNN training, cutting energy by up to 43% under congestion versus static policies.
TabEmb decouples LLM-based semantic column embeddings from graph-based structural modeling to produce joint representations that improve table annotation tasks.
AutoGraphAD applies a heterogeneous variational graph autoencoder with unsupervised and contrastive learning to detect network anomalies on connection-IP graphs without labeled data, achieving comparable performance to Anomal-E with over an order of magnitude faster training and inference.
G-reasoner uses QuadGraph abstraction and a 34M-parameter graph foundation model integrated with LLMs to enable scalable reasoning over diverse graph-structured knowledge, outperforming baselines on six benchmarks.
citing papers explorer
- FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics
  FlexMS is a new flexible benchmarking framework that lets researchers dynamically combine deep learning architectures and evaluate their mass spectrum prediction performance on public metabolomics datasets using multiple metrics and retrieval tasks.
- G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge
  G-reasoner uses QuadGraph abstraction and a 34M-parameter graph foundation model integrated with LLMs to enable scalable reasoning over diverse graph-structured knowledge, outperforming baselines on six benchmarks.