Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks

Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song · 2019 · cs.LG · arXiv 1909.01315

25 Pith papers cite this work. Polarity classification is still indexing.

abstract

Advancing research in the emerging field of deep graph learning requires new tools to support tensor computation over graphs. In this paper, we present the design principles and implementation of Deep Graph Library (DGL). DGL distills the computational patterns of GNNs into a few generalized sparse tensor operations suitable for extensive parallelization. By advocating graph as the central programming abstraction, DGL can perform optimizations transparently. By cautiously adopting a framework-neutral design, DGL allows users to easily port and leverage the existing components across multiple deep learning frameworks. Our evaluation shows that DGL significantly outperforms other popular GNN-oriented frameworks in both speed and memory consumption over a variety of benchmarks and has little overhead for small scale workloads.
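As a rough illustration of what the abstract means by distilling GNN computation into generalized sparse tensor operations, the sketch below shows that sum-aggregation message passing over an edge list gives the same result as multiplying a sparse adjacency matrix by the node-feature matrix. It is plain Python and does not use DGL's API; the function names `aggregate_by_edges` and `aggregate_by_spmm` and the toy graph are assumptions made up for this example.

```python
# Illustrative sketch only: plain Python, not DGL's API.
# All names below are hypothetical and chosen for this example.

def aggregate_by_edges(num_nodes, edges, feats):
    """Message passing view: each edge (src, dst) sends feats[src] to dst,
    and every destination node sums the messages it receives."""
    dim = len(feats[0])
    out = [[0.0] * dim for _ in range(num_nodes)]
    for src, dst in edges:
        for k in range(dim):
            out[dst][k] += feats[src][k]
    return out


def aggregate_by_spmm(num_nodes, edges, feats):
    """Sparse-matrix view: compute A @ X, where A[dst][src] = 1 for each edge.
    A is stored row-wise as a list of source indices per destination node."""
    rows = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        rows[dst].append(src)
    dim = len(feats[0])
    return [
        [sum((feats[s][k] for s in rows[d]), 0.0) for k in range(dim)]
        for d in range(num_nodes)
    ]


# Toy directed graph: 0 -> 1, 0 -> 2, 1 -> 2, with 2-dimensional node features.
edges = [(0, 1), (0, 2), (1, 2)]
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

by_edges = aggregate_by_edges(3, edges, feats)
by_spmm = aggregate_by_spmm(3, edges, feats)
print(by_edges)  # [[0.0, 0.0], [1.0, 2.0], [4.0, 6.0]]
print(by_edges == by_spmm)  # True
```

The two functions agree because summing incoming messages per destination is exactly a row of the adjacency-times-features product, which is why this pattern can be handed to a parallel sparse kernel.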

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

What drives performance in molecular MPNNs? An operator-level factorial benchmark

cond-mat.mtrl-sci · 2026-05-28 · unverdicted · novelty 7.0

Operator-level factorial benchmark of 84 MPNN configurations finds message-seed initialization and node-edge fusion drive performance on MoleculeNet tasks more than node updates.

Ocean: Fast Estimation-Based Sparse General Matrix-Matrix Multiplication on GPU

cs.DC · 2026-04-21 · unverdicted · novelty 7.0

Ocean uses HyperLogLog estimators to skip the costly symbolic phase of GPU SpGEMM, pairs it with dynamic workflow choice and a shared-plus-global hash accumulator, and reports 1.4-2.8x speedups over prior GPU implementations.

AsyncSparse: Accelerating Sparse Matrix-Matrix Multiplication on Asynchronous GPU Architectures

cs.DC · 2026-04-20 · unverdicted · novelty 7.0

AsyncSparse presents BCSR and WCSR kernels that use TMA and warp specialization to accelerate SpMM, outperforming prior libraries by 1.47-6.24x on SuiteSparse and achieving 2.66x end-to-end speedup on Qwen2.5-7B at 90% block sparsity.

How Hard Is It for Message-Passing GNNs to Simulate One Weisfeiler-Lehman Color-Refinement Step?

cs.LG · 2024-10-02 · unverdicted · novelty 7.0

Oblivious MPGNNs cannot simulate WL color refinement with shallow depth and small messages without randomness; bounded-error randomness enables logarithmic resources for large color sets, while small color sets force layer-message trade-offs.

How Attentive are Graph Attention Networks?

cs.LG · 2021-05-30 · conditional · novelty 7.0

GAT uses static attention where neighbor rankings ignore the query node and thus cannot express some graph problems; GATv2 enables dynamic attention and outperforms GAT on 11 OGB and other benchmarks with equal parameters.

Reducing the GPU Memory Bottleneck with Lossless Compression for ML -- Extended

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

IBP is a new lossless bit-packing algorithm with GPU-optimized decompression that speeds up GNN training by 74%, DLRM lookups by 180%, and LLM inference by 24% by reducing CPU-GPU data movement.

Robust Multimodal Recommendation via Graph Retrieval-Enhanced Modality Completion

cs.IR · 2026-05-01 · unverdicted · novelty 6.0

GRE-MC retrieves relevant subgraphs and uses a graph transformer plus sparse codebook to complete missing modalities, outperforming prior methods on recommendation benchmarks.

LogosKG: Hardware-Optimized Scalable and Interpretable Knowledge Graph Retrieval

cs.CL · 2026-04-20 · unverdicted · novelty 6.0

LogosKG delivers a novel hardware-aligned system for efficient multi-hop retrieval on billion-edge knowledge graphs without sacrificing fidelity, demonstrated via biomedical KG-LLM applications.

Scalable and Adaptive Parallel Training of Graph Transformer on Large Graphs

cs.DC · 2026-04-17 · unverdicted · novelty 6.0

A new distributed framework for graph transformer training auto-selects parallel strategies and optimizes sparse operations to deliver up to 6x speedup on 8 GPUs and 78% memory reduction.

Modern Structure-Aware Simplicial Spatiotemporal Neural Network

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

ModernSASST is the first simplicial complex-based spatiotemporal model that combines random walks on high-dimensional complexes with parallelizable temporal convolutional networks for efficient high-order topology capture.

Cluster Attention for Graph Machine Learning

cs.LG · 2026-04-08 · unverdicted · novelty 6.0

Cluster attention uses off-the-shelf community detection to define attention scopes within graph clusters, augmenting MPNNs and Graph Transformers to achieve larger receptive fields with preserved structural inductive biases and improved performance on diverse graph datasets.

Communication-free Sampling and 4D Hybrid Parallelism for Scalable Mini-batch GNN Training

cs.LG · 2026-04-03 · unverdicted · novelty 6.0

ScaleGNN uses communication-free sampling and 4D parallelism to scale mini-batch GNN training to 2048 GPUs, achieving 3.5x speedup over prior state-of-the-art on ogbn-products.

FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

cs.AI · 2026-02-26 · unverdicted · novelty 6.0

FlexMS is a new flexible benchmarking framework that lets researchers dynamically combine deep learning architectures and evaluate their mass spectrum prediction performance on public metabolomics datasets using multiple metrics and retrieval tasks.

SHIRO: Near-Optimal Communication Strategies for Distributed Sparse Matrix Multiplication

cs.DC · 2025-12-23 · unverdicted · novelty 6.0

SHIRO achieves geometric mean speedups of 221.5x to 8.8x over four baselines in distributed SpMM on up to 128 GPUs by exploiting sparsity patterns and two-tier network topologies.

Torch Geometric Pool: the PyTorch library for pooling in Graph Neural Networks

cs.LG · 2025-12-14 · accept · novelty 6.0

A new open-source library standardizes 20 hierarchical graph pooling operations under one SRCL interface with uniform outputs and batch handling for PyTorch Geometric.

Detecting LLM-Generated Spam Reviews by Integrating Language Model Embeddings and Graph Neural Network

cs.CL · 2025-10-02 · unverdicted · novelty 6.0

Introduces FraudSquad, a hybrid model using language model embeddings and a gated graph transformer that outperforms baselines on newly created LLM-generated spam review datasets.

Modal Decomposition and Identification for a Population of Structures Using Physics-Informed Graph Neural Networks and Transformers

cs.CE · 2025-05-06 · unverdicted · novelty 6.0

A physics-informed GNN-transformer model performs unsupervised modal decomposition and identification for populations of structures from sparse dynamic measurements.

Pretrained Event Classification Model for High Energy Physics Analysis

hep-ph · 2024-12-14 · unverdicted · novelty 6.0

A GNN pretrained on 120M simulated HEP events generalizes to unseen processes and ATLAS data; fine-tuning boosts accuracy especially with small datasets, with CKA showing preserved encoders but altered intermediate layers.

Learning Spatial-Preserving Hierarchical Representations for Digital Pathology

cs.CV · 2024-06-13 · unverdicted · novelty 6.0

SPAN is a hierarchical attention framework that constructs multi-scale pyramid representations from single-scale patch inputs for WSI classification and segmentation while preserving spatial relationships.

Graph Transductive Sharpening: Leveraging Unlabeled Predictions in Node Classification

cs.LG · 2026-05-18 · unverdicted · novelty 5.0

Transductive Sharpening adds an entropy-minimization term on unlabeled-node predictions to the training objective for graph node classification.

GreenDyGNN: Runtime-Adaptive Energy-Efficient Communication for Distributed GNN Training

cs.DC · 2026-04-25 · unverdicted · novelty 5.0

GreenDyGNN applies Double-DQN to adapt cache management in distributed GNN training, cutting energy by up to 43% under congestion versus static policies.

TabEmb: Joint Semantic-Structure Embedding for Table Annotation

cs.LG · 2026-04-21 · unverdicted · novelty 5.0

TabEmb decouples LLM-based semantic column embeddings from graph-based structural modeling to produce joint representations that improve table annotation tasks.

AutoGraphAD: Unsupervised network anomaly detection using Variational Graph Autoencoders

cs.CR · 2025-11-21 · unverdicted · novelty 5.0

AutoGraphAD applies a heterogeneous variational graph autoencoder with unsupervised and contrastive learning to detect network anomalies on connection-IP graphs without labeled data, achieving comparable performance to Anomal-E with over an order of magnitude faster training and inference.

G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge

cs.AI · 2025-09-29 · unverdicted · novelty 5.0

G-reasoner uses QuadGraph abstraction and a 34M-parameter graph foundation model integrated with LLMs to enable scalable reasoning over diverse graph-structured knowledge, outperforming baselines on six benchmarks.

citing papers explorer

Showing 2 of 2 citing papers after filters.

FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

cs.AI · 2026-02-26 · unverdicted · none · ref 52 · internal anchor

FlexMS is a new flexible benchmarking framework that lets researchers dynamically combine deep learning architectures and evaluate their mass spectrum prediction performance on public metabolomics datasets using multiple metrics and retrieval tasks.

G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge

cs.AI · 2025-09-29 · unverdicted · none · ref 43 · internal anchor

G-reasoner uses QuadGraph abstraction and a 34M-parameter graph foundation model integrated with LLMs to enable scalable reasoning over diverse graph-structured knowledge, outperforming baselines on six benchmarks.
