IO-aware GPU kernels for SpMM convolutions, degree-aware reductions, and fused attention layers deliver median speedups of 1.6-2.6x (up to 10x) and memory reductions up to 76x over DGL/PyG baselines on realistic graphs.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
On Efficient Scaling of GNNs via IO-Aware Layers Implementations