TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: physics.comp-ph (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
FusionRCG: Orchestrating Recursive Computation Graphs across GPU Memory Hierarchies
FusionRCG uses liveness-aware graph orchestration, Cartesian-to-spherical fusion, and multi-tier kernels to cut intermediate data by up to 7.7x and deliver 3.09x SCF speedup on A100 GPUs.