Interconnect Bandwidth Heterogeneity on AMD MI250x and Infinity Fabric

Carl Pearson · 2023 · arXiv 2302.14827

2 Pith papers cite this work. Polarity classification is still being indexed.



citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

DAK: Direct-Access-Enabled GPU Memory Offloading with Optimal Efficiency for LLM Inference

cs.DC · 2026-04-28 · unverdicted · novelty 6.0

DAK enables direct GPU access to remote memory for LLM inference via TMA repurposing and a greedy offloading algorithm, achieving up to 3x gains over prefetching baselines on NVLink-C2C and 1.8x on PCIe.

The Landscape of GPU-Centric Communication

cs.DC · 2024-09-15 · unverdicted · novelty 2.0

A survey categorizing vendor mechanisms and user-level libraries for GPU-centric communication within and across nodes, with discussion of benefits, challenges, and open questions.

citing papers explorer

Showing 2 of 2 citing papers.

DAK: Direct-Access-Enabled GPU Memory Offloading with Optimal Efficiency for LLM Inference

cs.DC · 2026-04-28 · unverdicted · none · ref 28

The Landscape of GPU-Centric Communication

cs.DC · 2024-09-15 · unverdicted · none · ref 92
