MMA routes host-GPU transfers over multiple available paths to deliver 4.62x higher peak bandwidth and lower latencies in LLM serving without hardware or driver changes.
In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23), pages 87–101
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.DC 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
MultiPath Memory Access: Breaking Host-GPU Bandwidth Bottlenecks in LLM Services
MMA routes host-GPU transfers over multiple available paths to deliver 4.62x higher peak bandwidth and lower latencies in LLM serving without hardware or driver changes.