Title resolution pending

URL https://arxiv · arXiv 2507.14392

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Understanding and Improving Communication Performance in Multi-node LLM Inference

cs.DC · 2025-11-12 · conditional · novelty 5.0

A performance analysis of multi-node LLM inference identifies all-reduce bottlenecks and introduces NVRAR, a hierarchical all-reduce that achieves 1.9-3.6x lower latency than NCCL and up to a 1.72x reduction in end-to-end batch latency for Llama 3.1 405B in decode-heavy tensor-parallel workloads.

citing papers explorer

Showing 1 of 1 citing paper.

Understanding and Improving Communication Performance in Multi-node LLM Inference

cs.DC · 2025-11-12 · conditional · none · ref 18
