NCCLbpf adds a verified eBPF runtime to NCCL plugins, enabling safe, composable policies and hot reloads without downtime. It reports up to 27% higher AllReduce throughput in the 4-128 MiB range, with 80-130 ns of overhead.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.DC (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
- NCCLbpf: Verified, Composable Policy Execution for GPU Collective Communication