Bridge typically reduces All-to-All completion time by 3x to 10x and improves AllReduce by up to 6.6x over Ring by reusing optical subrings across multiple steps in reconfigurable networks.
5 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (2)
Citation polarities: background (2)
citing papers explorer
- UCCL-Zip: Lossless Compression Supercharged GPU Communication
  UCCL-Zip adds lossless compression to GPU communication to reduce LLM bottlenecks while preserving exact numerical correctness.
- LatencyScope: A System-Level Mathematical Framework for 5G RAN Latency
  LatencyScope models 5G RAN latency sources across the protocol stack and provides a configuration analyzer that identifies settings meeting latency-reliability targets, validated on open-source testbeds and commercial network measurements where it outperforms prior models and simulators.
- Revisiting Bruck: Phase-Efficient All-to-All Communication in Reconfigurable Networks
  ReTri achieves all-to-all in ⌈log₃ n⌉ phases for ORNs by co-designing bidirectional exchanges and reconfiguration strategy, with simulations showing up to 10× improvement over static and 2.1× over prior reconfigurable Bruck.
- DBLP: Phase-Aware Bounded-Loss Transport for Burst-Resilient Distributed ML Training
  DBLP is a training-phase-aware bounded-loss transport protocol that reduces end-to-end distributed ML training time by 24.4% on average (up to 33.9%) and achieves up to 5.88x communication speedup during microbursts while maintaining comparable test accuracy.