TAH-Quant introduces tile-wise adaptive Hadamard quantization for activations in pipeline parallelism, achieving 3-4 bit compression with up to 4.3x throughput speedup and O(1/sqrt(T)) convergence matching SGD.
Decentralized training of foundation models in heterogeneous environments.Advances in Neural Information Processing Systems, 35:25464–25477
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network
TAH-Quant introduces tile-wise adaptive Hadamard quantization for activations in pipeline parallelism, achieving 3-4 bit compression with up to 4.3x throughput speedup and O(1/sqrt(T)) convergence matching SGD.