STQuant dynamically allocates quantization bits for optimizer states in multimodal model training, reducing memory by 84.4% to an average 5.1 bits while preserving quality on GPT-2 and ViT.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
STQuant: Spatio-Temporal Adaptive Framework for Optimizer Quantization in Large Multimodal Model Training
STQuant dynamically allocates quantization bits for optimizer states in multimodal model training, reducing memory by 84.4% to an average 5.1 bits while preserving quality on GPT-2 and ViT.