Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CL (1)
Years: 2024 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
BEExformer: A Fast Inferencing Binarized Transformer with Early Exits
BEExformer integrates binarization-aware training using a second-order sign approximation and entropy-based early exits with SLFN to achieve 21.3x size reduction, 52% fewer FLOPs, and 3.22% accuracy gain on NLP tasks.
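The entropy-based early exit mentioned in this summary can be illustrated with a minimal sketch. It assumes a plain absolute-entropy threshold applied to each block's exit logits, which may differ from the criterion the paper actually uses. The names `early_exit_forward`, `exit_logits_per_block`, and `threshold` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit_forward(exit_logits_per_block, threshold):
    """Return (block_index, probs) for the first block whose prediction
    entropy falls below `threshold`; fall back to the last block.

    Assumption: each block exposes classifier logits, and a low entropy
    is treated as "confident enough to stop". This is an illustrative
    rule, not the paper's confirmed exit criterion.
    """
    if not exit_logits_per_block:
        raise ValueError("need logits from at least one block")
    probs = None
    for index, logits in enumerate(exit_logits_per_block):
        probs = softmax(logits)
        if entropy(probs) < threshold:
            return index, probs  # confident: skip the remaining blocks
    return len(exit_logits_per_block) - 1, probs
```

Under this rule, inputs that are classified confidently leave the network at an early block, which is where a FLOPs saving would come from. Inputs that stay uncertain pass through every block.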