AAAC uses two adaptive 64-byte codebooks per layer for 4-bit LLM weight quantization, choosing the optimal one per group to minimize activation-weighted error with zero storage overhead and fast runtime.
2 Pith papers cite this work.
fields: cs.LG (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
BWLA is the first post-training quantization method for LLMs that achieves 1-bit weights paired with low-bit activations such as 6 bits, using OKT to reshape weights and suppress activation tails plus PSP for low-rank refinement.
citing papers explorer
- AAAC: Activation-Aware Adaptive Codebooks for 4-bit LLM Weight Quantization
- BWLA: Breaking the Barrier of W1AX Post-Training Quantization for LLMs