CoSpaDi: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Post-training LLM compression often relies on low-rank approximations, which force all columns of a projection matrix to share a single low-dimensional subspace. We propose CoSpaDi, a training-free compression framework that replaces this single-subspace assumption with a union-of-subspaces model via sparse dictionary learning. CoSpaDi factorizes each weight matrix into a dense dictionary and column-sparse coefficients, allowing different columns to select different subsets of dictionary atoms at the same storage budget. To preserve model behavior, we use calibration activations to transform functional reconstruction into a standard dictionary learning problem. Across Llama and Qwen models, CoSpaDi improves accuracy–compression and perplexity–compression trade-offs over SVD-based and structured pruning baselines at 20–40% compression ratios, while naturally supporting sparse–dense computation and post-training quantization of sparse coefficients.
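The step from functional reconstruction to standard dictionary learning can be illustrated with a short derivation. This is only a sketch of one common way to do it, not necessarily the paper's exact formulation. All symbols are assumptions introduced here: $W \in \mathbb{R}^{d \times n}$ is a weight matrix, $X \in \mathbb{R}^{N \times d}$ stacks calibration activations as rows, $D \in \mathbb{R}^{d \times k}$ is the dictionary, $S \in \mathbb{R}^{k \times n}$ holds the coefficients with at most $s$ nonzeros per column, and the Gram matrix $X^\top X$ is assumed positive definite so that it has a Cholesky factor $L$.

```latex
% Functional (output-space) objective under the assumed notation
\min_{D,\,S}\ \lVert XW - XDS \rVert_F^2
\quad \text{s.t.} \quad \lVert S_{:,j} \rVert_0 \le s \ \ \text{for all } j .

% Rewrite the error through the Gram matrix X^T X = L L^T
\lVert X (W - DS) \rVert_F^2
  = \operatorname{tr}\!\big( (W - DS)^\top X^\top X \, (W - DS) \big)
  = \lVert L^\top (W - DS) \rVert_F^2 .

% Substituting \tilde{W} = L^T W and \tilde{D} = L^T D gives a standard
% sparse dictionary learning problem in the transformed space
\min_{\tilde{D},\,S}\ \lVert \tilde{W} - \tilde{D} S \rVert_F^2
\quad \text{s.t.} \quad \lVert S_{:,j} \rVert_0 \le s ,
\qquad D = L^{-\top} \tilde{D} .
```

Under these assumptions any off-the-shelf dictionary learning solver can be run on $\tilde{W}$, and the dictionary is mapped back to the original weight space through $L^{-\top}$ afterwards.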
fields
cs.CV 1
years
2026 1
verdicts
UNVERDICTED 1
representative citing papers
- Motion-Compensated Weight Compression
MCWC aligns permutation-symmetric blocks across layers to enable sequential prediction and residual entropy coding, improving rate-accuracy tradeoffs versus quantization and prior codecs on language and vision models.