Oats: Outlier-aware pruning through sparse and low rank decomposition. arXiv preprint arXiv:2409.13652
3 Pith papers cite this work.
fields: cs.LG (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- GPTQ-intrinsic LoRA: A Near-optimal Algorithm for Low-precision Quantization with Low-rank Adaptation
GPTQ-intrinsic LoRA augments GPTQ with intrinsic low-rank compensation via Hessian modification to achieve layer-wise reconstruction bounds that match information-theoretic lower bounds under structural assumptions.
- RT-Lynx: Putting the GEMM Sparsity In a Right Way for Diffusion Models
RT-Lynx shifts DiT sparsity from weights to activations and reports up to 1.55x linear-layer speedup while preserving generation quality across multiple diffusion models.
- ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity
ELAS pre-trains low-rank LLMs by applying 2:4 activation sparsity after squared ReLU to cut memory and accelerate training with minimal performance loss.