Beyond size: How gradients shape pruning decisions in large language models

Rocktim Jyoti Das, Mingjie Sun, Liqun Ma, Zhiqiang Shen · 2023 · arXiv 2311.04902

2 Pith papers cite this work. Polarity classification is still being indexed.


representative citing papers

MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on LLMs

cs.LG · 2025-06-15 · unverdicted · novelty 6.0

MaskPro learns categorical distributions over groups of M weights to generate exact (N:M) sparsity via N-way sampling without replacement and stabilizes training with a moving average tracker of loss residuals.

RAP: Runtime Adaptive Pruning for LLM Inference

cs.LG · 2025-05-22 · unverdicted · novelty 5.0

RAP is a reinforcement learning framework for runtime-adaptive pruning of LLMs that jointly optimizes model weights and KV-cache usage under varying memory budgets.

citing papers explorer

Showing 2 of 2 citing papers.

MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on LLMs · cs.LG · 2025-06-15 · unverdicted · none · ref 7
RAP: Runtime Adaptive Pruning for LLM Inference · cs.LG · 2025-05-22 · unverdicted · none · ref 9
