Cast: Continuous and differentiable semi-structured sparsity-aware training for large language models. arXiv, abs/2509.25996

Weiyu Huang, Yuezhou Hu, Jun Zhu, Jianfei Chen · arXiv 2509.25996

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

SparseForge: Efficient Semi-Structured LLM Sparsification via Annealing of Hessian-Guided Soft-Mask

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

SparseForge achieves 57.27% zero-shot accuracy on LLaMA-2-7B at 2:4 sparsity using only 5B retraining tokens, beating the dense baseline and nearly matching a 40B-token SOTA method.

citing papers explorer

Showing 1 of 1 citing paper.

SparseForge: Efficient Semi-Structured LLM Sparsification via Annealing of Hessian-Guided Soft-Mask

cs.LG · 2026-05-07 · unverdicted · none · ref 16
