Sirius: Contextual sparsity with correction for efficient LLMs. arXiv preprint, abs/2409.03856, 2024

Yang Zhou, Zhuoming Chen, Zhaozhuo Xu, Victoria Lin, Beidi Chen · 2024 · arXiv 2409.03856

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

Continual LLM Upcycling: A Predictor-Gated Bank-Wise Sparsity Training Recipe for Dense-to-Sparse LLMs

cs.CL · 2026-06-09 · unverdicted · novelty 5.0

A continual training recipe upcycles a dense Qwen2.5-8B LLM into a 4x channel-sparse model via predictor-gated bank-wise sparsity in the SwiGLU FFN, with a single-layer repair for a long-context failure on RULER-CWE.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Continual LLM Upcycling: A Predictor-Gated Bank-Wise Sparsity Training Recipe for Dense-to-Sparse LLMs

cs.CL · 2026-06-09 · unverdicted · none · ref 7
