STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning

Lee, Jaeseong, Hwang, Seung-won, Qiao, Aurick, Campos, Daniel F, Yao, Zhewei, He, Yuxiong · 2025 · DOI 10.18653/v1/2025.acl-long.671

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference

cs.LG · 2026-06-09 · unverdicted · novelty 6.0

SpenseGPT introduces a hybrid sparse-dense weight format and a one-shot pruning method that deliver a 1.2x end-to-end LLM decoding speedup on B200 GPUs with FP8 while preserving accuracy on Qwen3-32B and Seed-OSS-36B.

TENP: Trapezoidal Expert Neuron Pruning For Mixture-of-Experts

cs.LG · 2026-06-03 · unverdicted · novelty 6.0

TENP applies trapezoidal expert-neuron pruning to MoE models, retaining key experts while pruning others via projected neuron contribution, yielding only a 1-point accuracy drop at 40% sparsity on DeepSeek, with a 10% code-generation gain.

citing papers explorer

SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference · cs.LG · 2026-06-09 · unverdicted · none · ref 13
TENP: Trapezoidal Expert Neuron Pruning For Mixture-of-Experts · cs.LG · 2026-06-03 · unverdicted · none · ref 10
