DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts
1 Pith paper cites this work.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
TENP: Trapezoidal Expert Neuron Pruning For Mixture-of-Experts
TENP applies trapezoidal expert-neuron pruning to MoE models, retaining key experts while pruning others via projected neuron contribution, yielding only a 1-point accuracy drop at 40% sparsity on DeepSeek, with a 10% code-generation gain.