BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity. arXiv preprint arXiv:2507.08771, 2025

Chenyang Song, Weilin Zhao, Xu Han, Chaojun Xiao, Yingfa Chen, Yuxuan Li, Zhiyuan Liu, Maosong Sun · 2025 · arXiv 2507.08771

1 Pith paper cites this work. Polarity classification is still being indexed.


representative citing papers

dMoE: dLLMs with Learnable Block Experts

cs.CL · 2026-05-29 · unverdicted · novelty 6.0

dMoE aggregates token-level expert distributions to the block level in dLLMs, reducing the number of unique experts from 69.5 to 14.6, memory by 76-80%, and latency by a factor of 1.14-1.66 while retaining 99.11% of performance.

citing papers explorer

Showing 1 of 1 citing paper.

dMoE: dLLMs with Learnable Block Experts · cs.CL · 2026-05-29 · unverdicted · none · ref 21
