Mandar Joshi, Eunsol Choi, Daniel Weld, and Luke Zettlemoyer

https://arxiv · arXiv 2511.06719

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale Deployment

cs.LG · 2026-03-16 · unverdicted · novelty 6.0

MobileLLM-Flash creates 350M-1.4B parameter LLMs via latency-guided search and attention skipping, delivering up to 1.8x faster prefill and 1.6x faster decode on mobile CPUs with comparable or better quality.

citing papers explorer

Showing 1 of 1 citing paper.

MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale Deployment

cs.LG · 2026-03-16 · unverdicted · none · ref 7
