Fast on-device LLM inference with NPUs

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.AR (1), cs.NI (1)

years: 2026 (2)

verdicts: UNVERDICTED (2)

representative citing papers

NPU Design for Diffusion Language Model Inference

cs.AR · 2026-01-28 · unverdicted · novelty 8.0

Introduces the first NPU accelerator for diffusion language models (dLLMs), with a dLLM-specific ISA, a hardware execution model, BAOS KV quantization, and 7 nm RTL synthesis.

citing papers explorer

Showing 1 of 1 citing paper after filters.