Slim-Llama: A 4.69mW large-language-model processor with binary/ternary weights for billion-parameter Llama model

· 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

VitaLLM: A Versatile and Tiny Accelerator for Mixed-Precision LLM Inference on Edge Devices

cs.AR · 2026-05-01 · unverdicted · novelty 7.0

VitaLLM demonstrates a 16nm silicon prototype accelerator achieving 72.46 tokens/s decode for 3B ternary LLMs in 0.214 mm² area with reduced KV cache traffic via predictive sparse attention.

citing papers explorer

Showing 1 of 1 citing paper.

VitaLLM: A Versatile and Tiny Accelerator for Mixed-Precision LLM Inference on Edge Devices

cs.AR · 2026-05-01 · unverdicted · none · ref 3

VitaLLM demonstrates a 16nm silicon prototype accelerator achieving 72.46 tokens/s decode for 3B ternary LLMs in 0.214 mm² area with reduced KV cache traffic via predictive sparse attention.
