Quantifying the Capabilities of LLMs across Scale and Precision
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Hardware Generation and Exploration of Lookup Table-Based Accelerators for 1.58-bit LLM Inference
A formalized design-space framework with generator and TSMC 16nm-validated cost model shows that LUT reuse gains depend on activation type and that larger cores improve density, yielding 2.2x area reduction over multiplier baselines.
- K-Quantization and its Impact on Output Performance
Empirical evaluation of quantization effects on eight LLMs across bit widths, showing performance generally declines at lower precision but with model-size-dependent resilience and acceptable accuracy at 2 bits for many cases.