SHIELD reduces eDRAM refresh energy by 35% for LLM inference on edge NPUs by isolating sign/exponent from mantissa bits, disabling refresh on transient QO mantissas, and relaxing it on persistent KV mantissas while keeping accuracy intact.
Amd xdna npu in ryzen ai processors,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
SHIELD: A Segmented Hierarchical Memory Architecture for Energy-Efficient LLM Inference on Edge NPUs
SHIELD reduces eDRAM refresh energy by 35% for LLM inference on edge NPUs by isolating sign/exponent from mantissa bits, disabling refresh on transient QO mantissas, and relaxing it on persistent KV mantissas while keeping accuracy intact.