LOOKAT: Lookup-Optimized Key-Attention for Memory-Efficient Transformers

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.LG (2)

years: 2026 (2)

representative citing papers

AXELRAM: Quantize Once, Never Dequantize

cs.LG · 2026-04-03 · conditional · novelty 6.0

AXELRAM performs attention on quantized KV cache using a fixed orthogonal-transform codebook, reducing multiplications by 102.4x and fixing sign-sensitivity spikes via gradient-free calibration.

citing papers explorer

Showing 2 of 2 citing papers.

  • AXELRAM: Quantize Once, Never Dequantize · cs.LG · 2026-04-03 · conditional · none · ref 8

    AXELRAM performs attention on quantized KV cache using a fixed orthogonal-transform codebook, reducing multiplications by 102.4x and fixing sign-sensitivity spikes via gradient-free calibration.

  • HeadQ: Model-Visible Distortion and Score-Space Correction for KV-Cache Quantization · cs.LG · 2026-05-05 · unverdicted · none · ref 24 · 2 links

    HeadQ applies score-space logit corrections for keys and attention-weighted surrogates for values to KV-cache quantization, removing 84-94% of excess perplexity in 2-bit key experiments across six models.