For a sequence with n = ⌊L/64⌋ key blocks, the integer top-k is chosen by (kn − k(k−1)/2) / (n(n+1)/2) ≈ f, rounded and clamped to [1, n]
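The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes f is the target fraction of query-key block pairs and L is the sequence length in tokens. It also assumes that "rounded" means solving the relation for a real-valued k and rounding to the nearest integer. The function names `choose_top_k` and `covered_fraction` are hypothetical. One observation supports this reading: the numerator kn − k(k−1)/2 equals the sum of min(k, i) for i = 1..n, which is the number of blocks kept when each of n causal query rows keeps at most k key blocks, and n(n+1)/2 is the total number of causal block pairs.

```python
import math


def covered_fraction(n: int, k: int) -> float:
    """Fraction of causal block pairs covered for a given top-k.

    Evaluates (k*n - k*(k-1)/2) / (n*(n+1)/2) from the text.
    """
    return (k * n - k * (k - 1) / 2) / (n * (n + 1) / 2)


def choose_top_k(seq_len: int, fraction: float, block: int = 64) -> int:
    """Pick an integer k whose covered fraction approximates `fraction`.

    Assumption: solve k*n - k*(k-1)/2 = fraction * n*(n+1)/2 for real k,
    then round and clamp to [1, n]. The exact rounding rule is not
    specified in the source.
    """
    n = seq_len // block  # n = floor(L / 64) key blocks
    if n < 1:
        raise ValueError("sequence is shorter than one key block")

    target = fraction * n * (n + 1) / 2
    # Rearranged: k^2 - (2n + 1) k + 2 * target = 0; take the smaller root,
    # which is the one lying in [0, n] for fraction in [0, 1].
    disc = (2 * n + 1) ** 2 - 8 * target
    k_real = ((2 * n + 1) - math.sqrt(max(disc, 0.0))) / 2

    return max(1, min(n, round(k_real)))


if __name__ == "__main__":
    L = 131072  # hypothetical sequence length, chosen only for illustration
    n = L // 64
    for f in (0.05, 0.10, 0.25):
        k = choose_top_k(L, f)
        print(f, k, covered_fraction(n, k))
```

Solving the quadratic in closed form avoids a search over k. A linear or binary search over integer k that minimizes the gap to f would be an equally reasonable reading of the rule.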

Target FP16 budgets are 5%, 10%, and 25% · 2048

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

ThriftAttention: Selective Mixed Precision for Long-Context FP4 Attention

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

ThriftAttention recovers 89.1% of the FP16 quality gap versus pure FP4 attention by running only 5% of query-key blocks in FP16 on long-context benchmarks.
