pith. sign in

Max , Quantizing for minimum distortion , Trans

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

years

2026 2

verdicts

UNVERDICTED 2

clear filters

representative citing papers

UltraQuant: 4-bit KV Caching for Context-Heavy Agents

cs.LG · 2026-06-18 · unverdicted · novelty 4.0

UltraQuant applies 4-bit KV caching with TurboQuant-style rotation and custom AMD kernels to context-heavy agent workloads, delivering 3.47x P50 TTFT reduction in cache-pressured rounds and 1.63x throughput gain over FP8.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • UltraQuant: 4-bit KV Caching for Context-Heavy Agents cs.LG · 2026-06-18 · unverdicted · none · ref 19

    UltraQuant applies 4-bit KV caching with TurboQuant-style rotation and custom AMD kernels to context-heavy agent workloads, delivering 3.47x P50 TTFT reduction in cache-pressured rounds and 1.63x throughput gain over FP8.