arxiv: 2604.17249 · v1 · submitted 2026-04-19 · 💻 cs.CR · cs.AR · cs.LG

Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems

Pith reviewed 2026-05-10 06:31 UTC · model grok-4.3

classification 💻 cs.CR · cs.AR · cs.LG
keywords bit-flip attacks, KV-cache, LLM serving, silent divergence, prefix caching, persistent damage, integrity protection, data corruption

The pith

Shared KV-cache blocks in LLM serving systems can be corrupted by bit flips, causing silent but persistent changes in responses for all requests using the same prefix.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that shared KV-cache blocks, stored as a single physical copy for prefix reuse in LLM serving, lack integrity protection and form a new target for bit-flip attacks. Software fault injection reveals that 13 of 16 BF16 bit positions produce coherent but incorrect outputs that cannot be told apart from normal responses without a baseline comparison. These changes affect only requests that share the corrupted prefix and persist without decay, so the total impact grows linearly with the number of requests that reuse the block. The authors demonstrate that adding a checksum check at scheduling time detects any single-bit change and limits damage to one batch, at negligible cost. Readers should care because production systems keep popular prefixes cached for long periods, creating an opportunity for undetected, accumulating degradation that differs from attacks on model weights.

Core claim

Shared KV-cache blocks exist as a single physical copy without integrity protection. Software fault injection under ideal bit targeting shows that 13 of 16 BF16 bit positions produce coherent but altered outputs that are indistinguishable from legitimate responses without a clean baseline. Only requests sharing the targeted prefix are affected, and the corruption persists with no temporal decay, so cumulative damage grows linearly with subsequent requests. This profile enables detection evasion and unchecked amplification bounded only by cache lifetime. A checksum-based countermeasure detects any single-bit corruption at scheduling time, bounding cumulative damage to one batch independent of the block's cache lifetime.
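
To make "bit position" concrete, the following is a minimal sketch of what flipping one bit of a single BF16 value does. It is not the authors' fault-injection harness. It assumes the standard BF16 layout (1 sign bit, 8 exponent bits, 7 mantissa bits, identical to the upper 16 bits of an IEEE 754 float32) and converts by truncation rather than rounding; the helper names are hypothetical.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern of x by truncating its float32 encoding.

    Assumption: truncation, not round-to-nearest, is good enough for a demo.
    """
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def bf16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return x


def flip_bit(x: float, position: int) -> float:
    """Flip one bit (0 = mantissa LSB, 15 = sign) of the BF16 encoding of x."""
    if not 0 <= position < 16:
        raise ValueError("BF16 has bit positions 0..15")
    return bf16_bits_to_float(float_to_bf16_bits(x) ^ (1 << position))


if __name__ == "__main__":
    value = 0.5  # exactly representable in BF16
    for position in range(15, -1, -1):
        print(f"bit {position:2d}: {value} -> {flip_bit(value, position)!r}")
```

Low mantissa bits perturb the value slightly, while the sign bit and the high exponent bits change it drastically. Which of these perturbations a model absorbs into a coherent but altered answer is the empirical question the paper measures; the sketch does not reproduce that result.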

What carries the argument

The shared KV-cache block as a single unprotected physical copy in prefix caching, subjected to software fault injection on individual BF16 bits to induce and characterize silent data corruption.
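
The "single physical copy" property can be illustrated with a toy prefix cache. This is a sketch under stated assumptions, not vLLM code: `ToyPrefixCache`, `Block`, `fake_kv_bytes`, and `inject_bit_flip` are hypothetical names, and a SHA-256 digest stands in for the KV tensor bytes a real model would compute.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    """One cached KV block; `data` stands in for the BF16 tensor bytes."""
    data: bytearray
    ref_count: int = 0


def fake_kv_bytes(prefix_tokens: tuple) -> bytes:
    """Deterministic placeholder for the KV tensor a real model would produce."""
    return hashlib.sha256(repr(prefix_tokens).encode()).digest()


class ToyPrefixCache:
    """Maps a prefix hash to one Block object shared by all matching requests."""

    def __init__(self) -> None:
        self._blocks = {}

    def acquire(self, prefix_tokens: tuple) -> Block:
        key = hashlib.sha256(repr(prefix_tokens).encode()).hexdigest()
        block = self._blocks.get(key)
        if block is None:
            block = Block(data=bytearray(fake_kv_bytes(prefix_tokens)))
            self._blocks[key] = block
        block.ref_count += 1
        return block


def inject_bit_flip(block: Block, byte_index: int, bit: int) -> None:
    """Software fault injection: flip one bit of the block in place."""
    block.data[byte_index] ^= 1 << bit


if __name__ == "__main__":
    cache = ToyPrefixCache()
    request_a = cache.acquire((1, 2, 3))
    request_b = cache.acquire((1, 2, 3))   # same prefix, same object
    request_c = cache.acquire((7, 8, 9))   # different prefix, separate block
    clean_c = bytes(request_c.data)

    inject_bit_flip(request_a, byte_index=0, bit=6)
    print("a and b share one block:", request_a is request_b)
    print("c untouched:", bytes(request_c.data) == clean_c)
```

Because both requests hold a reference to the same object, a flip applied once is seen by every later request with that prefix, while requests with other prefixes are unaffected. That is the selective, persistent behavior the paper describes.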

If this is right

  • Corrupted blocks cause silent divergence where outputs change coherently but incorrectly in 13 of 16 bit positions.
  • Corruption propagates selectively only to requests that share the targeted prefix.
  • Damage accumulates persistently with no decay, growing linearly as more requests use the block.
  • The proposed checksum detects single-bit errors at scheduling, limiting total damage to one batch regardless of how long the block stays cached.
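
The checksum idea in the last bullet can be sketched as follows. The paper's excerpt says only that integrity is verified by hash comparison; the hash function is not named here, so this sketch substitutes CRC-32 from Python's standard `zlib` module, which detects every single-bit error. `ChecksummedBlock` and `filter_at_scheduling` are hypothetical names, not vLLM APIs.

```python
import zlib


class ChecksummedBlock:
    """A cached block that records a checksum of its bytes when it is cached."""

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        # Assumption: CRC-32 stands in for whatever hash the authors use.
        self.checksum = zlib.crc32(self.data)

    def is_intact(self) -> bool:
        """Recompute the checksum and compare it with the recorded one."""
        return zlib.crc32(self.data) == self.checksum


def filter_at_scheduling(blocks):
    """Split blocks into (intact, corrupted) before a batch is scheduled.

    Corrupted blocks would be invalidated and recomputed instead of reused.
    """
    intact, corrupted = [], []
    for block in blocks:
        (intact if block.is_intact() else corrupted).append(block)
    return intact, corrupted


if __name__ == "__main__":
    good = ChecksummedBlock(bytes(range(64)))
    bad = ChecksummedBlock(bytes(range(64)))
    bad.data[10] ^= 1 << 3  # simulate a single bit flip after caching

    intact, corrupted = filter_at_scheduling([good, bad])
    print(len(intact), "intact,", len(corrupted), "corrupted")
```

Running the check before every scheduling step means a flipped block can only affect work that was already scheduled, which is how the damage stays within one batch. CRC-32 is not a cryptographic hash, so this sketch covers accidental or single-bit corruption only, not an adversary who can also alter the stored checksum.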

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Long-running conversations or repeated queries using common prefixes would experience escalating degradation if the cached block for that prefix is corrupted.
  • Without such protections, attackers could target high-traffic prefixes to affect many users over extended periods.
  • Similar integrity mechanisms might be needed for other shared structures in LLM systems like attention caches.
  • Validating these effects with actual hardware-based bit flips rather than software simulation would strengthen the findings.

Load-bearing premise

Software fault injection under ideal bit targeting accurately represents the effects and feasibility of real Rowhammer attacks on GPU DRAM in production LLM serving systems with shared prefix caches.

What would settle it

An experiment that either succeeds or fails to induce the described silent divergence, selective propagation, and persistent accumulation by performing actual Rowhammer attacks on GPU memory holding KV-cache blocks in an LLM serving system.

Figures

Figures reproduced from arXiv: 2604.17249 by Satoshi Matsuura, Yuji Yamamoto.

Figure 1. Trial flow for a single experimental condition. Each trial proceeds through four sequential phases; the warm-up …
Figure 2. Mean TCR by bit position, averaged across all five …
Figure 3. Per-request corruption rate c̄_i over 100 sequential requests, split by bit position (circles: Qwen3-8B; triangles: DeepSeek-R1). Shaded bands: mean ±1 SD (darker: Qwen3-8B; lighter: DeepSeek-R1).
  Text following the figure in the source (Section 6, Countermeasures): The linear damage growth established in Section 5.3 stems from the absence of integrity verification on cached blocks. Adding integrity verification, detecting and invalidating corrupted blocks …
Figure 4. Mean cumulative affected count C̄_N over 100 requests. Shaded regions: ±1 SD; dashed lines: OLS linear fits.
  Text following the figure in the source (Section 6.1, Detection Mechanism): To the best of our knowledge, no prior work has proposed runtime integrity verification of cached KV tensors. To meet these requirements, the mechanism verifies block integrity via hash comparison at two lifecycle events in vLLM's block pool. On cache: when a…
Original abstract

Rowhammer on GPU DRAM has enabled adversarial bit flips in model weights; shared KV-cache blocks in LLM serving systems present an analogous but previously unexamined target. In vLLM's Prefix Caching, these blocks exist as a single physical copy without integrity protection. Using software fault injection under ideal bit targeting, we characterize worst-case severity and identify three properties: (1) Silent divergence - 13 of 16 BF16 bit positions produce coherent but altered outputs, indistinguishable from legitimate responses without a clean baseline. (2) Selective propagation - only requests sharing the targeted prefix are affected. (3) Persistent accumulation - no temporal decay occurs, so cumulative damage grows linearly with subsequent requests. Together, these constitute a threat profile distinct from weight corruption: silent divergence and selective propagation enable detection evasion; persistent accumulation then proceeds unchecked, yielding damage amplification bounded only by how long the block remains cached. A checksum-based countermeasure detects any single-bit corruption at scheduling time, bounding cumulative damage to one batch independent of the block's cache lifetime, with negligible overhead. These results argue for integrity protection of prefix blocks before end-to-end exploitation is demonstrated.
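
The claim of linear growth and the one-batch bound can be written as a short worked relation. This is an illustrative model, not a formula taken from the paper: it assumes c_i denotes the corruption rate of the i-th request that reuses the corrupted block, C_N the cumulative damage after N such requests, and B a hypothetical batch size. The `\text` macro requires `amsmath`.

```latex
% Assumed notation (not from the paper): c_i is the corruption rate of the
% i-th request reusing the corrupted block; C_N is the cumulative damage.
\[
  C_N = \sum_{i=1}^{N} c_i ,
  \qquad
  c_i \approx c \ \text{for all } i \ \text{(no temporal decay)}
  \;\Longrightarrow\;
  C_N \approx c\,N .
\]
% With an integrity check at scheduling time, a corrupted block can serve at
% most the one batch already scheduled, of assumed size B:
\[
  C_N^{\text{checked}} \le B \quad \text{for all } N .
\]
```

Under this model the unprotected damage is limited only by how many requests arrive while the block stays cached, whereas the checked damage is a constant that does not depend on N.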

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript examines bit-flip vulnerabilities targeting shared KV-cache blocks in LLM serving systems such as vLLM's prefix caching. Using software fault injection to model ideal single-bit flips in BF16 KV entries, the authors characterize three properties: silent divergence (13 of 16 bit positions produce coherent but altered outputs indistinguishable without a baseline), selective propagation (only requests sharing the targeted prefix are affected), and persistent accumulation (no temporal decay, with damage growing linearly). They propose a checksum-based countermeasure that detects single-bit corruptions at scheduling time to bound damage to one batch, with negligible overhead. The work positions this as a distinct threat from weight corruption due to evasion and amplification potential.

Significance. If the ideal software fault injection results map to achievable physical attacks, the paper identifies a previously unexamined attack surface in production LLM serving with stealthy, accumulating effects on shared prefix caches. The empirical characterization of bit-position sensitivity and propagation behaviors provides concrete, reproducible data on worst-case severity under the modeled conditions, and the checksum mitigation is a practical, low-overhead defense that directly addresses the identified properties. These elements strengthen the case for integrity protections in KV caches.

major comments (1)
  1. [Abstract] The threat profile (silent divergence enabling detection evasion, selective propagation, and persistent accumulation leading to unbounded damage) and the motivation for the checksum countermeasure are load-bearing on the assumption that software fault injection under ideal bit targeting accurately represents feasible Rowhammer attacks on GPU DRAM. The manuscript provides no hardware validation, analysis of row adjacency, DRAM organization, refresh rates, or ECC effects that would establish whether precise, isolated flips in BF16 KV entries are possible without collateral damage; this leaves real-world exploitability and the distinctiveness of the threat profile unproven.
minor comments (2)
  1. [Abstract] The abstract states the checksum has 'negligible overhead' but does not reference a specific evaluation section, table, or quantitative results (e.g., latency or throughput impact) to support this claim.
  2. Clarify in the methods or results whether the 13/16 bit-position finding is specific to BF16 or generalizes, and include error bars or multiple runs for the fault-injection experiments to strengthen reproducibility.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for the constructive feedback. We address the major comment on the assumptions underlying our fault-injection model below.

Point-by-point responses
  1. Referee: The threat profile (silent divergence enabling detection evasion, selective propagation, and persistent accumulation leading to unbounded damage) and the motivation for the checksum countermeasure are load-bearing on the assumption that software fault injection under ideal bit targeting accurately represents feasible Rowhammer attacks on GPU DRAM. The manuscript provides no hardware validation, analysis of row adjacency, DRAM organization, refresh rates, or ECC effects that would establish whether precise, isolated flips in BF16 KV entries are possible without collateral damage; this leaves real-world exploitability and the distinctiveness of the threat profile unproven.

    Authors: We agree that the threat characterization depends on the modeled ideal single-bit flips. Our study employs software fault injection to quantify worst-case effects under precise targeting, a common methodology in early vulnerability analyses to surface potential attack surfaces before physical feasibility is established. We do not assert that such isolated flips are presently achievable on GPU DRAM for KV-cache blocks. We will revise the abstract, introduction, and threat-model section to explicitly qualify all results as holding under ideal bit-flip conditions and to discuss the additional hurdles (DRAM organization, refresh, ECC, and row-adjacency constraints) that would need to be overcome for a practical Rowhammer exploit. This revision will make clear that the distinctiveness of the threat profile is conditional on exploitability.
    Revision: partial

standing simulated objections not resolved
  • Hardware validation or detailed analysis of Rowhammer feasibility on GPU DRAM targeting KV-cache blocks

Circularity Check

0 steps flagged

No circularity; empirical characterization stands on its own

Full rationale

The paper presents an empirical study using software fault injection to characterize three properties of bit-flips in shared KV-cache blocks. No equations, fitted parameters, or derivations are invoked that reduce predictions to inputs by construction. No self-citations appear in the provided text as load-bearing for the central claims. The threat profile and countermeasure motivation follow directly from the experimental observations without self-referential reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that software fault injection models real hardware bit-flip effects and on the standard assumption that prefix caching shares physical blocks without integrity checks.

axioms (2)
  • domain assumption: Software fault injection under ideal bit targeting accurately models Rowhammer effects on GPU DRAM in LLM serving systems.
    Invoked to characterize worst-case severity without demonstrating physical attacks.
  • standard math: Prefix caching in vLLM maintains a single physical copy of KV blocks without integrity protection.
    Standard description of the system architecture used as baseline.

