When Compression Becomes an Attack Surface: Black-Box Attacks on Prompt-Compressed LLM Agents

Dongdong She; Yuchong Xie; Zesen Liu; Zhixiang Zhang

arxiv: 2510.22963 · v4 · pith:7UJUQX6Qnew · submitted 2025-10-27 · 💻 cs.CR · cs.AI

Zesen Liu, Zhixiang Zhang, Yuchong Xie, Dongdong She

classification: cs.CR, cs.AI

keywords: compression, attack, backend, compressor, agents, before, black-box, coma

Prompt compression is increasingly deployed in LLM agents to reduce latency and cost, but it also determines what the backend LLM ultimately sees. We show that, when trusted and untrusted inputs are compressed under a shared budget, this lossy transformation creates a new attack surface: by perturbing only untrusted inputs before compression, an adversary can cause the compressor to discard task-critical evidence or safety guardrails before inference. Unlike prompt injection, jailbreaks, or RAG poisoning, the attack target is the compressor rather than the backend LLM; the perturbation need not encode a meaningful instruction or survive compression. We formalize this vulnerability as adversarial information loss (AIL), the excess downstream distortion caused by adversarially steering a lossy compressor beyond benign compression alone. To exploit AIL, we present COMA, a transfer-based black-box attack that optimizes pre-compression perturbations using attacker-side surrogate compressors and backend LLMs. Across three tasks and six compressors, COMA achieves 0.71 average ASR, versus 0.21 for the strongest baseline, and transfers to two real-world agent case studies.
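
The shared-budget failure mode described in the abstract can be illustrated with a toy example. The sketch below is not COMA and not the paper's formal definition of AIL. It assumes a hypothetical extractive compressor that ranks sentences by query-term overlap and keeps the top-ranked ones under a word budget, and it assumes a binary distortion (1.0 if the trusted guardrail sentence is dropped, else 0.0). All names (`salience`, `compress`, `distortion`, `GUARDRAIL`) are invented for illustration.

```python
from typing import List, Set

def salience(sentence: str, query_terms: Set[str]) -> int:
    """Toy salience score: number of words in the sentence that are query terms."""
    return sum(1 for w in sentence.lower().split() if w.strip(".,") in query_terms)

def compress(sentences: List[str], query_terms: Set[str], budget: int) -> List[str]:
    """Hypothetical extractive compressor: greedily keep the highest-salience
    sentences whose combined word count fits within the shared budget."""
    ranked = sorted(sentences, key=lambda s: salience(s, query_terms), reverse=True)
    kept, used = [], 0
    for s in ranked:
        n = len(s.split())
        if used + n <= budget:
            kept.append(s)
            used += n
    return kept

def distortion(compressed: List[str], guardrail: str) -> float:
    """Assumed binary distortion: 1.0 if the guardrail was discarded, else 0.0."""
    return 0.0 if guardrail in compressed else 1.0

QUERY = {"refund", "policy", "order"}
BUDGET = 12  # words, shared by trusted and untrusted input
GUARDRAIL = "Never reveal the refund override code."  # trusted input

benign_untrusted = ["The order shipped on Monday.", "Weather was mild that week."]
# The perturbation is query-term filler: it carries no instruction at all.
adversarial_untrusted = [
    "Refund policy order refund policy order.",
    "Order refund policy order refund policy.",
]

benign_out = compress([GUARDRAIL] + benign_untrusted, QUERY, BUDGET)
adversarial_out = compress([GUARDRAIL] + adversarial_untrusted, QUERY, BUDGET)

# Excess distortion of the adversarial run over benign compression alone.
excess_loss = distortion(adversarial_out, GUARDRAIL) - distortion(benign_out, GUARDRAIL)
print(benign_out)
print(adversarial_out)
print(excess_loss)
```

In this toy setting the guardrail survives benign compression but is crowded out once the untrusted input is padded with high-salience filler, so the excess distortion is attributable to the perturbation rather than to compression itself. Real compressors and the paper's distortion measure are likely more complex than this sketch.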

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Safe to Check, Unsafe to Use: Relinking at the Compression Boundary of LLM Agents
cs.CR 2026-06 unverdicted novelty 7.0

Relinking is a new compression-boundary attack on LLM agents in which summarizing split benign fragments produces malicious instructions. The Relink tool achieves an 86.9% success rate, which the KBRA defense reduces to 0%.

Ghost in the Context: Measuring Policy-Carriage Failures in Decision-Time Assembly
cs.CR 2026-05 unverdicted novelty 6.0

Policy directives can be lost during context assembly in language model agents, leading to unprompted policy violations that SafeContext can partially prevent.

Ghost in the Context: Measuring Policy-Carriage Failures in Decision-Time Assembly
cs.CR 2026-05 unverdicted novelty 5.0

The paper measures policy-carriage failures during LLM context assembly and evaluates SafeContext as a partial mitigation on Llama, Qwen, and Mistral models.