A Closer Look at System Prompt Robustness

Norman Mu et al. · 2025 · arXiv 2502.12197

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

When Alignment Isn't Enough: Response-Path Attacks on LLM Agents

cs.CR · 2026-05-04 · unverdicted · novelty 7.0

A malicious relay can strategically rewrite aligned LLM outputs in BYOK agent architectures to achieve up to 99.1% attack success on benchmarks like AgentDojo and ASB.

Library Hallucinations in LLM-Generated Code: A Risk Analysis Grounded in Developer Queries

cs.SE · 2025-09-26 · unverdicted · novelty 7.0

A study of seven LLMs finds that realistic prompt variations such as one-character misspellings trigger library hallucinations in up to 26% of cases, fabricated names in up to 99%, and time-based prompts in up to 85%, and introduces LibHalluBench for evaluation.

A Study of LLMs' Preferences for Libraries and Programming Languages

cs.SE · 2025-03-21 · unverdicted · novelty 6.0

An empirical study of eight LLMs finds overuse of popular libraries like NumPy in up to 45% of cases where they are unnecessary, and a strong default preference for Python even when it is suboptimal.

Political Plasticity: An Analysis of Ideological Adaptability in Large Language Models

cs.AI · 2026-05-08 · unverdicted · novelty 5.0

LLMs display political plasticity via prompt-driven ideological adaptation that is more reliable in larger, newer models, but inverted questions produce counterintuitive shifts that suggest data leakage.

The Pitfalls of KV Cache Compression

cs.LG · 2025-09-30 · conditional · novelty 5.0

KV cache compression causes certain instructions to degrade rapidly and be ignored in multi-instruction prompting, with system prompt leakage worsened by method choice, instruction order, and eviction bias; simple policy changes can mitigate this.

citing papers explorer

Showing 5 of 5 citing papers.

When Alignment Isn't Enough: Response-Path Attacks on LLM Agents · cs.CR · 2026-05-04 · unverdicted · none · ref 12
Library Hallucinations in LLM-Generated Code: A Risk Analysis Grounded in Developer Queries · cs.SE · 2025-09-26 · unverdicted · none · ref 35
A Study of LLMs' Preferences for Libraries and Programming Languages · cs.SE · 2025-03-21 · unverdicted · none · ref 54
Political Plasticity: An Analysis of Ideological Adaptability in Large Language Models · cs.AI · 2026-05-08 · unverdicted · none · ref 11
The Pitfalls of KV Cache Compression · cs.LG · 2025-09-30 · conditional · none · ref 7
