JailAgent red-teams LLM agents by hijacking reasoning trajectories and tightening constraints without prompt changes, claiming strong cross-model and cross-scenario performance.
In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10542–10560
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Stop Fixating on Prompts: Reasoning Hijacking and Constraint Tightening for Red-Teaming LLM Agents