BadThink: Triggered overthinking attacks on chain-of-thought reasoning in large language models. arXiv preprint arXiv:2511.10714
3 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- OTora: A Unified Red Teaming Framework for Reasoning-Level Denial-of-Service in LLM Agents
  OTora provides the first unified framework for reasoning-level denial-of-service attacks on LLM agents, achieving up to 10x more reasoning tokens and order-of-magnitude latency increases while preserving task accuracy across multiple agent types and models.
- Red Teaming Large Reasoning Models
  RT-LRM benchmark finds Large Reasoning Models more fragile than standard LLMs to risks like CoT-hijacking and prompt-induced issues.
- Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models
  An external zero-shot monitor detects nine unsafe reasoning behaviors in LLMs at 87% step-level accuracy with low false positives and low latency.