X-Boundary: Establishing exact safety boundary to shield LLMs from multi-turn jailbreaks without compromising usability. arXiv preprint arXiv:2502.09990

Lu, X · 2025 · arXiv 2502.09990

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MultiBreak: A Scalable and Diverse Multi-turn Jailbreak Benchmark for Evaluating LLM Safety

cs.CL · 2026-05-03 · unverdicted · novelty 6.0

MultiBreak is a large diverse multi-turn jailbreak benchmark that achieves substantially higher attack success rates on LLMs than prior datasets and reveals topic-specific vulnerabilities in multi-turn settings.

CoopGuard: Stateful Cooperative Agents Safeguarding LLMs Against Evolving Multi-Round Attacks

cs.CR · 2026-04-05 · unverdicted · novelty 6.0

CoopGuard deploys cooperative agents to track conversation history and counter evolving multi-round attacks on LLMs, achieving a 78.9% reduction in attack success rate on a new 5,200-sample benchmark.

Towards provable probabilistic safety for scalable embodied AI systems

eess.SY · 2025-06-05 · unverdicted · novelty 4.0

The paper proposes a paradigm of provable probabilistic safety to enable scalable, safe deployment of embodied AI in critical applications.

citing papers explorer

Showing 3 of 3 citing papers.

MultiBreak: A Scalable and Diverse Multi-turn Jailbreak Benchmark for Evaluating LLM Safety
cs.CL · 2026-05-03 · unverdicted · none · ref 67

CoopGuard: Stateful Cooperative Agents Safeguarding LLMs Against Evolving Multi-Round Attacks
cs.CR · 2026-04-05 · unverdicted · none · ref 18

Towards provable probabilistic safety for scalable embodied AI systems
eess.SY · 2025-06-05 · unverdicted · none · ref 150
