arXiv preprint arXiv:2512.07059

Replicating TEMPEST at Scale: Multi-Turn Adversarial Attacks Against Trillion-Parameter Frontier Models · arXiv 2512.07059

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

CHASE: Adversarial Red-Blue Teaming for Improving LLM Safety using Reinforcement Learning

cs.CL · 2026-06-04 · unverdicted · novelty 6.0

CHASE uses co-evolutionary RL with GRPO to harden LLMs against black-box prompt-rewriting attacks, cutting mean StrongREJECT scores by 43.2% on held-out families while keeping zero false refusals on benign prompts.

citing papers explorer

Showing 1 of 1 citing paper.

CHASE: Adversarial Red-Blue Teaming for Improving LLM Safety using Reinforcement Learning cs.CL · 2026-06-04 · unverdicted · none · ref 22
