Goal Component

Professional & Institutional Exploitation: Simulates advanced threats within professional domains such as law, finance, and the military, including corporate espionage and systemic fraud.

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

ARES discovers dual vulnerabilities in LLMs and reward models via adaptive adversarial prompt composition and repairs them through sequential fine-tuning of the reward model followed by policy optimization.

citing papers explorer

Showing 1 of 1 citing paper.

ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System

cs.AI · 2026-04-20 · unverdicted · none · ref 9
