Reddebate: Safer responses through multi-agent red teaming debates. arXiv preprint arXiv:2506.11083

Ali Asad, Stephen Obadinma, Radin Shayanfar, Xiaodan Zhu · arXiv 2506.11083

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

representative citing papers

AgenticEval: Toward Agentic and Self-Evolving Safety Evaluation of Large Language Models

cs.AI · 2025-09-30 · unverdicted · novelty 6.0

AgenticEval is a multi-agent framework that ingests unstructured policies to generate and self-evolve comprehensive safety benchmarks for LLMs, with experiments showing declining safety rates as tests harden.

citing papers explorer

Showing 1 of 1 citing paper.

AgenticEval: Toward Agentic and Self-Evolving Safety Evaluation of Large Language Models

cs.AI · 2025-09-30 · unverdicted · none · ref 1

AgenticEval is a multi-agent framework that ingests unstructured policies to generate and self-evolve comprehensive safety benchmarks for LLMs, with experiments showing declining safety rates as tests harden.
