Athena: Safe Autonomous Agents with Verbal Contrastive Learning

Tanmana Sadhu, Ali Pesaranghader, Yanan Chen, Dong Hoon Yi · 2024 · DOI 10.18653/v1/2024.emnlp-industry.84

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Membrane: A Self-Evolving Contrastive Safety Memory for LLM Agent Defense

cs.CR · 2026-06-04 · unverdicted · novelty 6.0

A contrastive memory system evolves without retraining to defend LLM agents against jailbreaks, achieving top F1 scores and low benign refusal on HarmBench and AgentHarm benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

Membrane: A Self-Evolving Contrastive Safety Memory for LLM Agent Defense cs.CR · 2026-06-04 · unverdicted · none · ref 57
