Jailbreak chat

Alex Albert · 2023

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

cs.CR · 2024-03-28 · accept · novelty 6.0

JailbreakBench supplies an evolving set of jailbreak prompts, a 100-behavior dataset aligned with usage policies, a standardized evaluation framework, and a leaderboard to enable comparable assessments of attacks and defenses on LLMs.

citing papers explorer

Showing 1 of 1 citing paper.

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models cs.CR · 2024-03-28 · accept · none · ref 4
JailbreakBench supplies an evolving set of jailbreak prompts, a 100-behavior dataset aligned with usage policies, a standardized evaluation framework, and a leaderboard to enable comparable assessments of attacks and defenses on LLMs.

Jailbreak chat

fields

years

verdicts

representative citing papers

citing papers explorer