Scaling laws for neural language models
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2023 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
DeepInception: Hypnotize Large Language Model to Be Jailbreaker
DeepInception constructs nested virtual scenes to induce continuous jailbreaks in LLMs, achieving high harmfulness rates on Llama-2/3, GPT-3.5/4, and GPT-4o.