Malla: Demystifying Real-world Large Language Model In- tegrated Malicious Services

Zilong Lin, Jian Cui, Xiaojing Liao, XiaoFeng Wang · 2024 · arXiv 2401.03315

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

cs.CR · 2023-08-07 · unverdicted · novelty 6.0

Real-world jailbreak prompts collected from the wild achieve up to 0.95 attack success rates against major LLMs including GPT-4, with some persisting for over 240 days.

Gemma 2: Improving Open Language Models at a Practical Size

cs.CL · 2024-07-31 · conditional · novelty 3.0

Gemma 2 models achieve leading performance at their sizes by combining established Transformer modifications with knowledge distillation for the 2B and 9B variants.

citing papers explorer

Showing 2 of 2 citing papers.

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models cs.CR · 2023-08-07 · unverdicted · none · ref 44
Real-world jailbreak prompts collected from the wild achieve up to 0.95 attack success rates against major LLMs including GPT-4, with some persisting for over 240 days.
Gemma 2: Improving Open Language Models at a Practical Size cs.CL · 2024-07-31 · conditional · none · ref 32
Gemma 2 models achieve leading performance at their sizes by combining established Transformer modifications with knowledge distillation for the 2B and 9B variants.

Malla: Demystifying Real-world Large Language Model In- tegrated Malicious Services

fields

years

verdicts

representative citing papers

citing papers explorer