BadEdit: Backdooring large language models by model editing

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

fields

cs.CR 5

years

2026 3

2024 2

verdicts

UNVERDICTED 5

roles

dataset 1

polarities

use dataset 1

representative citing papers

Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers

cs.CR · 2026-04-23 · unverdicted · novelty 6.0

BadStyle creates stealthy backdoors in LLMs by poisoning samples with imperceptible style triggers and using an auxiliary loss to stabilize payload injection, achieving high attack success rates across multiple models while evading defenses.

citing papers explorer

Showing 5 of 5 citing papers.