A lifecycle-based survey of LLM fine-tuning security that reviews attacks and defenses by intervention phase and reports unified empirical findings on model-dependent attack effectiveness and limited defense generalization.
BEEAR: Embedding-based adversarial removal of safety back- doors in instruction-tuned language models
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Security in the Fine-Tuning Lifecycle of Large Language Models: Threats, Defenses,Evaluation, and Future Directions
A lifecycle-based survey of LLM fine-tuning security that reviews attacks and defenses by intervention phase and reports unified empirical findings on model-dependent attack effectiveness and limited defense generalization.