MSD enables cross-lingual safety transfer in LLMs via self-distillation with Dual-Perspective Safety Weighting, improving safety in low-resource languages without target response data.
The Thirteenth International Conference on Learning Representations , year=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it