‘No, Alexa, no!’: designing child-safe AI and protecting children from the risks of the ‘empathy gap’ in large language models

Nomisha Kurian · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Optimus: A Robust Defense Framework for Mitigating Toxicity while Fine-Tuning Conversational AI

cs.CR · 2025-07-08 · unverdicted · novelty 6.0

Optimus mitigates toxicity during LLM fine-tuning by combining repurposed LLM safety alignments for detection with synthetic data and DPO alignment, remaining effective even with highly biased classifiers and against attacks.

citing papers explorer

Showing 1 of 1 citing paper.

Optimus: A Robust Defense Framework for Mitigating Toxicity while Fine-Tuning Conversational AI
cs.CR · 2025-07-08 · unverdicted · none · ref 43
