IDK-based Methods (’I don’t know’)

Loss     cGA  cFG  cRT  cFGKL  cRTKL
GA       1    0    0    0      0
GA+RT    1    0    1    0      0
GA+KL    1    0    0    0      1
IDK+RT   0    1    1    0      0

Table 2: The weights for different components in GA-based loss functions, IDK+RT loss · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

cs.LG · 2024-04-08 · conditional · novelty 8.0

NPO enables stable unlearning of 50%+ training data in LLMs on TOFU by making collapse exponentially slower than gradient ascent, preserving sensible outputs where prior methods fail.

citing papers explorer

Showing 1 of 1 citing paper.

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning cs.LG · 2024-04-08 · conditional · none · ref 33
