Sequential DPO produces varied effects on prior preferences (partial degradation, stability, pair-level redistribution, or positive transfer) depending on objective relationships rather than uniform forgetting.
and Sreedhar, Makesh Narsimhan and Kuchaiev, Oleksii, booktitle =
1 Pith paper cites this work.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Beyond Uniform Forgetting: A Study of Sequential Direct Preference Optimization Across Preference Settings