PREFINE adapts Direct Preference Optimization to trajectory-level preferences in RL for joint reward retention and safety alignment in continuous domains.
Christiano, Jan Leike, Tom B
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
PREFINE: Preference-Based Implicit Reward and Cost Fine-Tuning for Safety Alignment
PREFINE adapts Direct Preference Optimization to trajectory-level preferences in RL for joint reward retention and safety alignment in continuous domains.