PREFINE adapts Direct Preference Optimization to trajectory-level preferences in RL for joint reward retention and safety alignment in continuous domains.
Safety gymnasium: A unified safe reinforcement learning benchmark.Advances in Neural Information Processing Systems, 36:18964–18993, 2023
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
SafeVLA applies constrained reinforcement learning via CMDP min-max optimization to VLAs, cutting safety violation costs by 83.58% while preserving task success on long-horizon mobile manipulation tasks.
citing papers explorer
-
PREFINE: Preference-Based Implicit Reward and Cost Fine-Tuning for Safety Alignment
PREFINE adapts Direct Preference Optimization to trajectory-level preferences in RL for joint reward retention and safety alignment in continuous domains.
-
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
SafeVLA applies constrained reinforcement learning via CMDP min-max optimization to VLAs, cutting safety violation costs by 83.58% while preserving task success on long-horizon mobile manipulation tasks.