A practical guide to multi-objective reinforcement learning and planning (C. F. Hayes et al.)
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: method 1
Citation-polarity summary
Fields: cs.AI 2
Years: 2026 2
Verdicts: UNVERDICTED 2
Roles: method 1
Polarities: use method 1
Representative citing papers
FATE lets LLM agents self-evolve safer behaviors by generating and filtering repairs from their own failure trajectories using verifiers and Pareto optimization.
citing papers explorer
- Multi-Objective Constraint Inference using Inverse reinforcement learning
MOCI jointly infers shared constraints and individual preferences from heterogeneous expert trajectories via multi-objective inverse reinforcement learning and outperforms baselines on grid-world predictive performance.
- On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
FATE lets LLM agents self-evolve safer behaviors by generating and filtering repairs from their own failure trajectories using verifiers and Pareto optimization.