Risk-sensitive inverse reinforcement learning via coherent risk models
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.LG (1)
years: 2019 (1)
verdicts: UNVERDICTED (1)
representative citing papers
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference
Learning the demonstrator's planning algorithm via a differentiable planner improves IRL reward inference over incorrect bias assumptions but underperforms exact planners.