Learning a prior over intent via meta-inverse reinforcement learning

Xu, K · 2018 · arXiv 1805.12573

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents

cs.AI · 2025-11-22 · unverdicted · novelty 6.0 · 2 refs

QuickLAP fuses LLM-extracted language observations with physical feedback in a closed-form Bayesian update to cut reward learning error by over 70% in a driving simulator and improve user preference in a 15-person study.

On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference

cs.LG · 2019-06-23 · unverdicted · novelty 6.0

Learning the demonstrator's planning algorithm via a differentiable planner improves IRL reward inference over incorrect bias assumptions but underperforms exact planners.

citing papers explorer

Showing 2 of 2 citing papers.

QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents
cs.AI · 2025-11-22 · unverdicted · none · ref 63 · 2 links
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference
cs.LG · 2019-06-23 · unverdicted · none · ref 34
