Robots detect underspecified reward features via demonstration variation and query targeted natural language explanations to improve reward recovery from imperfect demos.
Correcting robot plans with natural language feedback
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
QuickLAP fuses LLM-extracted language observations with physical feedback in a closed-form Bayesian update to cut reward learning error by over 70% in a driving simulator and improve user preference in a 15-person study.
citing papers explorer
-
Robots That Know What to Ask: Recovering Misaligned Rewards through Targeted Explanations
Robots detect underspecified reward features via demonstration variation and query targeted natural language explanations to improve reward recovery from imperfect demos.
-
QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents
QuickLAP fuses LLM-extracted language observations with physical feedback in a closed-form Bayesian update to cut reward learning error by over 70% in a driving simulator and improve user preference in a 15-person study.