Harmonized benchmarking of expert-guided RL methods on continuous control tasks surfaces three failure modes and produces a testable decision rule for method selection keyed on expert quality, task termination, and perturbation type.
International Conference on Learning Representations (ICLR)
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.AI (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: baseline (1)
polarities: baseline (1)
representative citing papers
The paper decomposes simulator value errors into identifiable shifts and irreducible residuals, shows passive learning fails on reachability, and introduces Fisher-SEP to minimize posterior value variance via targeted experiments.
citing papers explorer
- When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning
  Harmonized benchmarking of expert-guided RL methods on continuous control tasks surfaces three failure modes and produces a testable decision rule for method selection keyed on expert quality, task termination, and perturbation type.
- Mind the Sim-to-Real Gap & Think Like a Scientist
  The paper decomposes simulator value errors into identifiable shifts and irreducible residuals, shows passive learning fails on reachability, and introduces Fisher-SEP to minimize posterior value variance via targeted experiments.