Multi-Advisor Reinforcement Learning

2017 · cs.LG · arXiv 1704.00756

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

We consider tackling a single-agent RL problem by distributing it to $n$ learners. These learners, called advisors, endeavour to solve the problem from a different focus. Their advice, taking the form of action values, is then communicated to an aggregator, which is in control of the system. We show that the local planning method for the advisors is critical and that none of the ones found in the literature is flawless: the egocentric planning overestimates values of states where the other advisors disagree, and the agnostic planning is inefficient around danger zones. We introduce a novel approach called empathic and discuss its theoretical aspects. We empirically examine and validate our theoretical findings on a fruit collection task.
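The abstract names an aggregator that combines the advisors' action values, plus three local planning methods. The following is a minimal sketch of one plausible reading, not the paper's implementation. It rests on assumptions not stated in the abstract: the aggregator sums the advisors' values and acts greedily, egocentric planning bootstraps on the advisor's own maximum, agnostic planning bootstraps on a uniform average over actions, and empathic planning bootstraps on the action the aggregator would choose. The names `aggregate`, `greedy_action`, and `bootstrap_value` are hypothetical.

```python
import numpy as np


def aggregate(advisor_q):
    """Combine advisors' action values of shape (n_advisors, n_actions).

    Assumption: the aggregator sums the advisors' values per action.
    """
    return np.asarray(advisor_q, dtype=float).sum(axis=0)


def greedy_action(advisor_q):
    """Action the aggregator would pick: argmax of the summed values."""
    return int(np.argmax(aggregate(advisor_q)))


def bootstrap_value(next_q, j, mode):
    """Next-state value advisor j bootstraps on, under an assumed planning mode.

    next_q: array of shape (n_advisors, n_actions) for the next state.
    """
    next_q = np.asarray(next_q, dtype=float)
    if mode == "egocentric":
        # Assumption: advisor j plans as if it alone controlled the system.
        return float(next_q[j].max())
    if mode == "agnostic":
        # Assumption: advisor j plans as if actions were chosen uniformly.
        return float(next_q[j].mean())
    if mode == "empathic":
        # Assumption: advisor j evaluates the aggregator's greedy action.
        return float(next_q[j, greedy_action(next_q)])
    raise ValueError(f"unknown planning mode: {mode}")


# Two advisors that disagree: each prefers a different action,
# while the aggregator prefers the compromise action 2.
next_q = np.array([[1.0, 0.0, 0.6],
                   [0.0, 1.0, 0.6]])

ego_total = sum(bootstrap_value(next_q, j, "egocentric") for j in range(2))
emp_total = sum(bootstrap_value(next_q, j, "empathic") for j in range(2))
```

In this toy state the egocentric values sum to 2.0, while the best aggregated action is worth only 1.2. That gap is the kind of overestimation the abstract attributes to states where advisors disagree; under the stated assumptions the empathic values sum to exactly the aggregated value of the greedy action.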

representative citing papers

On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning

cs.LG · 2019-07-01 · unverdicted · novelty 5.0

Landmark topological coverings derived from traversability metrics enable three transfer mechanisms with theoretical Q-value bounds in goal-based multi-task lifelong RL.

