Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2019 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
The RL policy is decomposed into information-regularized primitives that compete by requesting different amounts of information about the state; the primitive that requests the most information acts. The paper reports better generalization than flat or hierarchical baselines.
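
The competition among primitives can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes each primitive has a Gaussian encoder over a latent variable and measures the "information requested" as the KL divergence from that encoding to a standard normal prior. The helper names (`kl_to_standard_normal`, `select_primitive`, `make_linear_encoder`) and the linear encoders are hypothetical, chosen only for illustration.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def select_primitive(state, encoders):
    """Return the index of the primitive that requests the most information,
    together with the per-primitive information costs."""
    costs = np.array([kl_to_standard_normal(*enc(state)) for enc in encoders])
    return int(np.argmax(costs)), costs

def make_linear_encoder(w_mu, w_log_var):
    """Hypothetical encoder: linear maps from the state to (mu, log_var)."""
    def encode(state):
        return w_mu @ state, w_log_var @ state
    return encode

state = np.array([1.0, 0.0])
encoders = [
    # Ignores the state entirely: the encoding equals the prior, so the cost is 0.
    make_linear_encoder(np.zeros((2, 2)), np.zeros((2, 2))),
    # Copies the state into the latent mean: the cost grows with the state norm.
    make_linear_encoder(np.eye(2), np.zeros((2, 2))),
]

winner, costs = select_primitive(state, encoders)
print(winner, costs)
```

In this toy setup the second primitive wins because it is the only one whose encoding depends on the state. A learned system would presumably use a soft, differentiable selection during training. The hard `argmax` here is a simplification.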