Neuronlike adaptive elements that can solve difficult learning control problems
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2015 (1)
Verdicts: ACCEPT (1)
Representative citing papers
High-Dimensional Continuous Control Using Generalized Advantage Estimation
Generalized advantage estimation combined with trust region optimization enables stable neural network policy learning for complex continuous control from raw kinematics.