Stochastic Variance-Reduced Policy Gradient

2018 · cs.LG · arXiv 1806.05618

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

In this paper, we propose a novel reinforcement-learning algorithm consisting in a stochastic variance-reduced version of policy gradient for solving Markov Decision Processes (MDPs). Stochastic variance-reduced gradient (SVRG) methods have proven to be very successful in supervised learning. However, their adaptation to policy gradient is not straightforward and needs to account for I) a non-concave objective function; II) approximations in the full gradient computation; and III) a non-stationary sampling process. The result is SVRPG, a stochastic variance-reduced policy gradient algorithm that leverages on importance weights to preserve the unbiasedness of the gradient estimate. Under standard assumptions on the MDP, we provide convergence guarantees for SVRPG with a convergence rate that is linear under increasing batch sizes. Finally, we suggest practical variants of SVRPG, and we empirically evaluate them on continuous MDPs.
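
The idea the abstract describes (an SVRG-style gradient estimate whose correction term is reweighted by importance weights) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy one-step problem with a unit-variance Gaussian policy and a hypothetical reward `-(a - 2)^2`. All names (`reward`, `grad_estimate`, `importance_weight`, `svrpg_toy`) and all hyperparameters are illustrative choices, not taken from the paper.

```python
import numpy as np

def reward(a):
    # Hypothetical reward, maximized in expectation when the policy mean is 2.
    return -(a - 2.0) ** 2

def grad_estimate(a, theta):
    # Per-sample REINFORCE-style gradient for the policy a ~ N(theta, 1):
    # grad_theta log pi_theta(a) * reward(a) = (a - theta) * reward(a).
    return (a - theta) * reward(a)

def importance_weight(a, theta_t, theta_snap):
    # Density ratio p(a | theta_snap) / p(a | theta_t) for unit-variance Gaussians.
    return np.exp(0.5 * ((a - theta_t) ** 2 - (a - theta_snap) ** 2))

def svrpg_toy(theta0=0.0, epochs=30, inner_steps=5,
              n_snap=1000, batch=50, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta_snap = float(theta0)
    for _ in range(epochs):
        # Large-batch approximation of the "full" gradient at the snapshot.
        a_snap = rng.normal(theta_snap, 1.0, size=n_snap)
        mu = grad_estimate(a_snap, theta_snap).mean()
        theta = theta_snap
        for _ in range(inner_steps):
            # Samples come from the current policy, so the snapshot
            # gradient is reweighted to correct for the changed sampling
            # distribution.
            a = rng.normal(theta, 1.0, size=batch)
            w = importance_weight(a, theta, theta_snap)
            correction = grad_estimate(a, theta) - w * grad_estimate(a, theta_snap)
            v = mu + correction.mean()
            theta = theta + step * v  # gradient ascent on expected reward
        # Assumption of this sketch: the last inner iterate becomes the next snapshot.
        theta_snap = float(theta)
    return theta_snap

if __name__ == "__main__":
    print(svrpg_toy())
```

In this sketch the weight is the ratio of the snapshot policy's density to the current policy's density, so the reweighted snapshot term has the snapshot gradient as its expectation even though the actions are drawn from the current policy. On the first inner step the two policies coincide, the weights equal 1, and the update reduces to the large-batch snapshot gradient.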

representative citing papers

Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)

eess.SY · 2025-08-22 · unverdicted · novelty 6.0

YANN-RL initializes RL actor and critic networks with explicit multi-parametric linear MPC solutions via YANNs to start from linear optimal control performance and then learn nonlinear policies through online interaction.

citing papers explorer

Showing 1 of 1 citing paper.

Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs) eess.SY · 2025-08-22 · unverdicted · none · ref 67 · internal anchor
