A constrained optimization perspective on actor critic algorithms and application to network routing
classification
💻 cs.LG
math.OC
keywords
actor, algorithms, application, network, optimization, routing, actor-critic, algorithm
read the original abstract
We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routing application.
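For readers unfamiliar with the actor-critic setup the abstract refers to, the following is a minimal sketch of a generic one-step tabular actor-critic for a discounted-reward MDP. It is not the algorithm proposed in the paper: the actor here uses the standard policy-gradient direction scaled by the TD error, not the optimization-motivated descent direction the abstract mentions. The two-state toy MDP, the function names (`step`, `actor_critic`), and the step sizes are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (shifted by the max for stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Hypothetical toy MDP (an assumption, not from the paper):
    action 1 pays reward 1 and leads to state 1; action 0 pays 0 and leads to state 0."""
    return (1.0, 1) if action == 1 else (0.0, 0)


def actor_critic(num_steps=5000, gamma=0.9, critic_lr=0.1, actor_lr=0.05, seed=0):
    """Generic one-step actor-critic with a tabular critic and a softmax actor."""
    rng = random.Random(seed)
    values = [0.0, 0.0]                 # critic: one value estimate per state
    prefs = [[0.0, 0.0], [0.0, 0.0]]    # actor: one preference per (state, action)
    state = 0
    for _ in range(num_steps):
        probs = softmax(prefs[state])
        action = 1 if rng.random() < probs[1] else 0
        reward, next_state = step(state, action)

        # Critic: TD(0) update toward the one-step bootstrapped target.
        td_error = reward + gamma * values[next_state] - values[state]
        values[state] += critic_lr * td_error

        # Actor: move preferences along grad log pi(action | state), scaled by the TD error.
        for a in range(2):
            indicator = 1.0 if a == action else 0.0
            prefs[state][a] += actor_lr * td_error * (indicator - probs[a])

        state = next_state
    policy = [softmax(p) for p in prefs]
    return policy, values


if __name__ == "__main__":
    policy, values = actor_critic()
    print("policy:", policy)
    print("values:", values)
```

In this toy problem the better choice in both states is action 1, so the learned policy should place most of its probability there. Using a faster step size for the critic than for the actor follows the common two-timescale convention; the specific values are arbitrary.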