pith. machine review for the scientific record. sign in

arxiv: 1411.7974 · v2 · submitted 2014-11-28 · 💻 cs.AI · cs.GT· cs.MA

Recognition: unknown

Solving Games with Functional Regret Estimation

Authors on Pith no claims yet
classification 💻 cs.AI cs.GTcs.MA
keywords regretabstractionfunctiongamesmethodalgorithmapproachapproximator
0
0 comments X
read the original abstract

We propose a novel online learning method for minimizing regret in large extensive-form games. The approach learns a function approximator online to estimate the regret for choosing a particular action. A no-regret algorithm uses these estimates in place of the true regrets to define a sequence of policies. We prove the approach sound by providing a bound relating the quality of the function approximation and regret of the algorithm. A corollary being that the method is guaranteed to converge to a Nash equilibrium in self-play so long as the regrets are ultimately realizable by the function approximator. Our technique can be understood as a principled generalization of existing work on abstraction in large games; in our work, both the abstraction as well as the equilibrium are learned during self-play. We demonstrate empirically the method achieves higher quality strategies than state-of-the-art abstraction techniques given the same resources.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.