arxiv: 1207.1411 · v1 · submitted 2012-07-04 · 💻 cs.GT · cs.AI

Recognition: unknown

Bayes' Bluff: Opponent Modelling in Poker

classification 💻 cs.GT cs.AI
keywords opponent poker demonstrate distribution dynamics game modelling playing

Poker is a challenging problem for artificial intelligence, with non-deterministic dynamics, partial observability, and the added difficulty of unknown adversaries. Modelling all of the uncertainties in this domain is not an easy task. In this paper we present a Bayesian probabilistic model for a broad class of poker games, separating the uncertainty in the game dynamics from the uncertainty of the opponent's strategy. We then describe approaches to two key subproblems: (i) inferring a posterior over opponent strategies given a prior distribution and observations of their play, and (ii) playing an appropriate response to that distribution. We demonstrate the overall approach on a reduced version of poker using Dirichlet priors and then on the full game of Texas hold'em using a more informed prior. We demonstrate methods for playing effective responses to the opponent, based on the posterior.
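
The posterior inference step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the opponent's behaviour at a single decision point is a fixed but unknown distribution over three actions with a Dirichlet prior; the action names, the uniform prior, and the observed sequence are hypothetical, and the paper's actual model is not reproduced on this page.

```python
from typing import Dict, List

# Hypothetical action set at one decision point (an assumption, not from the paper).
ACTIONS = ("fold", "call", "raise")


def update_dirichlet(prior: Dict[str, float], observed: List[str]) -> Dict[str, float]:
    """Conjugate update: add one pseudo-count per observed action."""
    posterior = dict(prior)
    for action in observed:
        posterior[action] += 1.0
    return posterior


def posterior_mean(alpha: Dict[str, float]) -> Dict[str, float]:
    """Expected action probabilities under a Dirichlet with parameters alpha."""
    total = sum(alpha.values())
    return {action: count / total for action, count in alpha.items()}


# Uniform Dirichlet(1, 1, 1) prior and a made-up sequence of observed actions.
prior = {action: 1.0 for action in ACTIONS}
observations = ["raise", "raise", "call", "fold", "raise"]

posterior = update_dirichlet(prior, observations)
mean = posterior_mean(posterior)
print(posterior)  # pseudo-counts after the update
print(mean)       # point estimate of the opponent's action frequencies
```

A response could then be chosen against `mean`, or against strategies sampled from the posterior; which of these the paper uses is not stated in the abstract.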

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. On-line Learning in Tree MDPs by Treating Policies as Bandit Arms

    cs.AI 2026-05 unverdicted novelty 7.0

    Bandit algorithms can be adapted to Tree MDPs by treating policies as arms with shared-data confidence bounds, achieving polynomial memory and instance-dependent bounds on sample complexity and regret that depend on t...

  2. AlphaExploitem: Going Beyond the Nash Equilibrium in Poker by Learning to Exploit Suboptimal Play

    cs.LG 2026-05 unverdicted novelty 5.0

    AlphaExploitem adds a hierarchical transformer encoder and a diverse pool of exploitable opponents to AlphaHoldem, enabling exploitation of suboptimal poker play while preserving performance against Nash-equilibrium o...