Domination-Avoiding Learning Agents Cannot Collude

Emmanuel Zerah; Noam Nisan

arxiv: 2606.01275 · v1 · pith:Y63VQBGAnew · submitted 2026-05-31 · 💻 cs.GT

Domination-Avoiding Learning Agents Cannot Collude

Noam Nisan , Emmanuel Zerah This is my paper

Pith reviewed 2026-06-28 16:21 UTC · model grok-4.3

classification 💻 cs.GT

keywords domination-avoiding agentscollusionregret minimizationpricing gamesiterated dominancemean-based learninginternal regretgame theory

0 comments

The pith

Domination-avoiding agents provably cannot collude when selling in repeated pricing games.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper first shows that external-regret-minimizing agents can collude in the standard market model. It then defines a broad class called domination-avoiding agents and proves that every agent in this class refuses to collude. The same class also refuses to play any strategy that survives only by repeated elimination of strictly dominated strategies. The class contains mean-based learners, internal-regret minimizers, and multiplicative-weights agents with variable rates.

Core claim

We identify a very general class of agents, which we term Domination-Avoiding agents, that provably do not collude in such markets. This class contains all Mean-Based agents and all internal-regret-minimizing agents, as well as others such as Multiplicative-Weight agents with variable learning rate and contextual variants thereof. More generally we show that, in any game, this class of agents is guaranteed to jointly learn to almost never play strategies that are eliminated by repeated elimination of purely dominated strategies.

What carries the argument

Domination-Avoiding agents: the property that an agent's sequence of play asymptotically places zero probability on any strategy eliminated by iterated strict dominance.

If this is right

External-regret-minimizing agents can collude in the pricing market.
Mean-based agents never collude.
Internal-regret-minimizing agents never collude.
Multiplicative-weights agents with variable learning rates never collude.
In every finite game the agents jointly avoid all strategies removed by iterated elimination of strictly dominated strategies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Market designers could require pricing algorithms to be internal-regret minimizers rather than external-regret minimizers to block collusion by construction.
The separation between external and internal regret offers a concrete test for whether observed algorithmic collusion is driven by the regret notion rather than the market structure.
The iterated-dominance guarantee may apply directly to other repeated games in which collusion is modeled as play of dominated joint strategies.

Load-bearing premise

The agent's update rule satisfies the domination-avoiding property that rules out play of iteratively dominated strategies.

What would settle it

An explicit construction or simulation in which a domination-avoiding agent places positive probability on an iteratively dominated strategy in the limit would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.01275 by Emmanuel Zerah, Noam Nisan.

read the original abstract

An influential paper of Calvano et al. empirically demonstrated that Q-learning agents spontaneously collude when placed as sellers that compete on prices in a natural market model. More recent results of Fish et al. empirically demonstrated that similar collusion happens with commercial LLMs. We formally prove that such collusion can also happen with external-regret-minimizing agents. We identify a very general class of agents, which we term Domination-Avoiding agents, that provably do not collude in such markets. This class contains all Mean-Based agents and all internal-regret-minimizing agents, as well as others such as Multiplicative-Weight agents with variable learning rate and contextual variants thereof. More generally we show that, in any game, this class of agents is guaranteed to jointly learn to almost never play strategies that are eliminated by repeated elimination of purely dominated strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper proves Domination-Avoiding agents avoid IESDS strategies in any game and claims this blocks collusion, but the market application may need an extra step since collusive prices often survive IESDS in Bertrand settings.

read the letter

The core contribution is a general theorem: Domination-Avoiding agents, which include all mean-based learners and internal-regret minimizers, jointly play strategies eliminated by iterated elimination of strictly dominated strategies with probability approaching zero in any finite game. The authors also show that plain external-regret minimizers can still collude in the Calvano-style pricing market. This is a clean formal extension beyond the usual regret bounds.

The general result on dominated strategies looks like the strongest part. It gives a broader class than prior work and applies to any game, not just the market model. That is useful on its own.

The soft spot is the jump from the general theorem to the specific claim that these agents cannot collude in the pricing game. In standard discrete Bertrand models the monopoly price is rationalizable, so it survives IESDS. If the non-collusion proof simply invokes the general theorem without showing that any sustained collusive path must use an eliminated strategy, or without a separate market argument, then the implication does not go through. The abstract does not spell out that extra step, and the stress-test concern lands on the material given.

This is for people working on learning dynamics in games or on design criteria for AI pricing agents. The formal guarantee is worth a serious referee even if the market conclusion needs tightening or an additional lemma. I would send it to review.

Referee Report

1 major / 0 minor

Summary. The manuscript proves that external-regret-minimizing agents can collude in a Bertrand-style pricing market. It defines a broader class of Domination-Avoiding agents (containing all mean-based agents and all internal-regret minimizers, plus Multiplicative-Weights with variable rates and contextual variants) that provably do not collude in such markets. Separately, it shows that in any game these agents jointly play strategies eliminated by iterated elimination of strictly dominated strategies (IESDS) with probability approaching zero.

Significance. If the proofs hold, the result supplies a clean theoretical demarcation between classes of no-regret learners that can and cannot sustain collusion, directly addressing the empirical collusion observed with Q-learning and LLMs. The general game-theoretic guarantee on avoidance of IESDS strategies is a potentially useful contribution to the literature on learning in games.

major comments (1)

[market non-collusion argument (following the general theorem)] The central non-collusion claim for the market model rests on the general IESDS-avoidance theorem. In standard discrete Bertrand pricing games the monopoly (collusive) price is typically rationalizable and survives IESDS. The manuscript must therefore contain an explicit additional step showing that any sustained collusive path requires positive probability on an IESDS-eliminated action; if the market result is obtained simply by invoking the general theorem, the implication is incomplete.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and for identifying a point that requires clarification in the connection between our general result and the market application.

read point-by-point responses

Referee: [market non-collusion argument (following the general theorem)] The central non-collusion claim for the market model rests on the general IESDS-avoidance theorem. In standard discrete Bertrand pricing games the monopoly (collusive) price is typically rationalizable and survives IESDS. The manuscript must therefore contain an explicit additional step showing that any sustained collusive path requires positive probability on an IESDS-eliminated action; if the market result is obtained simply by invoking the general theorem, the implication is incomplete.

Authors: We agree that the general IESDS-avoidance theorem by itself does not rule out collusion if collusive prices survive IESDS. The manuscript's market non-collusion claim therefore requires an explicit additional argument establishing that any sustained collusive path in the discrete Bertrand model places positive limiting probability on an IESDS-eliminated action. We will add this step as a dedicated lemma in the market section of the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No significant circularity; formal definition and proof chain is self-contained

full rationale

The paper defines a new class (Domination-Avoiding agents) and states two theorems: (1) external-regret minimizers can collude and (2) the larger class provably avoids IESDS-eliminated strategies in any game (hence does not collude in the market model). These are presented as formal results derived from the definition rather than from any fitted parameter, self-referential renaming, or load-bearing self-citation chain. No equation or step reduces the claimed non-collusion result to the input definition by construction; the derivation remains a standard mathematical argument from stated assumptions to conclusions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The result rests on standard repeated-game assumptions plus the newly introduced Domination-Avoiding property; no free parameters or invented physical entities appear in the abstract.

axioms (1)

standard math Standard assumptions of repeated normal-form games and regret minimization from prior literature
The non-collusion claim is stated for the market model of Calvano et al. and for any game.

invented entities (1)

Domination-Avoiding agents no independent evidence
purpose: Class of agents guaranteed not to collude or play iteratively dominated strategies
Newly defined class whose properties are proved in the paper; no independent evidence outside the definition is mentioned.

pith-pipeline@v0.9.1-grok · 5668 in / 1192 out tokens · 27959 ms · 2026-06-28T16:21:09.819339+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

30 extracted references · 6 canonical work pages · 1 internal anchor

[1]

Gagan Aggarwal, Ashwinkumar Badanidiyuru, Santiago R. Balseiro, Kshipra Bhawalkar, Yuan Deng, Zhe Feng, Gagan Goel, Christopher Liaw, Haihao Lu, Mohammad Mahdian, Jieming 9 Mao, Aranyak Mehta, Vahab Mirrokni, Renato Paes Leme, Andres Perlroth, Georgios Piliouras, Jon Schneider, Ariel Schvartzman, Balasubramanian Sivan, Kelly Spendlove, Yifeng Teng, Di Wan...

2024
[2]

Cambridge University Press, 2015

Álvaro Cartea, Sebastian Jaimungal, and José Penalva.Algorithmic and high-frequency trading. Cambridge University Press, 2015

2015
[3]

Algorithmic pricing and competition: Empirical evidence from the german retail gasoline market.Journal of Political Economy, 132(3):723–771, 2024

Stephanie Assad, Robert Clark, Daniel Ershov, and Lei Xu. Algorithmic pricing and competition: Empirical evidence from the german retail gasoline market.Journal of Political Economy, 132(3):723–771, 2024

2024
[4]

An empirical analysis of algorithmic pricing on amazon marketplace

Le Chen, Alan Mislove, and Christo Wilson. An empirical analysis of algorithmic pricing on amazon marketplace. InProceedings of the 25th international conference on World Wide Web, pages 1339–1349, 2016

2016
[5]

Chaos in auto- bidding auctions.arXiv preprint arXiv:2602.09118, 2026

Ioannis Anagnostides, Ian Gemp, Georgios Piliouras, and Kelly Spendlove. Chaos in auto- bidding auctions.arXiv preprint arXiv:2602.09118, 2026. https://arxiv.org/abs/2602. 09118

work page arXiv 2026
[6]

Agents of Chaos

Natalie Shapira, Chris Wendler, Avery Yen, Gabriele Sarti, Koyena Pal, Olivia Floody, Adam Belfki, Alex Loftus, Aditya Ratan Jannali, Nikhil Prakash, Jasmine Cui, Giordano Rogers, Jannik Brinkmann, Can Rager, Amir Zur, Michael Ripa, Aruna Sankaranarayanan, David Atkinson, Rohit Gandikota, Jaden Fiotto-Kaufman, EunJeong Hwang, Hadas Orgad, P Sam Sahil, Neg...

work page internal anchor Pith review Pith/arXiv arXiv 2026
[7]

Artificial intelligence, algorithmic pricing, and collusion.American Economic Review, 110(10):3267– 3297, 2020

Emilio Calvano, Giacomo Calzolari, Vincenzo Denicolò, and Sergio Pastorello. Artificial intelligence, algorithmic pricing, and collusion.American Economic Review, 110(10):3267– 3297, 2020

2020
[8]

Q-learning.Machine learning, 8(3-4):279–292, 1992

Christopher JCH Watkins and Peter Dayan. Q-learning.Machine learning, 8(3-4):279–292, 1992

1992
[9]

MIT press, second edition, 2018

Richard S Sutton and Andrew G Barto.Reinforcement learning: An introduction. MIT press, second edition, 2018

2018
[10]

MIT Press, Cambridge, MA, 1991

Drew Fudenberg and Jean Tirole.Game Theory. MIT Press, Cambridge, MA, 1991

1991
[11]

Osborne and Ariel Rubinstein.A Course in Game Theory

Martin J. Osborne and Ariel Rubinstein.A Course in Game Theory. MIT Press, Cambridge, MA, 1994

1994
[12]

Gonczarowski, and Ran I

Sara Fish, Yannai A. Gonczarowski, and Shohini Kundu. Algorithmic collusion by large language models.arXiv preprint arXiv:2404.00806, 2024. https://arxiv.org/abs/2404. 00806

work page arXiv 2024
[13]

Hartline, Sheng Long, and Chenhao Zhang

Jason D. Hartline, Sheng Long, and Chenhao Zhang. Regulation of algorithmic collusion. In Proceedings of the 4th ACM Symposium on Computer Science and Law (CSLaw). Association for Computing Machinery, 2024

2024
[14]

Cambridge university press, 2006

Nicolò Cesa-Bianchi and Gábor Lugosi.Prediction, learning, and games. Cambridge university press, 2006

2006
[15]

The economics of no-regret learning algorithms.arXiv preprint arXiv:2601.22079, 2026.https://arxiv.org/abs/2601.22079

Jason Hartline. The economics of no-regret learning algorithms.arXiv preprint arXiv:2601.22079, 2026.https://arxiv.org/abs/2601.22079

work page arXiv 2026
[16]

No regret learning in oligopolies: Cournot vs

Uri Nadav and Georgios Piliouras. No regret learning in oligopolies: Cournot vs. Bertrand. In International Symposium on Algorithmic Game Theory, pages 300–311. Springer, 2010

2010
[17]

Auctions between regret-minimizing agents

Yoav Kolumbus and Noam Nisan. Auctions between regret-minimizing agents. InProceedings of the ACM Web Conference 2022, WWW ’22, page 100–111, New York, NY , USA, 2022. Association for Computing Machinery. 10

2022
[18]

Correlated and coarse equilibria of single- item auctions

Michal Feldman, Brendan Lucier, and Noam Nisan. Correlated and coarse equilibria of single- item auctions. In Yang Cai and Adrian Vetta, editors,Web and Internet Economics, pages 131–144, 2016

2016
[19]

Algorithmic collusion without threats.arXiv preprint arXiv:2409.03956, 2024

Eshwar Ram Arunachaleswaran, Natalie Collina, Sampath Kannan, Aaron Roth, and Juba Ziani. Algorithmic collusion without threats.arXiv preprint arXiv:2409.03956, 2024. https: //arxiv.org/abs/2409.03956

work page arXiv 2024
[20]

Matthew Weinberg

Mark Braverman, Jieming Mao, Jon Schneider, and S. Matthew Weinberg. Selling to a no-regret buyer. InProceedings of the 19th ACM Conference on Economics and Computation (EC), pages 523–538. Association for Computing Machinery, 2018

2018
[21]

Nash convergence of mean-based learning algorithms in first price auctions

Xiaotie Deng, Xinyan Hu, Tao Lin, and Weiqiang Zheng. Nash convergence of mean-based learning algorithms in first price auctions. InProceedings of the ACM Web Conference 2022, WWW ’22, page 141–150, New York, NY , USA, 2022. Association for Computing Machinery

2022
[22]

Conver- gence analysis of no-regret bidding algorithms in repeated auctions.Proceedings of the AAAI Conference on Artificial Intelligence, 35(6):5399–5406, May 2021

Zhe Feng, Guru Guruganesh, Christopher Liaw, Aranyak Mehta, and Abhishek Sethi. Conver- gence analysis of no-regret bidding algorithms in repeated auctions.Proceedings of the AAAI Conference on Artificial Intelligence, 35(6):5399–5406, May 2021

2021
[23]

Online optimization algorithms in repeated price competition: Equilibrium learning and algorithmic collusion.arXiv preprint arXiv:2412.15707, 2024.https://arxiv.org/abs/2412.15707

Martin Bichler, Julius Durmann, and Matthias Oberlechner. Online optimization algorithms in repeated price competition: Equilibrium learning and algorithmic collusion.arXiv preprint arXiv:2412.15707, 2024.https://arxiv.org/abs/2412.15707

work page arXiv 2024
[24]

MIT press, 1997

Jörgen W Weibull.Evolutionary game theory. MIT press, 1997

1997
[25]

Strategizing against no-regret learners

Yuan Deng, Jon Schneider, and Balasubramanian Sivan. Strategizing against no-regret learners. Advances in neural information processing systems, 32, 2019

2019
[26]

Regret analysis of stochastic and nonstochastic multi-armed bandit problems.Foundations and Trends in Machine Learning, 5(1):1–122, 2012

Sébastien Bubeck and Nicolò Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems.Foundations and Trends in Machine Learning, 5(1):1–122, 2012

2012
[27]

Attention in delay of gratification.Journal of Personality and Social Psychology, 16(2):329–337, 1970

Walter Mischel and Ebbe B Ebbesen. Attention in delay of gratification.Journal of Personality and Social Psychology, 16(2):329–337, 1970

1970
[28]

Doubling Trick

Simon P Anderson, André De Palma, and Jacques-François Thisse.Discrete choice theory of product differentiation. MIT press, 1992. A Classes of Domination-Avoiding Algorithms We establish the DA property for four broad classes of algorithms: Mean-Based dynamics (in- cluding finite-horizon variants via the Doubling Trick), the standard infinite-horizon Mult...

1992
[29]

A.2 Multiplicative Weights with Variable Learning Rate The Multiplicative Weights (MW) algorithm (Algorithm 1) is a cornerstone of online learning

Thus: UTA({dt(δ)})≤ ϵ 4 + ϵ 4 = ϵ 2 < ϵ This satisfies the Domination-Avoiding condition. A.2 Multiplicative Weights with Variable Learning Rate The Multiplicative Weights (MW) algorithm (Algorithm 1) is a cornerstone of online learning. While MW with a constant learning rate is Mean-Based and regret-minimizing in finite-horizon settings, it fails to achi...
[30]

Because the threshold boundaries expand symmetrically, p∗ inherently approaches the exact midpoint of the surviving interval. Consequently, for sufficiently dense grids, the maximum distance from the equilibrium to any surviving price tightens to a single ϵ, asymptotically confining the IESPDS set to[p ∗ −ϵ, p ∗ +ϵ]. We now provide the deferred proofs for...

[1] [1]

Gagan Aggarwal, Ashwinkumar Badanidiyuru, Santiago R. Balseiro, Kshipra Bhawalkar, Yuan Deng, Zhe Feng, Gagan Goel, Christopher Liaw, Haihao Lu, Mohammad Mahdian, Jieming 9 Mao, Aranyak Mehta, Vahab Mirrokni, Renato Paes Leme, Andres Perlroth, Georgios Piliouras, Jon Schneider, Ariel Schvartzman, Balasubramanian Sivan, Kelly Spendlove, Yifeng Teng, Di Wan...

2024

[2] [2]

Cambridge University Press, 2015

Álvaro Cartea, Sebastian Jaimungal, and José Penalva.Algorithmic and high-frequency trading. Cambridge University Press, 2015

2015

[3] [3]

Algorithmic pricing and competition: Empirical evidence from the german retail gasoline market.Journal of Political Economy, 132(3):723–771, 2024

Stephanie Assad, Robert Clark, Daniel Ershov, and Lei Xu. Algorithmic pricing and competition: Empirical evidence from the german retail gasoline market.Journal of Political Economy, 132(3):723–771, 2024

2024

[4] [4]

An empirical analysis of algorithmic pricing on amazon marketplace

Le Chen, Alan Mislove, and Christo Wilson. An empirical analysis of algorithmic pricing on amazon marketplace. InProceedings of the 25th international conference on World Wide Web, pages 1339–1349, 2016

2016

[5] [5]

Chaos in auto- bidding auctions.arXiv preprint arXiv:2602.09118, 2026

Ioannis Anagnostides, Ian Gemp, Georgios Piliouras, and Kelly Spendlove. Chaos in auto- bidding auctions.arXiv preprint arXiv:2602.09118, 2026. https://arxiv.org/abs/2602. 09118

work page arXiv 2026

[6] [6]

Agents of Chaos

Natalie Shapira, Chris Wendler, Avery Yen, Gabriele Sarti, Koyena Pal, Olivia Floody, Adam Belfki, Alex Loftus, Aditya Ratan Jannali, Nikhil Prakash, Jasmine Cui, Giordano Rogers, Jannik Brinkmann, Can Rager, Amir Zur, Michael Ripa, Aruna Sankaranarayanan, David Atkinson, Rohit Gandikota, Jaden Fiotto-Kaufman, EunJeong Hwang, Hadas Orgad, P Sam Sahil, Neg...

work page internal anchor Pith review Pith/arXiv arXiv 2026

[7] [7]

Artificial intelligence, algorithmic pricing, and collusion.American Economic Review, 110(10):3267– 3297, 2020

Emilio Calvano, Giacomo Calzolari, Vincenzo Denicolò, and Sergio Pastorello. Artificial intelligence, algorithmic pricing, and collusion.American Economic Review, 110(10):3267– 3297, 2020

2020

[8] [8]

Q-learning.Machine learning, 8(3-4):279–292, 1992

Christopher JCH Watkins and Peter Dayan. Q-learning.Machine learning, 8(3-4):279–292, 1992

1992

[9] [9]

MIT press, second edition, 2018

Richard S Sutton and Andrew G Barto.Reinforcement learning: An introduction. MIT press, second edition, 2018

2018

[10] [10]

MIT Press, Cambridge, MA, 1991

Drew Fudenberg and Jean Tirole.Game Theory. MIT Press, Cambridge, MA, 1991

1991

[11] [11]

Osborne and Ariel Rubinstein.A Course in Game Theory

Martin J. Osborne and Ariel Rubinstein.A Course in Game Theory. MIT Press, Cambridge, MA, 1994

1994

[12] [12]

Gonczarowski, and Ran I

Sara Fish, Yannai A. Gonczarowski, and Shohini Kundu. Algorithmic collusion by large language models.arXiv preprint arXiv:2404.00806, 2024. https://arxiv.org/abs/2404. 00806

work page arXiv 2024

[13] [13]

Hartline, Sheng Long, and Chenhao Zhang

Jason D. Hartline, Sheng Long, and Chenhao Zhang. Regulation of algorithmic collusion. In Proceedings of the 4th ACM Symposium on Computer Science and Law (CSLaw). Association for Computing Machinery, 2024

2024

[14] [14]

Cambridge university press, 2006

Nicolò Cesa-Bianchi and Gábor Lugosi.Prediction, learning, and games. Cambridge university press, 2006

2006

[15] [15]

The economics of no-regret learning algorithms.arXiv preprint arXiv:2601.22079, 2026.https://arxiv.org/abs/2601.22079

Jason Hartline. The economics of no-regret learning algorithms.arXiv preprint arXiv:2601.22079, 2026.https://arxiv.org/abs/2601.22079

work page arXiv 2026

[16] [16]

No regret learning in oligopolies: Cournot vs

Uri Nadav and Georgios Piliouras. No regret learning in oligopolies: Cournot vs. Bertrand. In International Symposium on Algorithmic Game Theory, pages 300–311. Springer, 2010

2010

[17] [17]

Auctions between regret-minimizing agents

Yoav Kolumbus and Noam Nisan. Auctions between regret-minimizing agents. InProceedings of the ACM Web Conference 2022, WWW ’22, page 100–111, New York, NY , USA, 2022. Association for Computing Machinery. 10

2022

[18] [18]

Correlated and coarse equilibria of single- item auctions

Michal Feldman, Brendan Lucier, and Noam Nisan. Correlated and coarse equilibria of single- item auctions. In Yang Cai and Adrian Vetta, editors,Web and Internet Economics, pages 131–144, 2016

2016

[19] [19]

Algorithmic collusion without threats.arXiv preprint arXiv:2409.03956, 2024

Eshwar Ram Arunachaleswaran, Natalie Collina, Sampath Kannan, Aaron Roth, and Juba Ziani. Algorithmic collusion without threats.arXiv preprint arXiv:2409.03956, 2024. https: //arxiv.org/abs/2409.03956

work page arXiv 2024

[20] [20]

Matthew Weinberg

Mark Braverman, Jieming Mao, Jon Schneider, and S. Matthew Weinberg. Selling to a no-regret buyer. InProceedings of the 19th ACM Conference on Economics and Computation (EC), pages 523–538. Association for Computing Machinery, 2018

2018

[21] [21]

Nash convergence of mean-based learning algorithms in first price auctions

Xiaotie Deng, Xinyan Hu, Tao Lin, and Weiqiang Zheng. Nash convergence of mean-based learning algorithms in first price auctions. InProceedings of the ACM Web Conference 2022, WWW ’22, page 141–150, New York, NY , USA, 2022. Association for Computing Machinery

2022

[22] [22]

Conver- gence analysis of no-regret bidding algorithms in repeated auctions.Proceedings of the AAAI Conference on Artificial Intelligence, 35(6):5399–5406, May 2021

Zhe Feng, Guru Guruganesh, Christopher Liaw, Aranyak Mehta, and Abhishek Sethi. Conver- gence analysis of no-regret bidding algorithms in repeated auctions.Proceedings of the AAAI Conference on Artificial Intelligence, 35(6):5399–5406, May 2021

2021

[23] [23]

Online optimization algorithms in repeated price competition: Equilibrium learning and algorithmic collusion.arXiv preprint arXiv:2412.15707, 2024.https://arxiv.org/abs/2412.15707

Martin Bichler, Julius Durmann, and Matthias Oberlechner. Online optimization algorithms in repeated price competition: Equilibrium learning and algorithmic collusion.arXiv preprint arXiv:2412.15707, 2024.https://arxiv.org/abs/2412.15707

work page arXiv 2024

[24] [24]

MIT press, 1997

Jörgen W Weibull.Evolutionary game theory. MIT press, 1997

1997

[25] [25]

Strategizing against no-regret learners

Yuan Deng, Jon Schneider, and Balasubramanian Sivan. Strategizing against no-regret learners. Advances in neural information processing systems, 32, 2019

2019

[26] [26]

Regret analysis of stochastic and nonstochastic multi-armed bandit problems.Foundations and Trends in Machine Learning, 5(1):1–122, 2012

Sébastien Bubeck and Nicolò Cesa-Bianchi. Regret analysis of stochastic and nonstochastic multi-armed bandit problems.Foundations and Trends in Machine Learning, 5(1):1–122, 2012

2012

[27] [27]

Attention in delay of gratification.Journal of Personality and Social Psychology, 16(2):329–337, 1970

Walter Mischel and Ebbe B Ebbesen. Attention in delay of gratification.Journal of Personality and Social Psychology, 16(2):329–337, 1970

1970

[28] [28]

Doubling Trick

Simon P Anderson, André De Palma, and Jacques-François Thisse.Discrete choice theory of product differentiation. MIT press, 1992. A Classes of Domination-Avoiding Algorithms We establish the DA property for four broad classes of algorithms: Mean-Based dynamics (in- cluding finite-horizon variants via the Doubling Trick), the standard infinite-horizon Mult...

1992

[29] [29]

A.2 Multiplicative Weights with Variable Learning Rate The Multiplicative Weights (MW) algorithm (Algorithm 1) is a cornerstone of online learning

Thus: UTA({dt(δ)})≤ ϵ 4 + ϵ 4 = ϵ 2 < ϵ This satisfies the Domination-Avoiding condition. A.2 Multiplicative Weights with Variable Learning Rate The Multiplicative Weights (MW) algorithm (Algorithm 1) is a cornerstone of online learning. While MW with a constant learning rate is Mean-Based and regret-minimizing in finite-horizon settings, it fails to achi...

[30] [30]

Because the threshold boundaries expand symmetrically, p∗ inherently approaches the exact midpoint of the surviving interval. Consequently, for sufficiently dense grids, the maximum distance from the equilibrium to any surviving price tightens to a single ϵ, asymptotically confining the IESPDS set to[p ∗ −ϵ, p ∗ +ϵ]. We now provide the deferred proofs for...