Markets with Heterogeneous Agents: Dynamics and Survival of Bayesian vs. No-Regret Learners

David Easley; Eva Tardos; Yoav Kolumbus

arXiv: 2502.08597 · v3 · submitted 2025-02-12 · cs.GT · cs.AI · cs.MA · econ.TH

Pith reviewed 2026-05-23 03:52 UTC · model grok-4.3

keywords: Bayesian learning, no-regret learning, market survival, asset markets, heterogeneous agents, regret minimization, hybrid strategies, wealth dynamics

The pith

Bayesian learners can drive no-regret learners out of markets despite the latter achieving logarithmic regret.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies competing learning agents in markets where asset payoffs are stochastic. It shows how ideas from regret minimization in online learning relate to which agents survive and dominate in economic market selection. The central result is that low regret is not sufficient for survival when facing a Bayesian learner whose prior assigns positive probability to the true model. Bayesian methods are fragile to incorrect models or changes in the environment, whereas no-regret methods are more robust but may not exploit the environment as effectively when the model is known. The work also introduces hybrid approaches that aim to combine the strengths of both.

Core claim

The paper establishes that in asset markets, an agent's long-run survival is governed by its relative performance in predicting payoffs compared to others. Surprisingly, no-regret learners can be eliminated even when they attain logarithmic regret bounds if pitted against Bayesian learners with finite priors that include the correct payoff-generating process. While Bayesian learning excels when the prior is accurate, it is vulnerable to misspecification, making no-regret learning more adaptable to shifts in distributions.

What carries the argument

The market selection mechanism: wealth shares are updated by realized payoffs, so survival amounts to out-predicting competitors over repeated stochastic rounds.
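
As a rough illustration, the mechanism can be written out under an assumed Kelly-style market in Arrow securities, in the spirit of the Blume and Easley line of work. This page does not reproduce the paper's equations, so the symbols $w_{n,t}$, $\alpha_{n,t}$, and $p_t$ below are illustrative choices rather than the authors' notation.

```latex
% Assumed notation (not taken from the paper):
%   w_{n,t}        wealth share of agent n at round t (shares sum to 1)
%   \alpha_{n,t}(s) fraction of wealth agent n bets on state s
%   s_t            state realized at round t
\begin{align*}
p_t(s) &= \sum_{n} w_{n,t}\,\alpha_{n,t}(s)
  && \text{(market-clearing price of state } s\text{)} \\
w_{n,t+1} &= w_{n,t}\,\frac{\alpha_{n,t}(s_t)}{p_t(s_t)}
  && \text{(wealth update after } s_t \text{ is realized)} \\
\log\frac{w_{n,T}}{w_{m,T}} &= \log\frac{w_{n,0}}{w_{m,0}}
  + \sum_{t=0}^{T-1} \log\frac{\alpha_{n,t}(s_t)}{\alpha_{m,t}(s_t)}
  && \text{(prices cancel in the ratio)}
\end{align*}
```

Under these assumptions the wealth ratio of two agents depends only on their cumulative log-scores on the realized states, which is the sense in which survival reduces to out-predicting competitors.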

If this is right

Regret minimization alone does not ensure positive long-run market share against informed Bayesian agents.
Bayesian learners whose priors include the true model dominate, but they are fragile under misspecification or distribution shift.
Hybrid strategies that blend Bayesian updates with no-regret elements provide improved robustness.
No-regret learning requires less environment knowledge than full Bayesian approaches.
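
The fragility point can be made concrete with a small calculation. The sketch below assumes i.i.d. binary states and a betting market in which each agent stakes wealth in proportion to its predicted probabilities. Neither assumption is taken from the paper's text, and the function names are hypothetical. In that setting a Bayesian with a finite prior settles on the model closest to the truth in relative entropy, and its expected per-round log-wealth gap to an agent betting the true probabilities is that divergence.

```python
import math

def kl_bernoulli(q, p):
    """Relative entropy KL(q || p) between Bernoulli(q) and Bernoulli(p), in nats.

    Assumes 0 < p < 1 so that both logarithms are finite.
    """
    total = 0.0
    for a, b in ((q, p), (1.0 - q, 1.0 - p)):
        if a > 0.0:
            total += a * math.log(a / b)
    return total

def bayesian_growth_gap(q_true, prior_support):
    """Long-run expected per-round log-wealth gap to an agent betting q_true.

    Illustrative assumption: states are i.i.d., so the posterior over a
    finite support concentrates on the KL-closest model.
    """
    return min(kl_bernoulli(q_true, p) for p in prior_support)

print(bayesian_growth_gap(0.7, [0.3, 0.5, 0.7]))  # truth is in the prior
print(bayesian_growth_gap(0.7, [0.3, 0.5]))       # truth is missing from the prior
```

A strictly positive gap means the Bayesian loses log-wealth linearly in the number of rounds, whereas a learner with sublinear regret loses only sublinearly, so in this toy setting the misspecified Bayesian is eventually overtaken.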

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

This result suggests that in uncertain or changing markets, agents might benefit from prioritizing robustness over precise Bayesian inference.
The unification of regret and survival concepts could extend to algorithmic trading environments where learning types compete over finite horizons.
Varying the support size of the Bayesian prior in simulations could show whether there are thresholds at which logarithmic regret becomes sufficient for survival.

Load-bearing premise

Market survival depends on relative wealth growth determined by prediction accuracy against a Bayesian competitor whose prior includes the true model.

What would settle it

A market simulation or observation in which a logarithmic-regret agent maintains positive wealth share indefinitely against a Bayesian learner with the true model in its finite prior.
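
One way to set up such a simulation is sketched below. It is a toy under stated assumptions, not the paper's experiment. It uses two states and two agents who bet wealth in proportion to their predicted probabilities, a Bayesian with a uniform prior over the hypothetical support `(0.3, 0.5, 0.7)`, and Laplace's add-one rule standing in for a logarithmic-regret learner. The add-one rule's log-loss regret against the best fixed prediction is at most log(T + 1), but the paper's regret notion and algorithms may differ.

```python
import random

def simulate_market(states, prior_support=(0.3, 0.5, 0.7)):
    """Two-agent, two-state betting market (hypothetical setup, not the paper's code).

    Each round both agents bet wealth in proportion to their predicted
    probabilities; an agent's wealth share is multiplied by its bet on the
    realized state divided by the wealth-weighted average bet.
    Returns the path of the no-regret agent's wealth share.
    """
    posterior = [1.0 / len(prior_support)] * len(prior_support)  # uniform prior
    ones = 0          # number of state-1 realizations seen so far
    share_nr = 0.5    # no-regret agent's share; the Bayesian holds the rest
    path = [share_nr]
    for t, s in enumerate(states):
        # Bayesian agent: posterior-predictive probability of state 1.
        p_bayes = sum(w * q for w, q in zip(posterior, prior_support))
        # No-regret stand-in: Laplace's add-one estimate of P(state = 1).
        p_nr = (ones + 1) / (t + 2)
        bet_bayes = p_bayes if s == 1 else 1.0 - p_bayes
        bet_nr = p_nr if s == 1 else 1.0 - p_nr
        # Market clearing: price of the realized state is the wealth-weighted bet.
        price = share_nr * bet_nr + (1.0 - share_nr) * bet_bayes
        share_nr = share_nr * bet_nr / price
        path.append(share_nr)
        # Bayes update of the finite posterior, renormalized every round.
        likes = [q if s == 1 else 1.0 - q for q in prior_support]
        posterior = [w * l for w, l in zip(posterior, likes)]
        z = sum(posterior)
        posterior = [w / z for w in posterior]
        ones += s
    return path

rng = random.Random(0)
states = [1 if rng.random() < 0.7 else 0 for _ in range(20000)]
path = simulate_market(states)
print(f"no-regret wealth share after {len(states)} rounds: {path[-1]:.4f}")
```

In this toy the Bayesian's log-wealth shortfall relative to the true model stays bounded, while the add-one learner's shortfall grows roughly like half the logarithm of the horizon, so the no-regret share drifts toward zero slowly. A run in which that share stayed bounded away from zero would be the kind of evidence this section describes.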

Figures

Figures reproduced from arXiv: 2502.08597 by David Easley, Eva Tardos, Yoav Kolumbus.

**Figure 2.** Wealth dynamics in a two-state two-player market. Figure 2a shows the competition between two …

Original abstract

We analyze the performance of heterogeneous learning agents in asset markets with stochastic payoffs. Our main focus is on comparing Bayesian learners and no-regret learners who compete in markets and identifying the conditions under which each approach is more effective. We formally relate the notions of survival and market dominance studied in economics and the framework of regret minimization, thereby bridging these theories. A central finding is that regret plays a key role in market selection, but low regret alone does not guarantee survival: surprisingly, an agent may achieve even logarithmic regret and yet be driven out of the market when competing against a Bayesian learner with a finite prior that assigns positive probability to the correct model. At the same time, we show that Bayesian learning is highly fragile, while no-regret learning requires less knowledge of the environment and is therefore more robust. Motivated by this contrast, we propose two simple hybrid strategies that incorporate Bayesian updates while improving robustness and adaptability to distribution shifts, taking a step toward a best-of-both-worlds learning approach. More broadly, our work contributes to the understanding of dynamics of heterogeneous learning agents and their impact on markets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Log regret fails to guarantee survival against a Bayesian with the right prior, but the market-clearing setup may make the correct model endogenous.

The main point is that an agent can post logarithmic regret yet still get driven out when facing a Bayesian learner whose finite prior includes the true model. The paper also introduces two hybrid strategies that try to combine Bayesian updating with more robustness to shifts. That link between regret bounds and market survival is the concrete new piece, and it does connect the online-learning and economics literatures in a direct way. The observation that Bayesian methods are fragile while no-regret needs less prior knowledge is stated plainly and matches what we already know about each approach separately. The hybrids are presented as a practical step toward combining the strengths.

The stress-test concern about endogenous returns is worth checking in the full text. If asset prices are determined by aggregate demand, the payoff distribution depends on both strategies, so the Bayesian's fixed prior advantage is no longer obviously exogenous. The abstract does not spell out the wealth-update or market-clearing rules, which leaves that assumption unexamined. Without the derivations or any simulation details it is hard to tell whether the survival claims survive that dependence.

The work is aimed at people who already work on learning agents in markets or multi-agent systems. It is worth sending to referees so the model assumptions and the formal relation between regret and survival can be verified; the bridging idea is substantive enough to justify the time even if revisions are needed.

Referee Report

2 major / 2 minor

Summary. The paper analyzes heterogeneous learning in asset markets with stochastic payoffs, relating survival/market dominance from economics to regret minimization. It claims that logarithmic regret does not guarantee survival against a Bayesian learner whose finite prior places positive mass on the true model; Bayesian learning is fragile to shifts while no-regret is more robust; and two hybrid strategies are proposed that combine Bayesian updates with improved robustness.

Significance. If the central claims are established with explicit wealth-update and market-clearing rules that keep the correct model exogenous, the work would usefully bridge regret minimization and market-selection theories and motivate hybrid learners. The abstract alone supplies no derivations, proofs, or simulation details, so soundness cannot yet be assessed.

major comments (2)

[model definition / wealth-update rules (abstract and §3)] The central survival claim (abstract) requires an exogenous 'correct model' to which the Bayesian prior assigns positive mass and against which regret is measured. If asset returns are determined by market clearing (aggregate demand affects prices), the return distribution depends on both agents' strategies, rendering the correct model endogenous. This fixed-point issue is load-bearing for the comparison between Bayesian and no-regret survival and is not addressed by merely positing a finite prior.
[abstract and main theorems] The statement that 'an agent may achieve even logarithmic regret and yet be driven out' is presented as a central finding, yet no explicit wealth-update equation, market-clearing condition, or regret bound is supplied in the abstract. Without these, it is impossible to verify whether the claimed separation between regret and survival follows from the model assumptions rather than from an implicit exogenous-payoff assumption.

minor comments (2)

[hybrid strategies section] Notation for the two hybrid strategies is introduced only in the abstract; their precise update rules and robustness guarantees should be stated explicitly in the main text.
[conclusion / discussion] The paper would benefit from a short table contrasting the knowledge requirements and fragility properties of pure Bayesian, pure no-regret, and hybrid learners.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address each major comment below and indicate planned revisions to clarify the model and strengthen the presentation.

Referee: The central survival claim (abstract) requires an exogenous 'correct model' to which the Bayesian prior assigns positive mass and against which regret is measured. If asset returns are determined by market clearing (aggregate demand affects prices), the return distribution depends on both agents' strategies, rendering the correct model endogenous. This fixed-point issue is load-bearing for the comparison between Bayesian and no-regret survival and is not addressed by merely positing a finite prior.

Authors: We appreciate the referee highlighting this modeling consideration. In the paper, asset payoffs are drawn from a fixed exogenous stochastic distribution that defines the 'correct model' (to which the Bayesian prior assigns positive mass and against which regret is measured). Market clearing determines equilibrium prices from aggregate demand, but wealth updates depend on realized payoffs from the exogenous distribution; the true distribution itself does not depend on agents' strategies. We will revise Section 3 to include the explicit wealth-update equation and market-clearing condition, and add a clarifying sentence on exogeneity. This construction ensures the fixed-point issue does not arise.
revision: yes

Referee: The statement that 'an agent may achieve even logarithmic regret and yet be driven out' is presented as a central finding, yet no explicit wealth-update equation, market-clearing condition, or regret bound is supplied in the abstract. Without these, it is impossible to verify whether the claimed separation between regret and survival follows from the model assumptions rather than from an implicit exogenous-payoff assumption.

Authors: The abstract summarizes the main findings at a high level; the wealth-update rules, market-clearing conditions, and regret bounds appear explicitly in Sections 3 and 4, where the theorems establishing the separation (under the exogenous-payoff model) are proved. We will revise the abstract to briefly reference the exogenous stochastic payoffs assumption, improving verifiability while respecting abstract conventions.
revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained against external benchmarks

The provided abstract and context present a comparison of Bayesian and no-regret learners via market survival and regret bounds, relating existing economic and algorithmic frameworks without any quoted equations or steps that reduce a claimed prediction to a fitted input, self-definition, or self-citation chain. No load-bearing uniqueness theorem or ansatz is invoked from prior author work in a way that collapses the central contrast (low regret not guaranteeing survival against a finite-prior Bayesian) to an input by construction. The modeling assumptions about exogenous payoffs and correct models are stated as primitives for the comparison rather than derived from the result itself, satisfying the criteria for an independent derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Analysis rests on standard domain assumptions from game theory and online learning; no free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption: Asset markets have stochastic payoffs. Explicitly stated as the setting for the heterogeneous-agent dynamics.
domain assumption: Survival and market dominance are well-defined outcomes of repeated trading. Invoked when relating regret to market selection.

pith-pipeline@v0.9.0 · 5734 in / 1197 out tokens · 34882 ms · 2026-05-23T03:52:05.393805+00:00

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: Theorem 3.1: agent survives iff lim (R_n(T) - R_m(T)) < ∞ for every competitor m

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: wealth ratio log(r_nm_T) expressed via relative entropies I_q(α)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Aggarwal, A

G. Aggarwal, A. Gupta, A. Perlroth, and G. Velegkas. Randomized truthful auctions with learning agents.arXiv preprint arXiv:2411.09517, 2024

work page arXiv 2024

[2] [2]

A. Alchian. Uncertainty, evolution and economic theory.Journal of Political Economy, 58: 211–221, 1950

work page 1950

[3] [3]

K. J. Arrow. The role of securities in the optimal allocation of risk-bearing1.The Review of Economic Studies, 31(2):91–96, 04 1964

work page 1964

[4] [4]

E. R. Arunachaleswaran, N. Collina, S. Kannan, A. Roth, and J. Ziani. Algorithmic collusion without threats.arXiv preprint arXiv:2409.03956, 2024

work page arXiv 2024

[5] [5]

E. R. Arunachaleswaran, N. Collina, and J. Schneider. Learning to play against unknown opponents.arXiv preprint arXiv:2412.18297, 2024

work page arXiv 2024

[6] [6]

E. R. Arunachaleswaran, N. Collina, and J. Schneider. Pareto-optimal algorithms for learning in games. InProceedings of the 25th ACM Conference on Economics and Computation, pages 490–510, 2024

work page 2024

[7] [7]

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. The nonstochastic multiarmed bandit problem.SIAM J. Comput., 32(1):48–77, 2002

work page 2002

[8] [8]

Babaioff, Y

M. Babaioff, Y. Kolumbus, and E. Winter. Optimal collaterals in multi-enterprise investment networks. InProceedings of the ACM Web Conference 2022, pages 79–89, 2022

work page 2022

[9] [9]

S. R. Balseiro and Y. Gur. Learning in repeated auctions with budgets: Regret minimization and equilibrium.Management Science, 65(9), 2019

work page 2019

[10] [10]

Banchio and G

M. Banchio and G. Mantegazza. Adaptive algorithms and collusion via coupling. InEC, page 208, 2023

work page 2023

[11] [11]

Battalio, B

R. Battalio, B. Hatch, and M. Sağlam. The cost of exposing large institutional orders to electronic liquidity providers.Management Science, 70(6):3597–3618, 2024

work page 2024

[12] [12]

Bichler, S

M. Bichler, S. B. Lunowa, M. Oberlechner, F. R. Pieroth, and B. Wohlmuth. On the convergence of learning algorithms in bayesian auction games.arXiv preprint arXiv:2311.15398, 2023

work page arXiv 2023

[13] [13]

Bischi, L

G.-I. Bischi, L. Sbragia, and F. Szidarovszky. Learning the demand function in a repeated cournot oligopoly game.International Journal of Systems Science, 39(4):403–419, 2008

work page 2008

[14] [14]

Blackwell

D. Blackwell. An analog of the minimax theorem for vector payoffs. 1956

work page 1956

[15] [15]

Blum and A

A. Blum and A. Kalai. Universal portfolios with and without transaction costs.Mach. Learn., 35 (3):193–205, 1999

work page 1999

[16] [16]

A. Blum, M. Hajiaghayi, K. Ligett, and A. Roth. Regret minimization and the price of total anarchy. InProceedings of the fortieth annual ACM symposium on Theory of computing, pages 373–382, 2008

work page 2008

[17]

L. Blume and D. Easley. Evolution and market behavior. Journal of Economic Theory, 58(1):9–40, Oct. 1992

[18]

L. Blume and D. Easley. If you’re so smart, why aren’t you rich? Belief selection in complete and incomplete markets. Econometrica, 74(4):929–966, 2006

[19]

S. Brânzei. Exchange markets: proportional response dynamics and beyond. ACM SIGecom Exchanges, 19(2):37–45, 2021

[20]

S. Branzei, R. Mehta, and N. Nisan. Universal growth in production economies. Advances in Neural Information Processing Systems, 31, 2018

[21]

S. Brânzei, N. Devanur, and Y. Rabani. Proportional dynamics in exchange economies. In Proceedings of the 22nd ACM Conference on Economics and Computation, pages 180–201, 2021

[22]

S. Branzei, R. Mehta, and N. Nisan. Tit-for-tat strategies drive growth and inequality in production economies. In Proceedings A, volume 481. The Royal Society, 2025

[23]

M. Braverman, J. Mao, J. Schneider, and S. M. Weinberg. Selling to a no-regret buyer. In ACM Conference on Economics and Computation, EC, pages 523–538, 2018

[24]

G. W. Brown. Iterative solution of games by fictitious play. Activity Analysis of Production and Allocation, 13(1):374–376, 1951

[25]

Z. Y. Brown and A. MacKay. Competition in pricing algorithms. American Economic Journal: Microeconomics, 15(2):109–156, 2023

[26]

L. Cai, S. M. Weinberg, E. Wildenhain, and S. Zhang. Selling to multiple no-regret buyers. Working paper available at https://arxiv.org/pdf/2307.04175.pdf, 2023

[27]

N. Cesa-Bianchi and G. Lugosi. Prediction, learning, and games. Cambridge University Press, 2006

[28]

N. Cesa-Bianchi, T. R. Cesari, R. Colomboni, F. Fusco, and S. Leonardi. A regret analysis of bilateral trade. In Proceedings of the 22nd ACM Conference on Economics and Computation, pages 289–309, 2021

[29]

N. Cesa-Bianchi, T. Cesari, R. Colomboni, F. Fusco, and S. Leonardi. Bilateral trade: A regret minimization perspective. Mathematics of Operations Research, 2023

[30]

Y. K. Cheung and R. Cole. Amortized analysis of asynchronous price dynamics. arXiv:1806.10952, 2018

[31]

Y. K. Cheung, R. Cole, and Y. Tao. Dynamics of distributed updating in Fisher markets. In Proceedings of the 2018 ACM Conference on Economics and Computation, pages 351–368, 2018

[32]

B. S. Clarke and A. R. Barron. Information theoretic asymptotics of Bayes methods. IEEE Transactions on Information Theory, 36(3):453–71, 1990

[33]

R. Cole and L. Fleischer. Fast-converging tatonnement algorithms for one-time and ongoing market problems. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, pages 315–324, 2008

[34]

N. Collina, V. Gupta, and A. Roth. Repeated contracting with multiple non-myopic agents: Policy regret and limited liability. In Proceedings of the 25th ACM Conference on Economics and Computation, pages 640–668, 2024

[35]

C. Daskalakis and V. Syrgkanis. Learning in auctions: Regret is hard, envy is easy. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 219–228. IEEE, 2016

[36]

M. L. de Prado. Advances in Financial Machine Learning. Wiley, 2018

[37]

S. Deng, M. Schiffer, and M. Bichler. Exploring competitive and collusive behaviors in algorithmic pricing with deep reinforcement learning. arXiv preprint arXiv:2503.11270, 2025

[38]

X. Deng, X. Hu, T. Lin, and W. Zheng. Nash convergence of mean-based learning algorithms in first price auctions. In Proceedings of the ACM Web Conference 2022, pages 141–150, 2022

[39]

E. Fama. The behavior of stock market prices. Journal of Business, 38(1):34–105, Jan. 1965

[40]

J. Farrell and E. Maskin. Renegotiation in repeated games. Games and Economic Behavior, 1(4):327–360, 1989

[41]

Y. Feng, B. Lucier, and A. Slivkins. Strategic budget selection in a competitive autobidding world. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 213–224, 2024

[42]

Z. Feng, G. Guruganesh, C. Liaw, A. Mehta, and A. Sethi. Convergence analysis of no-regret bidding algorithms in repeated auctions. arXiv preprint arXiv:2009.06136, 2020

[43]

G. Fikioris and É. Tardos. Liquid welfare guarantees for no-regret learning in sequential budgeted auctions. In Proceedings of the 24th ACM Conference on Economics and Computation, pages 678–698, 2023

[44]

G. Fikioris, R. Kleinberg, Y. Kolumbus, R. Kumar, Y. Mansour, and É. Tardos. Learning in budgeted auctions with spacing objectives. arXiv preprint arXiv:2411.04843, 2024

[45]

D. P. Foster and R. V. Vohra. Calibrated learning and correlated equilibrium. Games and Economic Behavior, 21(1-2):40–55, 1997

[46]

M. Friedman. Essays in Positive Economics. University of Chicago Press, Chicago, 1953

[47]

D. Fudenberg and D. K. Levine. Consistency and cautious fictitious play. Journal of Economic Dynamics and Control, 19(5-7):1065–1089, 1995

[48]

E. Gofer and Y. Mansour. Lower bounds on individual sequence regret. Machine Learning, 103:1–26, 2016

[49]

M. Goldstein, A. Kwan, and R. Philip. High-frequency trading strategies. Management Science, 69(8):4413–4434, 2023

[50]

W. Guo, M. Jordan, and E. Vitercik. No-regret learning in partially-informed auctions. In International Conference on Machine Learning, pages 8039–8055. PMLR, 2022

[51]

G. Guruganesh, Y. Kolumbus, J. Schneider, I. Talgam-Cohen, E.-V. Vlatakis-Gkaragkounis, J. Wang, and S. Weinberg. Contracting with a learning agent. Advances in Neural Information Processing Systems, 37:77366–77408, 2024

[52]

M. Halac, I. Kremer, and E. Winter. Raising capital from heterogeneous investors. American Economic Review, 110(3):889–921, 2020

[53]

J. Hannan. Approximation to Bayes risk in repeated play. In Contributions to the Theory of Games (AM-39), Volume III, pages 97–139. Princeton University Press, 1957

[54]

L. Harris. Trading and Exchanges: Market Microstructure for Practitioners. Oxford University Press, 2003

[55]

S. Hart and A. Mas-Colell. A simple adaptive procedure leading to correlated equilibrium. Econometrica, 68(5):1127–1150, 2000

[56]

S. Hart and A. Mas-Colell. Simple adaptive strategies: from regret-matching to uncoupled dynamics, volume 4. World Scientific, 2013

[57]

J. D. Hartline, S. Long, and C. Zhang. Regulation of algorithmic collusion. In Proceedings of the Symposium on Computer Science and Law, pages 98–108, 2024

[58]

J. D. Hartline, C. Wang, and C. Zhang. Regulation of algorithmic collusion, refined: Testing pessimistic calibrated regret. arXiv preprint arXiv:2501.09740, 2025

[59]

E. Hazan and S. Kale. An online portfolio selection algorithm with regret logarithmic in price variation. Mathematical Finance, 25(2):288–310, 2015

[60]

E. Hazan and C. Seshadhri. Efficient learning algorithms for changing environments. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 393–400, 2009

[61]

E. Hazan et al. Introduction to online convex optimization. Foundations and Trends® in Optimization, 2(3-4):157–325, 2016

[62]

T. Hens and K. Schenk-Hoppe. Handbook of Financial Markets: Dynamics and Evolution. Jan. 2009

[63]

M. O. Jackson and A. Pernoud. Systemic risk in financial networks: A survey. Annual Review of Economics, 13(1):171–202, 2021

[64]

A. Kalai and S. Vempala. Efficient algorithms for online decision problems. Journal of Computer and System Sciences, 71(3):291–307, 2005

[65]

J. L. Kelly. A new interpretation of information rate. Bell System Technical Journal, 35:917–926, 1956

[66]

Y. Kolumbus and N. Nisan. How and why to manipulate your own agent: On the incentives of users of learning agents. Advances in Neural Information Processing Systems, 35:28080–28094, 2022

[67]

Y. Kolumbus and N. Nisan. Auctions between regret-minimizing agents. In ACM Web Conference, WebConf, pages 100–111, 2022

[68]

Y. Kolumbus, M. Levy, and N. Nisan. Asynchronous proportional response dynamics: convergence in markets with adversarial scheduling. Advances in Neural Information Processing Systems, 36:25409–25434, 2023

[69]

Y. Kolumbus, J. Halpern, and É. Tardos. Paying to do better: Games with payments between learning agents. arXiv preprint arXiv:2405.20880, 2024

[70]

S. S. Kozat and A. C. Singer. Universal constant rebalanced portfolios with switching. In 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07, volume 3, pages III–1129. IEEE, 2007

[71]

X. Li, B. Shou, and Z. Qin. An expected regret minimization portfolio selection model. European Journal of Operational Research, 218(2):484–492, 2012

[72]

B. Lucier, S. Pattathil, A. Slivkins, and M. Zhang. Autobidders with budget and ROI constraints: Efficiency, regret, and pacing dynamics. In The Thirty Seventh Annual Conference on Learning Theory, pages 3642–3643. PMLR, 2024

[73]

Y. Mansour, M. Mohri, J. Schneider, and B. Sivan. Strategizing against learners in Bayesian games. In Conference on Learning Theory, pages 5221–5252. PMLR, 2022

[74]

J. Mardia, J. Jiao, E. Tánczos, R. D. Nowak, and T. Weissman. Concentration inequalities for the empirical distribution of discrete distributions: beyond the method of types. Information and Inference: A Journal of IMA, 9(4):813–850, 2020

[75]

R. Marto and H. Le. The rise of digital advertising and its economic implications. St. Louis Fed On the Economy, Oct 2024. URL https://www.stlouisfed.org/on-the-economy

[76]

A. Mas-Colell, M. Whinston, and J. Green. Microeconomic Theory. Oxford University Press, 1995

[77]

M. O’Hara. High-frequency trading and its impact on markets. Financial Analysts Journal, 70(3):18–27, 2014

[78]

M. S. Pinsker. Information and information stability of random variables and processes. Holden-Day, 1964

[79]

Y. Polyanskiy and Y. Wu. Information theory: From coding to learning. Cambridge University Press, 2024

[80]

J. Robinson. An iterative method of solving a game. Annals of Mathematics, pages 296–301, 1951