Misspecified Explore-then-Exploit Leads to Supra-Competitive Prices

Farrell Wu; Jackie Baek; Vivek F. Farias

arXiv: 2605.16064 · v1 · pith:MDN3OWE6new · submitted 2026-05-15 · 💻 cs.GT · cs.AI · econ.TH

Pith reviewed 2026-05-19 18:01 UTC · model grok-4.3

classification 💻 cs.GT · cs.AI · econ.TH

keywords: algorithmic pricing, explore-then-exploit, misspecified demand, supra-competitive prices, Nash equilibrium, demand estimation, fluid limit, pricing dynamics

The pith

Firms using explore-then-exploit pricing with misspecified demand models converge to prices above the Nash equilibrium.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines a simple algorithmic pricing approach in which firms first randomize prices during an exploration phase and then estimate demand from their own data to set myopic prices thereafter. The estimation uses a monopoly-style model that leaves out competitors' prices entirely. When the exploration ranges are similar and lie on the same side of the Nash price, the resulting dynamics drive prices upward, sometimes all the way to monopoly levels under symmetric exploration. A sympathetic reader cares because the result shows how routine algorithmic tools can produce persistently high prices without any coordinated intent.
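The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear duopoly demand `d_i = A - B*p_i + C*p_j`, the parameter values (chosen so the Nash price is 2/3, the value quoted in the Figure 3 caption), zero marginal cost, noise-free demand, the uniform exploration range, the price cap `P_MAX`, and all names (`OwnPriceOLS`, `simulate`) are hypothetical.

```python
import random

# Hypothetical linear duopoly (an assumption, not the paper's calibration):
# true demand for firm i is d_i = A - B*p_i + C*p_j, with zero marginal cost.
A, B, C = 1.0, 1.0, 0.5
NASH = A / (2 * B - C)               # symmetric Nash price, 2/3 for these values
JOINT_MONOPOLY = A / (2 * (B - C))   # joint-profit-maximizing price, 1.0 here
P_MAX = 2.0                          # hypothetical cap used to clip prices


class OwnPriceOLS:
    """Running OLS of own demand on own price only (competitor price omitted)."""

    def __init__(self):
        self.n = 0
        self.sp = self.sd = self.spp = self.spd = 0.0

    def add(self, p, d):
        self.n += 1
        self.sp += p
        self.sd += d
        self.spp += p * p
        self.spd += p * d

    def myopic_price(self):
        mp, md = self.sp / self.n, self.sd / self.n
        var = self.spp / self.n - mp * mp
        cov = self.spd / self.n - mp * md
        slope = cov / var
        intercept = md - slope * mp
        if slope >= 0:
            # Fitted demand is not downward sloping: fall back to the cap.
            return P_MAX
        # Revenue p * (intercept + slope * p) peaks at -intercept / (2 * slope).
        return min(max(-intercept / (2 * slope), 0.0), P_MAX)


def demand(p_own, p_other):
    return A - B * p_own + C * p_other


def simulate(mu=(0.9, 0.9), half_width=0.1, n_explore=200, n_exploit=2000, seed=0):
    rng = random.Random(seed)
    fits = (OwnPriceOLS(), OwnPriceOLS())
    # Exploration: independent uniform prices around each firm's mean.
    for _ in range(n_explore):
        p = [rng.uniform(m - half_width, m + half_width) for m in mu]
        fits[0].add(p[0], demand(p[0], p[1]))
        fits[1].add(p[1], demand(p[1], p[0]))
    # Exploitation: refit on all own history, then price myopically.
    path = []
    for _ in range(n_exploit):
        p = [f.myopic_price() for f in fits]
        fits[0].add(p[0], demand(p[0], p[1]))
        fits[1].add(p[1], demand(p[1], p[0]))
        path.append(tuple(p))
    return path


path = simulate()
print("Nash price:", round(NASH, 3))
print("first exploit prices:", path[0])
print("terminal prices:", path[-1])
```

Both exploration means here sit above the Nash price, which is the "same side" condition in the paper's claim. Changing `mu` to straddle the Nash price, or making the two ranges very different, is one way to probe where the effect weakens in this toy setting.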

Core claim

The authors establish that an explore-then-exploit pricing pipeline relying on a misspecified monopoly-style demand estimation converges to supra-competitive prices above the Nash equilibrium when firms explore within similar price ranges on the same side of the Nash price. Through a fluid-limit ordinary differential equation analysis, they show that prices can reach monopoly levels under symmetric exploration. Simulations calibrated to a real multifamily rental market confirm that supra-competitive outcomes arise robustly beyond the theoretical assumptions, including under finite horizons, heterogeneous products, and nonlinear logit demand.

What carries the argument

Fluid-limit ordinary differential equation analysis of the explore-then-exploit pricing dynamics under misspecified monopoly demand estimation.

If this is right

Supra-competitive prices arise when firms explore within similar price ranges on the same side of the Nash price.
Prices can reach monopoly levels under symmetric exploration.
The outcome persists in simulations with finite horizons, heterogeneous products, and nonlinear logit demand.
Basic algorithmic pricing systems can systematically generate collusive-like prices without explicit coordination.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Regulators could examine whether common pricing software structures create unintended high-price equilibria across markets.
Firms might reduce the effect by expanding their demand models to account for observed competitor prices.
Analogous misspecifications in other repeated decision algorithms could produce similarly elevated equilibria in non-price settings.
Testing the same pipeline on markets with different demand curvatures would clarify how sensitive the supra-competitive outcome is to functional form.

Load-bearing premise

The demand estimation step uses a misspecified monopoly-style model that omits competitors' prices, and exploration occurs within similar ranges on the same side of the Nash price.
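A short worked derivation may help show why omitting the competitor's price matters. It assumes a hypothetical linear duopoly with zero marginal cost, mean-zero noise independent of prices, and independent exploration prices with means $\mu_i$ and $\mu_j$; the symbols $a$, $b$, $c$, $\hat{a}_i$, $\hat{b}_i$ are illustrative and need not match the paper's notation.

```latex
\begin{align*}
d_i &= a - b\,p_i + c\,p_j + \varepsilon_i, \qquad b > c > 0, \\
\hat{b}_i &= -\frac{\operatorname{Cov}(p_i, d_i)}{\operatorname{Var}(p_i)}
           = b - c\,\frac{\operatorname{Cov}(p_i, p_j)}{\operatorname{Var}(p_i)}
           = b
           \quad \text{(independent exploration)}, \\
\hat{a}_i &= \mathbb{E}[d_i] + \hat{b}_i\,\mu_i
           = (a - b\,\mu_i + c\,\mu_j) + b\,\mu_i
           = a + c\,\mu_j, \\
\hat{p}_i &= \frac{\hat{a}_i}{2\,\hat{b}_i}
           = \frac{a + c\,\mu_j}{2b}
           = \mathrm{BR}_i(\mu_j), \\
p^{\mathrm{NE}} &= \frac{a}{2b - c},
\qquad
\hat{p}_i - p^{\mathrm{NE}} = \frac{c}{2b}\,\bigl(\mu_j - p^{\mathrm{NE}}\bigr).
\end{align*}
```

In this sketch the omitted competitor price is absorbed into the intercept, so the first post-exploration price is a best response to the competitor's mean exploration price and lies on the same side of the Nash price as $\mu_j$. It describes only the first exploit step; how later, correlated price data move the estimates is what the paper's ODE analysis addresses.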

What would settle it

Observing convergence to the Nash equilibrium instead of supra-competitive prices when firms either include competitors' prices in the demand model or explore ranges on opposite sides of the Nash price.
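As a point of comparison, the well-specified benchmark can be sketched as follows. This is an illustration under assumptions, not the paper's experiment: it supposes a hypothetical linear duopoly `d_i = A - B*p_i + C*p_j` with zero marginal cost, that a firm including the competitor's price in its regression recovers `A`, `B`, `C` exactly, and that each firm then best-responds to the competitor's last observed price. The function names are hypothetical.

```python
# Hypothetical linear duopoly: d_i = A - B*p_i + C*p_j, zero marginal cost.
A, B, C = 1.0, 1.0, 0.5
NASH = A / (2 * B - C)  # symmetric Nash price, 2/3 for these values


def best_response(p_other):
    """Revenue-maximizing own price when the competitor's price is in the model."""
    return (A + C * p_other) / (2 * B)


def iterate_best_responses(p1, p2, steps=50):
    """Both firms best-respond simultaneously to the other's previous price."""
    for _ in range(steps):
        p1, p2 = best_response(p2), best_response(p1)
    return p1, p2


# Start both firms well above Nash, mimicking high exploration prices.
p1, p2 = iterate_best_responses(0.9, 0.9)
print(p1, p2, NASH)
```

Because the best-response slope `C / (2 * B)` is below one in this toy model, the gap to the Nash price shrinks geometrically from any starting point, which is the kind of outcome the falsification test above would look for.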

Figures

Figures reproduced from arXiv: 2605.16064 by Farrell Wu, Jackie Baek, Vivek F. Farias.

**Figure 1.** Left: the shaded regions depict the best-response cones in the $(\mu_1, \mu_2)$ plane, where $\mu_i$ is firm $i$’s average exploration price. We show that the terminal prices are supra-competitive whenever $(\mu_1, \mu_2)$ lies in the shaded region. The angle $\theta$ depends on the demand parameters but is always at least 45°, so the cones cover more than one quarter of the feasible exploration-mean space. Right: Final price under sy…

**Figure 2.** Best-response cones in the duopoly case, shown in the …

**Figure 3.** ODE-implied terminal price $P^{\mathrm{ODE}}_1(\alpha; \mu, \sigma^2_{\mathrm{exp}} I_{2\times 2})$ in the duopoly case. Each panel fixes $(\alpha, \Sigma_{\mathrm{exp}})$ and varies the exploration means $(\mu_1, \mu_2)$. White marks the Nash price $p^{\mathrm{NE}} = 2/3$, while red and blue indicate terminal prices above and below Nash. Thin lines mark the best-response boundaries defining the two cones. Horizons sharpen cone-like regions. As the horizon $\alpha$ increases (left to right), the heatma…

**Figure 4.** ODE and stochastic mean terminal-price heatmaps. The left panel is the deterministic …

**Figure 5.** ODE map and terminal-price histograms from stochastic simulations at …

**Figure 6.** Terminal rent changes relative to Nash as the exploration mean multiplier …

**Figure 7.** Finite-time dynamics of terminal rent changes relative to Nash. We fix …

**Figure 8.** ODE-implied mean terminal price under interval sampling. For each boundary pair …

**Figure 9.** Mean terminal price under center–dispersion sampling. For each anchor price …

Original abstract

We study whether simple algorithmic pricing systems can systematically produce collusive-like prices in multi-firm markets. We consider firms using an explore-then-exploit pipeline: they randomize prices during an initial exploration phase, then estimate demand from their own historical data and set prices myopically thereafter. The estimation step relies on a misspecified, monopoly-style model that omits competitors' prices. We characterize when this pipeline converges to supra-competitive prices above the Nash equilibrium, via a fluid-limit ordinary differential equation analysis. We show that supra-competitive prices arise when firms explore within similar price ranges on the same side of the Nash price. Moreover, prices can be substantially above the Nash price; we show that prices can reach monopoly levels under symmetric exploration. Simulations calibrated to a real multifamily rental market confirm that supra-competitive outcomes arise robustly beyond our theoretical assumptions, including under finite horizons, heterogeneous products, and nonlinear logit demand.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper studies the emergence of supra-competitive pricing in oligopoly markets when firms employ an explore-then-exploit strategy with a misspecified demand model that ignores competitors' prices. Using a fluid-limit ordinary differential equation (ODE) analysis, the authors show that convergence to prices above the Nash equilibrium occurs when exploration ranges are similar and lie on the same side of the Nash price. Symmetric exploration can lead to the monopoly price as the fixed point. The theoretical results are supported by simulations that extend to finite time horizons, heterogeneous products, and logit demand, calibrated to data from a real multifamily rental market.

Significance. This result is significant as it identifies a specific mechanism—misspecification in demand estimation combined with correlated exploration—through which algorithmic pricing can lead to outcomes resembling collusion without any intent to collude. The analytical approach using ODEs provides precise conditions for when this happens, and the simulations demonstrate robustness. Strengths include the parameter-free nature of the core result under the stated exploration assumptions and the connection to real-world data. This contributes to the literature on algorithmic collusion and has potential policy implications for regulating pricing algorithms.

minor comments (3)

[Abstract] The abstract mentions 'supra-competitive outcomes arise robustly beyond our theoretical assumptions'; specifying one or two key extensions in the abstract would enhance impact.
[§3] The transition from the discrete-time process to the fluid-limit ODE could include a brief outline of the convergence theorem used, even if standard.
[Figure 2] The plot of price trajectories would be clearer with annotations indicating the Nash and monopoly prices for reference.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript and for recommending minor revision. The referee's description accurately reflects the paper's focus on misspecified explore-then-exploit pricing and the conditions leading to supra-competitive outcomes. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity

The derivation relies on an explicit fluid-limit ODE constructed from the explore-then-exploit dynamics and the misspecified monopoly demand model. Fixed points of the ODE are solved directly from the myopic best-response mapping under the stated exploration ranges and misspecification; these are not obtained by fitting to the target supra-competitive outcome or by renaming an input. No self-citation is invoked as a load-bearing uniqueness theorem, and the analysis is self-contained against the model's own assumptions without reducing any prediction to a fitted quantity by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis rests on the fluid-limit approximation and the structural assumption of misspecified monopoly demand estimation.

axioms (1)

Domain assumption: the fluid-limit ordinary differential equation approximation governs the long-run price dynamics. Invoked to characterize convergence of the pricing process.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We characterize when this pipeline converges to supra-competitive prices above the Nash equilibrium, via a fluid-limit ordinary differential equation analysis."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
