Misspecified Explore-then-Exploit Leads to Supra-Competitive Prices
Pith reviewed 2026-05-19 18:01 UTC · model grok-4.3
The pith
Firms using explore-then-exploit pricing with misspecified demand models converge to prices above the Nash equilibrium.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that an explore-then-exploit pricing pipeline relying on a misspecified monopoly-style demand estimation converges to supra-competitive prices above the Nash equilibrium when firms explore within similar price ranges on the same side of the Nash price. Through a fluid-limit ordinary differential equation analysis, they show that prices can reach monopoly levels under symmetric exploration. Simulations calibrated to a real multifamily rental market confirm that supra-competitive outcomes arise robustly beyond the theoretical assumptions, including under finite horizons, heterogeneous products, and nonlinear logit demand.
What carries the argument
Fluid-limit ordinary differential equation analysis of the explore-then-exploit pricing dynamics under misspecified monopoly demand estimation.
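As a rough orientation, fluid-limit arguments of this kind usually follow the generic stochastic-approximation template below. This is a sketch under assumptions, not the paper's own equations: the symbols \theta_t (running estimation statistics), \xi_{t+1} (the new observation), g (the update map), h (the mean field) and s (rescaled time) are hypothetical notation introduced here for illustration.

```latex
% Generic stochastic-approximation template (illustrative; not taken from the paper).
% \theta_t  : vector of running estimation statistics (hypothetical notation)
% \xi_{t+1} : observation arriving in period t+1
% g         : map from current statistics and new observation to the new sample
% h         : mean field obtained by averaging out the noise
\begin{align}
  \theta_{t+1} &= \theta_t + \frac{1}{t+1}\bigl(g(\theta_t, \xi_{t+1}) - \theta_t\bigr), \\
  \frac{d\theta(s)}{ds} &= h\bigl(\theta(s)\bigr),
  \qquad h(\theta) := \mathbb{E}\bigl[g(\theta, \xi)\bigr] - \theta .
\end{align}
```

In this template the rest points, where h(\theta^*) = 0, are the candidate long-run limits. Because the myopic price is a function of the estimated demand parameters, a limit for the statistics implies a limit for prices.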
If this is right
- Supra-competitive prices arise when firms explore within similar price ranges on the same side of the Nash price.
- Prices can reach monopoly levels under symmetric exploration.
- The outcome persists in simulations with finite horizons, heterogeneous products, and nonlinear logit demand.
- Basic algorithmic pricing systems can systematically generate collusive-like prices without explicit coordination.
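The two benchmarks these claims compare against can be made concrete in a textbook symmetric linear duopoly. The sketch below assumes demand d_i = a - b*p_i + c*p_j with zero marginal cost and b > c > 0, and takes the "monopoly" price to be the joint-profit-maximising common price. The demand form, the parameter values and the function names are illustrative assumptions, not the paper's calibration.

```python
def nash_price(a: float, b: float, c: float) -> float:
    """Symmetric Nash price for demand d_i = a - b*p_i + c*p_j with zero cost."""
    # Best response is p_i = (a + c*p_j) / (2b); imposing p_i = p_j gives a / (2b - c).
    return a / (2 * b - c)


def monopoly_price(a: float, b: float, c: float) -> float:
    """Common price that maximises joint profit p * (a - (b - c) * p)."""
    return a / (2 * (b - c))


# Hypothetical parameters with b > c > 0 (assumed for illustration only).
a, b, c = 10.0, 2.0, 1.0
p_nash = nash_price(a, b, c)
p_mono = monopoly_price(a, b, c)
print(f"Nash price: {p_nash:.3f}, monopoly price: {p_mono:.3f}")
```

Any price strictly between these two values is supra-competitive in the sense used here.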
Where Pith is reading between the lines
- Regulators could examine whether common pricing software structures create unintended high-price equilibria across markets.
- Firms might reduce the effect by expanding their demand models to account for observed competitor prices.
- Analogous misspecifications in other repeated decision algorithms could produce similarly elevated equilibria in non-price settings.
- Testing the same pipeline on markets with different demand curvatures would clarify how sensitive the supra-competitive outcome is to functional form.
Load-bearing premise
The demand estimation step uses a misspecified monopoly-style model that omits competitors' prices, and exploration occurs within similar ranges on the same side of the Nash price.
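Why omitting competitors' prices matters can be illustrated with a small omitted-variable experiment. The sketch assumes a noise-free linear demand d_i = a - b*p_i + c*p_j and compares the own-price slope recovered by a one-regressor least-squares fit when the rival's price is independent of the firm's own price against the case where both prices move together. All names and parameter values are hypothetical, and the co-moving case is a stylised extreme rather than the paper's dynamics.

```python
import random


def ols_slope_intercept(xs, ys):
    """Least-squares fit of y on a single regressor x; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def own_price_slope(comoving: bool, n: int = 5000, seed: int = 0) -> float:
    """Slope a firm estimates when it regresses its demand on its own price only."""
    rng = random.Random(seed)
    a, b, c = 10.0, 2.0, 1.0  # assumed demand parameters
    own, demand = [], []
    for _ in range(n):
        p_i = rng.uniform(3.5, 4.5)
        # Rival price either copies the own price or is drawn independently.
        p_j = p_i if comoving else rng.uniform(3.5, 4.5)
        own.append(p_i)
        demand.append(a - b * p_i + c * p_j)  # noise-free for clarity
    slope, _ = ols_slope_intercept(own, demand)
    return slope


slope_independent = own_price_slope(comoving=False)  # close to -b
slope_comoving = own_price_slope(comoving=True)      # equals -(b - c)
print(slope_independent, slope_comoving)
```

When prices co-move, the misspecified fit attributes the rival's effect to the firm's own price, so demand looks less price-sensitive than it is and the implied myopic price is higher.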
What would settle it
Observing convergence to the Nash equilibrium instead of supra-competitive prices when firms either include competitors' prices in the demand model or explore ranges on opposite sides of the Nash price.
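The first half of this test can be sketched as a counterfactual in which each firm fits a demand model that includes the competitor's price and then best-responds. The code below is a minimal illustration under assumptions: linear demand d_i = a - b*p_i + c*p_j, zero cost, a single estimation after exploration, and iterated best responses. The helper names (`solve3`, `fit_full_model`, `best_response`) and all parameter values are hypothetical, and the paper's own counterfactual design may differ.

```python
import random


def solve3(A, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x


def fit_full_model(own, rival, demand):
    """Least squares of demand on [1, own price, rival price] via normal equations."""
    rows = [(1.0, p, q) for p, q in zip(own, rival)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * d for r, d in zip(rows, demand)) for i in range(3)]
    return solve3(xtx, xty)  # [intercept, own-price coef, rival-price coef]


def best_response(est, rival_price):
    """Myopic profit-maximising price under the fitted linear model (zero cost)."""
    intercept, own_coef, rival_coef = est
    return (intercept + rival_coef * rival_price) / (-2.0 * own_coef)


rng = random.Random(1)
a, b, c = 10.0, 2.0, 1.0  # assumed demand parameters
p1 = [rng.uniform(3.5, 4.5) for _ in range(400)]  # firm 1 exploration prices
p2 = [rng.uniform(3.5, 4.5) for _ in range(400)]  # firm 2 exploration prices
d1 = [a - b * x + c * y + rng.gauss(0.0, 0.1) for x, y in zip(p1, p2)]
d2 = [a - b * y + c * x + rng.gauss(0.0, 0.1) for x, y in zip(p1, p2)]
est1 = fit_full_model(p1, p2, d1)
est2 = fit_full_model(p2, p1, d2)

q1, q2 = p1[-1], p2[-1]
for _ in range(50):  # iterate simultaneous best responses
    q1, q2 = best_response(est1, q2), best_response(est2, q1)

nash = a / (2 * b - c)
print(f"prices after best-response iteration: {q1:.3f}, {q2:.3f}; Nash: {nash:.3f}")
```

Because the fitted model here matches the true demand, the best-response iteration is a contraction toward the Nash price. That is the outcome this test treats as evidence that the misspecification, rather than the explore-then-exploit structure alone, drives the result.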
Original abstract
We study whether simple algorithmic pricing systems can systematically produce collusive-like prices in multi-firm markets. We consider firms using an explore-then-exploit pipeline: they randomize prices during an initial exploration phase, then estimate demand from their own historical data and set prices myopically thereafter. The estimation step relies on a misspecified, monopoly-style model that omits competitors' prices. We characterize when this pipeline converges to supra-competitive prices above the Nash equilibrium, via a fluid-limit ordinary differential equation analysis. We show that supra-competitive prices arise when firms explore within similar price ranges on the same side of the Nash price. Moreover, prices can be substantially above the Nash price; we show that prices can reach monopoly levels under symmetric exploration. Simulations calibrated to a real multifamily rental market confirm that supra-competitive outcomes arise robustly beyond our theoretical assumptions, including under finite horizons, heterogeneous products, and nonlinear logit demand.
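The pipeline described in the abstract can be sketched as a small two-firm simulation. This is a minimal illustration under assumptions, not the authors' code: demand is taken to be linear, d_i = a - b*p_i + c*p_j plus Gaussian noise, marginal cost is zero, and each firm refits a one-regressor model (own price only) on its full history every period. The class `RunningOLS`, the function `simulate`, the price cap and every parameter value are hypothetical choices made for this sketch.

```python
import random


class RunningOLS:
    """Running sums for a one-regressor fit of demand on own price."""

    def __init__(self):
        self.n = self.sx = self.sy = self.sxx = self.sxy = 0.0

    def add(self, x, y):
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def fit(self):
        mx, my = self.sx / self.n, self.sy / self.n
        var_x = self.sxx / self.n - mx * mx
        cov_xy = self.sxy / self.n - mx * my
        slope = cov_xy / var_x
        return my - slope * mx, slope  # (intercept, slope)


def simulate(a=10.0, b=2.0, c=1.0, explore_range=(3.5, 4.5),
             n_explore=400, n_exploit=2000, noise_sd=0.1,
             price_cap=10.0, seed=0):
    """Return the list of (p_1, p_2) price pairs over both phases."""
    rng = random.Random(seed)
    stats = [RunningOLS(), RunningOLS()]
    path = []
    for t in range(n_explore + n_exploit):
        prices = []
        for i in range(2):
            if t < n_explore:
                # Exploration: randomise prices within the chosen range.
                p = rng.uniform(*explore_range)
            else:
                # Exploitation: misspecified model d = alpha - beta * p,
                # so the myopic price is alpha / (2 * beta).
                alpha, slope = stats[i].fit()
                beta = max(-slope, 1e-6)  # guard against a non-negative slope
                p = min(max(alpha / (2.0 * beta), 0.0), price_cap)
            prices.append(p)
        for i in range(2):
            # True demand depends on the rival's price, which the firm ignores.
            d = a - b * prices[i] + c * prices[1 - i] + rng.gauss(0.0, noise_sd)
            stats[i].add(prices[i], d)
        path.append(tuple(prices))
    return path


path = simulate()
final_prices = path[-1]
nash = 10.0 / (2 * 2.0 - 1.0)            # a / (2b - c)
monopoly = 10.0 / (2 * (2.0 - 1.0))      # a / (2(b - c))
print(f"final prices: {final_prices}, Nash: {nash:.3f}, monopoly: {monopoly:.3f}")
```

In this sketch both firms explore in the same range above the Nash price, which is the case the abstract singles out. Changing `explore_range` per firm, or placing the ranges on opposite sides of the Nash price, would be the natural way to probe the stated conditions.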
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies the emergence of supra-competitive pricing in oligopoly markets when firms employ an explore-then-exploit strategy with a misspecified demand model that ignores competitors' prices. Using a fluid-limit ordinary differential equation (ODE) analysis, the authors show that convergence to prices above the Nash equilibrium occurs when exploration ranges are similar and lie on the same side of the Nash price. Symmetric exploration can lead to the monopoly price as the fixed point. The theoretical results are supported by simulations that extend to finite time horizons, heterogeneous products, and logit demand, calibrated to data from a real multifamily rental market.
Significance. This result is significant because it identifies a specific mechanism, misspecified demand estimation combined with correlated exploration, through which algorithmic pricing can produce outcomes resembling collusion without any intent to collude. The ODE-based analysis gives precise conditions for when this happens, and the simulations demonstrate robustness. Strengths include the parameter-free nature of the core result under the stated exploration assumptions and the connection to real-world data. The paper contributes to the literature on algorithmic collusion and has potential policy implications for regulating pricing algorithms.
minor comments (3)
- [Abstract] The abstract mentions 'supra-competitive outcomes arise robustly beyond our theoretical assumptions'; specifying one or two key extensions in the abstract would enhance impact.
- [§3] The transition from the discrete-time process to the fluid-limit ODE could include a brief outline of the convergence theorem used, even if standard.
- [Figure 2] The plot of price trajectories would be clearer with annotations indicating the Nash and monopoly prices for reference.
Simulated Author's Rebuttal
We thank the referee for their positive summary of the manuscript and for recommending minor revision. The referee's description accurately reflects the paper's focus on misspecified explore-then-exploit pricing and the conditions leading to supra-competitive outcomes. No major comments were raised in the report.
Circularity Check
No significant circularity
The derivation relies on an explicit fluid-limit ODE constructed from the explore-then-exploit dynamics and the misspecified monopoly demand model. Fixed points of the ODE are solved directly from the myopic best-response mapping under the stated exploration ranges and misspecification; these are not obtained by fitting to the target supra-competitive outcome or by renaming an input. No self-citation is invoked as a load-bearing uniqueness theorem, and the analysis is self-contained against the model's own assumptions without reducing any prediction to a fitted quantity by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The fluid-limit ordinary differential equation approximation governs the long-run price dynamics.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We characterize when this pipeline converges to supra-competitive prices above the Nash equilibrium, via a fluid-limit ordinary differential equation analysis."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.