Optimal Trading in Automated Market Makers with Deep Learning
Pith reviewed 2026-05-24 09:42 UTC · model grok-4.3
The pith
Traders can hide large orders in constant function market makers by controlling execution speed to eliminate price slippage.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By capturing the conditional dependence between CFMM and CEX variables exactly through conditional elicitability and solving the associated dynamic programming equation with the deep Galerkin method, an optimal trading policy exists that controls the speed of execution to hide orders, thereby removing price slippage while outperforming naive execution rules.
What carries the argument
The dynamic programming equation for the optimal execution problem, solved numerically by the deep Galerkin method after dependence is modeled via conditional elicitability.
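For orientation, the deep Galerkin method trains a network to minimise the squared residual of the PDE at randomly sampled points. The form below is a generic sketch for a terminal-value problem, not the paper's specific equation: the operator \mathcal{N}, the terminal payoff g, and the sampling measures \nu_1, \nu_2 are placeholder notation assumed here.

```latex
% Generic terminal-value problem (assumed notation):
%   \partial_t u(t,x) + \mathcal{N}[u](t,x) = 0  on [0,T) \times \Omega,
%   u(T,x) = g(x)                                on \Omega.
% DGM replaces u by a neural network f_\theta and minimises
J(\theta)
  = \mathbb{E}_{(t,x)\sim\nu_1}\!\left[\big(\partial_t f_\theta(t,x)
      + \mathcal{N}[f_\theta](t,x)\big)^2\right]
  + \mathbb{E}_{x\sim\nu_2}\!\left[\big(f_\theta(T,x) - g(x)\big)^2\right].
```

The expectations are estimated by Monte Carlo over freshly sampled points at each gradient step, so no fixed mesh is needed; a spatial boundary term would be added in the same way if the domain has one.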
If this is right
- The optimal policy varies the trading rate over time to conceal large positions.
- The resulting execution exhibits no price slippage under the modeled dynamics.
- The strategy outperforms constant-rate and other simple execution rules in the numerical tests.
- The solution requires no approximation of the joint market process.
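To make the slippage mechanism concrete, the following is a minimal sketch for a fee-less constant-product pool (x * y = k), which is one common CFMM and not necessarily the pool model used in the paper. The `refill` flag is a deliberately stylised, hypothetical stand-in for arbitrageurs restoring the pool between child orders; all function names are illustrative.

```python
def swap_out(x_reserve, y_reserve, dx):
    """Y received for selling dx of X into a fee-less constant-product pool."""
    return y_reserve * dx / (x_reserve + dx)


def slippage(x_reserve, y_reserve, dx):
    """Relative shortfall of the executed price versus the marginal price."""
    marginal_price = y_reserve / x_reserve
    executed_price = swap_out(x_reserve, y_reserve, dx) / dx
    return 1.0 - executed_price / marginal_price


def sliced_proceeds(x_reserve, y_reserve, total, n_slices, refill=False):
    """Sell `total` of X in equal slices and return the Y collected.

    refill=True resets the reserves after every slice (hypothetical,
    idealised arbitrage); refill=False leaves the pool untouched by others.
    """
    x, y = x_reserve, y_reserve
    proceeds = 0.0
    dx = total / n_slices
    for _ in range(n_slices):
        dy = swap_out(x, y, dx)
        proceeds += dy
        if refill:
            x, y = x_reserve, y_reserve
        else:
            x, y = x + dx, y - dy
    return proceeds


one_shot = swap_out(1000.0, 1000.0, 100.0)
sliced_no_refill = sliced_proceeds(1000.0, 1000.0, 100.0, 10)
sliced_with_refill = sliced_proceeds(1000.0, 1000.0, 100.0, 10, refill=True)
```

In this toy setting, slicing alone does not help, because the pool is path independent; proceeds improve only when the pool recovers between child orders. That is why the speed of execution relative to the other market's dynamics matters.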
Where Pith is reading between the lines
- The same elicitability-based dependence modeling could be reused for other pairs of decentralized and centralized venues.
- Adding inventory penalties or transaction fees to the objective would produce a more realistic policy.
- The deep Galerkin solver could be replaced by other neural PDE methods to test sensitivity of the no-slippage result.
- Live deployment would require frequent re-estimation of the conditional elicitability parameters as market regimes shift.
Load-bearing premise
Market dynamics between the CFMM and CEX can be captured exactly by conditional elicitability without any approximation.
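As a reminder of what elicitability provides: a statistic is elicitable if it minimises an expected score, so it can be estimated by empirical score minimisation. The conditional mean, for example, minimises the squared-error score. The sketch below fits a linear conditional mean on synthetic data; the linear form, the variable names, and the data-generating process are assumptions for illustration only and are not taken from the paper.

```python
import random


def fit_conditional_mean(xs, ys):
    """Fit E[Y | X = x] = a + b * x by minimising the squared-error score."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def squared_score(a, b, xs, ys):
    """Average squared-error score of the candidate conditional mean."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic placeholder data: true conditional mean is 0.5 * x.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
ys = [0.5 * x + rng.gauss(0.0, 0.1) for x in xs]

a_hat, b_hat = fit_conditional_mean(xs, ys)
```

The estimate is only as good as the sample and the chosen function class, which is the gap between "captured exactly" and "estimated consistently" that the premise glosses over.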
What would settle it
Apply the computed policy to live or historical order-book data from both venues and check whether measurable slippage appears or whether performance falls to the level of naive strategies.
Original abstract
This article explores the optimisation of trading strategies in Constant Function Market Makers (CFMMs) and centralised exchanges. We develop a model that accounts for the interaction between these two markets, estimating the conditional dependence between variables using the concept of conditional elicitability. Furthermore, we pose an optimal execution problem where the agent hides their orders by controlling the rate at which they trade. We do so without approximating the market dynamics. The resulting dynamic programming equation is not analytically tractable, therefore, we employ the deep Galerkin method to solve it. Finally, we conduct numerical experiments and illustrate that the optimal strategy is not prone to price slippage and outperforms na\"ive strategies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to model interactions between CFMMs and CEXs by estimating conditional dependence via conditional elicitability, to formulate an optimal execution problem in which the agent hides orders by controlling the trading rate without approximating the market dynamics, to solve the resulting DP equation numerically via the deep Galerkin method, and to demonstrate in numerical experiments that the resulting optimal strategy is free of price slippage and outperforms naïve strategies.
Significance. If the central claims hold after proper validation, the work would offer a timely contribution to optimal execution in hybrid DeFi/centralized markets by combining elicitability-based dependence modeling with deep-learning solution of high-dimensional DP problems; the absence of approximation in the dynamics and the reported slippage-free property would be notable strengths if substantiated.
major comments (2)
- [Abstract] The assertion that the approach proceeds 'without approximating the market dynamics' is not reconciled with the reliance on conditional elicitability for estimation; conditional elicitability supplies a consistent scoring rule but still requires finite-sample estimation, parametric assumptions on the joint law, and discretization in the subsequent deep Galerkin solve, so that any deviation of the estimated conditional law from the true law renders the derived policy optimality and slippage-free property valid only under the wrong measure.
- [Abstract] The statement that 'numerical experiments ... illustrate that the optimal strategy is not prone to price slippage and outperforms naïve strategies' provides no error bars, baseline details, or description of how post-training performance was measured, leaving the central empirical support for the no-slippage and outperformance claims unassessable.
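The reporting requested in the second comment can be sketched as a paired comparison across simulation runs: the mean performance difference between the learned policy and a baseline, with its standard error. The numbers below are placeholders chosen to make the arithmetic easy to follow; they are not results from the paper, and the function name is illustrative.

```python
import math


def mean_and_stderr(samples):
    """Sample mean and its standard error (unbiased variance, n - 1)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, math.sqrt(variance / n)


# Placeholder per-run proceeds (optimal policy, naive baseline).
optimal_runs = [101.0, 103.0, 105.0]
naive_runs = [100.0, 101.0, 102.0]

# Pair by run so shared randomness cancels in the difference.
differences = [o - b for o, b in zip(optimal_runs, naive_runs)]
mean_diff, stderr_diff = mean_and_stderr(differences)
```

Pairing by run matters when both strategies are evaluated on the same simulated paths, since the standard error of the difference is then usually much smaller than that of either strategy alone.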
minor comments (1)
- [Abstract] The abstract renders 'naïve' as 'na\"ive', an unprocessed LaTeX accent escape, which should be corrected for readability.
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address each major comment below and will revise the abstract to improve clarity and transparency.
Point-by-point responses
- Referee: [Abstract] The assertion that the approach proceeds 'without approximating the market dynamics' is not reconciled with the reliance on conditional elicitability for estimation; conditional elicitability supplies a consistent scoring rule but still requires finite-sample estimation, parametric assumptions on the joint law, and discretization in the subsequent deep Galerkin solve, so that any deviation of the estimated conditional law from the true law renders the derived policy optimality and slippage-free property valid only under the wrong measure.
  Authors: The phrasing 'without approximating the market dynamics' is meant to convey that the dynamic programming equation is derived directly from the CFMM-CEX interaction model once the conditional dependence is specified via elicitability, without introducing further modeling approximations (e.g., no assumed functional forms for price impact or liquidity beyond the elicitability-based conditional law). We acknowledge that finite-sample estimation, any parametric choices in the joint law, and the numerical discretization inherent to the deep Galerkin solver mean the resulting policy is optimal only with respect to the estimated measure. We will revise the abstract to state explicitly that the DP formulation introduces no additional approximations to the dynamics beyond the data-driven estimation step, and that optimality and the slippage-free property hold under the estimated conditional law.
  Revision: yes
- Referee: [Abstract] The statement that 'numerical experiments ... illustrate that the optimal strategy is not prone to price slippage and outperforms naïve strategies' provides no error bars, baseline details, or description of how post-training performance was measured, leaving the central empirical support for the no-slippage and outperformance claims unassessable.
  Authors: The abstract is intentionally concise, but the full manuscript contains the experimental details, including simulation setup, comparison baselines (constant-rate and other naive policies), performance metrics for slippage and execution cost, and averaging over multiple runs. To address the concern, we will expand the abstract with a brief clause noting that results are obtained from extensive Monte Carlo simulations with reported performance metrics and variability. This will make the empirical claims more assessable while preserving brevity.
  Revision: yes
Circularity Check
No circularity: model formulation and numerical solve remain independent of fitted outputs
full rationale
The derivation begins with a conditional-elicitability model for CFMM-CEX dependence, states an optimal-execution dynamic program whose dynamics are taken as given (not derived from the policy), and applies deep Galerkin as a numerical solver. No equation equates a reported performance quantity to a re-expression of the elicitability scoring rule or to any fitted parameter; the claim of operating 'without approximating the market dynamics' refers to the model statement rather than to the solution method. No self-citation is invoked as a uniqueness theorem or load-bearing premise, and no ansatz is smuggled via prior work. The reported outperformance is therefore an output of the solved control problem rather than a tautological restatement of its inputs.