Optimal Trading in Automated Market Makers with Deep Learning

Max O. Souza; Sebastian Jaimungal; Yuri F. Saporito; Yuri Thamsten

arxiv: 2304.02180 · v2 · submitted 2023-04-05 · 💱 q-fin.TR

Optimal Trading in Automated Market Makers with Deep Learning

Sebastian Jaimungal , Yuri F. Saporito , Max O. Souza , Yuri Thamsten This is my paper

Pith reviewed 2026-05-24 09:42 UTC · model grok-4.3

classification 💱 q-fin.TR

keywords optimal executionconstant function market makersconditional elicitabilitydeep Galerkin methodprice slippageautomated market makersdeep learningtrading strategies

0 comments

The pith

Traders can hide large orders in constant function market makers by controlling execution speed to eliminate price slippage.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a joint model of constant function market makers and centralized exchanges that uses conditional elicitability to capture exact dependence between the two venues without approximation. It formulates an optimal execution problem in which an agent conceals intent by varying the rate of trade over time. The resulting dynamic programming equation lacks a closed-form solution, so the authors apply the deep Galerkin method to compute the value function and policy numerically. Experiments show the computed policy produces no observable slippage and beats several naive benchmarks.

Core claim

By capturing the conditional dependence between CFMM and CEX variables exactly through conditional elicitability and solving the associated dynamic programming equation with the deep Galerkin method, an optimal trading policy exists that controls the speed of execution to hide orders, thereby removing price slippage while outperforming naive execution rules.

What carries the argument

The dynamic programming equation for the optimal execution problem, solved numerically by the deep Galerkin method after dependence is modeled via conditional elicitability.

If this is right

The optimal policy varies the trading rate over time to conceal large positions.
The resulting execution exhibits no price slippage under the modeled dynamics.
The strategy outperforms constant-rate and other simple execution rules in the numerical tests.
The solution requires no approximation of the joint market process.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same elicitability-based dependence modeling could be reused for other pairs of decentralized and centralized venues.
Adding inventory penalties or transaction fees to the objective would produce a more realistic policy.
The deep Galerkin solver could be replaced by other neural PDE methods to test sensitivity of the no-slippage result.
Live deployment would require frequent re-estimation of the conditional elicitability parameters as market regimes shift.

Load-bearing premise

Market dynamics between the CFMM and CEX can be captured exactly by conditional elicitability without any approximation.

What would settle it

Apply the computed policy to live or historical order-book data from both venues and check whether measurable slippage appears or whether performance falls to the level of naive strategies.

Figures

Figures reproduced from arXiv: 2304.02180 by Max O. Souza, Sebastian Jaimungal, Yuri F. Saporito, Yuri Thamsten.

**Figure 2.1.** Figure 2.1: Histograms of the log-sizes of swaps occurring in the pool. [PITH_FULL_IMAGE:figures/full_fig_p002_2_1.png] view at source ↗

**Figure 2.2.** Figure 2.2: Inter-arrival times between buy and sell swap events. [PITH_FULL_IMAGE:figures/full_fig_p003_2_2.png] view at source ↗

**Figure 2.3.** Figure 2.3: Conditional expected return in each venue conditioned on a certain price differential value. [PITH_FULL_IMAGE:figures/full_fig_p004_2_3.png] view at source ↗

**Figure 2.4.** Figure 2.4: Conditional probability that an AMM swap is a buy conditional on the spread. P(buy | Ref trade, ∆ = P − S) [PITH_FULL_IMAGE:figures/full_fig_p004_2_4.png] view at source ↗

**Figure 2.** Figure 2: shows the analogous estimates for the centralised exchange. In particular, it shows the [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.1.** Figure 3.1: Sample paths of cumulative returns of the uncontrolled market dynamics over 15 minutes [PITH_FULL_IMAGE:figures/full_fig_p006_3_1.png] view at source ↗

**Figure 4.** Figure 4: portrays a few sample paths of the [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

**Figure 4.2.** Figure 4.2: Histogram of terminal amount of token-y received. The amount received if all were swapped at time 0 is shown by the dotted line. In [PITH_FULL_IMAGE:figures/full_fig_p010_4_2.png] view at source ↗

**Figure 4.3.** Figure 4.3: Optimal intensity of trades computed using the DGM approach varying in the vertical axis [PITH_FULL_IMAGE:figures/full_fig_p011_4_3.png] view at source ↗

read the original abstract

This article explores the optimisation of trading strategies in Constant Function Market Makers (CFMMs) and centralised exchanges. We develop a model that accounts for the interaction between these two markets, estimating the conditional dependence between variables using the concept of conditional elicitability. Furthermore, we pose an optimal execution problem where the agent hides their orders by controlling the rate at which they trade. We do so without approximating the market dynamics. The resulting dynamic programming equation is not analytically tractable, therefore, we employ the deep Galerkin method to solve it. Finally, we conduct numerical experiments and illustrate that the optimal strategy is not prone to price slippage and outperforms na\"ive strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper applies conditional elicitability to capture CFMM-CEX dependence then solves the resulting optimal execution problem with deep Galerkin, but the 'exact dynamics without approximation' framing rests on an estimated conditional law rather than a known one.

read the letter

The core contribution is a model that links trading in constant-function market makers to centralized exchanges by using conditional elicitability to handle the joint dynamics, then poses an optimal execution problem where the trader controls order rate to stay hidden. They solve the resulting dynamic program with the deep Galerkin method and run numerical tests showing the policy avoids slippage better than simple benchmarks. That specific coupling of elicitability with a non-approximated execution problem in this market pair looks new on the surface and addresses a practical DeFi execution question that sits between crypto and traditional quant finance. The approach is technically coherent on its own terms and the choice of deep Galerkin makes sense once the HJB equation lacks a closed form. The main soft spot is the validation: the abstract claims the experiments back the no-slippage result, yet gives no error bars, baseline descriptions, or out-of-sample measurement details, so the performance edge is hard to judge from what is shown. A second issue is the language around 'without approximating the market dynamics.' Conditional elicitability supplies a consistent scoring rule for estimating conditional functionals from data; it does not deliver an exact known law. Any policy optimality and slippage claim therefore holds only with respect to the fitted conditional, and the paper would be stronger if it clarified how sensitive the results are to estimation error or to the parametric assumptions used in the elicitability step. Minor points include the usual need for more implementation specifics on the neural-network training and the discretization used inside the Galerkin solver. This is a paper for readers already working on optimal execution or market-making in automated venues; a quant-finance reading group might find the method combination worth discussing if the experiments hold up. It is coherent enough and the problem is concrete enough that a serious editor should send it to referees rather than desk-reject, even though the current write-up leaves the empirical claims under-supported.

Referee Report

2 major / 1 minor

Summary. The paper claims to model interactions between CFMMs and CEXs by estimating conditional dependence via conditional elicitability, formulate an optimal execution problem in which the agent hides orders by controlling trade rate, solve the resulting DP equation exactly (without approximating market dynamics) via the deep Galerkin method, and demonstrate in numerical experiments that the resulting optimal strategy is free of price slippage and outperforms naïve strategies.

Significance. If the central claims hold after proper validation, the work would offer a timely contribution to optimal execution in hybrid DeFi/centralized markets by combining elicitability-based dependence modeling with deep-learning solution of high-dimensional DP problems; the absence of approximation in the dynamics and the reported slippage-free property would be notable strengths if substantiated.

major comments (2)

[Abstract] Abstract: the assertion that the approach proceeds 'without approximating the market dynamics' is not reconciled with the reliance on conditional elicitability for estimation; conditional elicitability supplies a consistent scoring rule but still requires finite-sample estimation, parametric assumptions on the joint law, and discretization in the subsequent deep Galerkin solve, so that any deviation of the estimated conditional law from the true law renders the derived policy optimality and slippage-free property valid only under the wrong measure.
[Abstract] Abstract: the statement that 'numerical experiments ... illustrate that the optimal strategy is not prone to price slippage and outperforms naïve strategies' provides no error bars, baseline details, or description of how post-training performance was measured, leaving the central empirical support for the no-slippage and outperformance claims unassessable.

minor comments (1)

[Abstract] The abstract contains the typographical rendering 'naïve' (with escaped quote) that should be corrected for readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below and will revise the abstract to improve clarity and transparency.

read point-by-point responses

Referee: [Abstract] Abstract: the assertion that the approach proceeds 'without approximating the market dynamics' is not reconciled with the reliance on conditional elicitability for estimation; conditional elicitability supplies a consistent scoring rule but still requires finite-sample estimation, parametric assumptions on the joint law, and discretization in the subsequent deep Galerkin solve, so that any deviation of the estimated conditional law from the true law renders the derived policy optimality and slippage-free property valid only under the wrong measure.

Authors: The phrasing 'without approximating the market dynamics' is meant to convey that the dynamic programming equation is derived directly from the CFMM-CEX interaction model once the conditional dependence is specified via elicitability, without introducing further modeling approximations (e.g., no assumed functional forms for price impact or liquidity beyond the elicitability-based conditional law). We acknowledge that finite-sample estimation, any parametric choices in the joint law, and the numerical discretization inherent to the deep Galerkin solver mean the resulting policy is optimal only with respect to the estimated measure. We will revise the abstract to state explicitly that the DP formulation introduces no additional approximations to the dynamics beyond the data-driven estimation step, and that optimality and the slippage-free property hold under the estimated conditional law. revision: yes
Referee: [Abstract] Abstract: the statement that 'numerical experiments ... illustrate that the optimal strategy is not prone to price slippage and outperforms naïve strategies' provides no error bars, baseline details, or description of how post-training performance was measured, leaving the central empirical support for the no-slippage and outperformance claims unassessable.

Authors: The abstract is intentionally concise, but the full manuscript contains the experimental details, including simulation setup, comparison baselines (constant-rate and other naive policies), performance metrics for slippage and execution cost, and averaging over multiple runs. To address the concern, we will expand the abstract with a brief clause noting that results are obtained from extensive Monte Carlo simulations with reported performance metrics and variability. This will make the empirical claims more assessable while preserving brevity. revision: yes

Circularity Check

0 steps flagged

No circularity: model formulation and numerical solve remain independent of fitted outputs

full rationale

The derivation begins with a conditional-elicitability model for CFMM-CEX dependence, states an optimal-execution dynamic program whose dynamics are taken as given (not derived from the policy), and applies deep Galerkin as a numerical solver. No equation equates a reported performance quantity to a re-expression of the elicitability scoring rule or to any fitted parameter; the claim of operating 'without approximating the market dynamics' refers to the model statement rather than to the solution method. No self-citation is invoked as a uniqueness theorem or load-bearing premise, and no ansatz is smuggled via prior work. The reported outperformance is therefore an output of the solved control problem rather than a tautological restatement of its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; ledger left empty pending full text.

pith-pipeline@v0.9.0 · 5642 in / 1008 out tokens · 19171 ms · 2026-05-24T09:42:19.636983+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 1 internal anchor

[1]

Adams, N

H. Adams, N. Zinsmeister, and D. Robinson , Uniswap v2 core, 2020 , URL: https://uniswap. org/whitepaper. pdf, (2020)

work page 2020
[2]

Adams, N

H. Adams, N. Zinsmeister, M. Salem, R. Keefer, and D. Robinson, Uniswap v3 core, Tech. rep., Uniswap, Tech. Rep., (2021)

work page 2021
[3]

Al-Aradi, A

A. Al-Aradi, A. Correia, G. Jardim, D. de Freitas Naiff, and Y. Saporito , Extensions of the deep Galerkin method , Applied Mathematics and Computation, 430 (2022), p. 127287

work page 2022
[4]

Solving Nonlinear and High-Dimensional Partial Differential Equations via Deep Learning

A. Al-Aradi, A. Correia, D. Naiff, G. Jardim, and Y. Saporito , Solving nonlinear and high-dimensional partial diﬀerential equations via deep learning , Report for the Financial Mathe- matics Team Challenge FTMC Brazil, 2018, available at arXiv:1811.08782, (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018
[5]

Angeris, A

G. Angeris, A. Agrawal, A. Evans, T. Chitra, and S. Boyd , Constant function market makers: Multi-asset trades via convex optimization , in Handbook on Blockchain, Springer, 2022, pp. 415–444

work page 2022
[6]

Angeris, A

G. Angeris, A. Evans, T. Chitra, and S. Boyd, Optimal routing for constant function market makers, in Proceedings of the 23rd ACM Conference on Economics and Computation, 2022, pp. 115– 128

work page 2022
[7]

Bergault, L

P. Bergault, L. Bertucci, D. Bouba, and O. Gu ´eant, Automated market makers: Mean- variance analysis of LPs payoﬀs and design of pricing functions , arXiv preprint arXiv:2212.00336, (2022)

work page arXiv 2022
[8]

Cartea, F

´A. Cartea, F. Drissi, and M. Monga , Decentralised ﬁnance and automated market making: Execution and speculation, Available at SSRN, (2022)

work page 2022
[9]

Cartea, F

´A. Cartea, F. Drissi, and M. Monga , Decentralised ﬁnance and automated market making: Predictable loss and optimal liquidity provision , Available at SSRN 4273989, (2022). 11

work page 2022
[10]

Cartea, F

´A. Cartea, F. Drissi, and M. Monga, Execution and statistical arbitrage with signals in multiple automated market makers , Available at SSRN, (2023)

work page 2023
[11]

Coache and S

A. Coache and S. Jaimungal, Reinforcement learning with dynamic convex risk measures, arXiv preprint arXiv:2112.13414, (2021)

work page arXiv 2021
[12]

Coache, S

A. Coache, S. Jaimungal, and ´A. Cartea, Conditionally elicitable dynamic risk measures for deep reinforcement learning, arXiv preprint arXiv:2206.14666, (2022)

work page arXiv 2022
[13]

Z. Fan, F. J. Marmolejo-Coss ´ıo, B. Altschuler, H. Sun, X. Wang, and D. Parkes , Diﬀerential liquidity provision in uniswap v3 and implications for contract design , in Proceedings of the Third ACM International Conference on AI in Finance, 2022, pp. 9–17

work page 2022
[14]

Fissler and J

T. Fissler and J. F. Ziegel , Higher order elicitability and osband’s principle , (2016)

work page 2016
[15]

Neuder, R

M. Neuder, R. Rao, D. J. Moroz, and D. C. Parkes , Strategic liquidity provision in uniswap v3, arXiv preprint arXiv:2106.12033, (2021)

work page arXiv 2021
[16]

B. K. Øksendal and A. Sulem , Applied stochastic control of jump diﬀusions , vol. 498, Springer, 2007

work page 2007
[17]

Sirignano and K

J. Sirignano and K. Spiliopoulos , DGM: A deep learning algorithm for solving partial diﬀer- ential equations, Journal of computational physics, 375 (2018), pp. 1339–1364. A Additional Empirical Results Figure A.1: Repeated estimates of the expected value of returns conditioned on spread. 12 Figure A.2: Assessing the uncertainty in the estimation of Figur...

work page 2018

[1] [1]

Adams, N

H. Adams, N. Zinsmeister, and D. Robinson , Uniswap v2 core, 2020 , URL: https://uniswap. org/whitepaper. pdf, (2020)

work page 2020

[2] [2]

Adams, N

H. Adams, N. Zinsmeister, M. Salem, R. Keefer, and D. Robinson, Uniswap v3 core, Tech. rep., Uniswap, Tech. Rep., (2021)

work page 2021

[3] [3]

Al-Aradi, A

A. Al-Aradi, A. Correia, G. Jardim, D. de Freitas Naiff, and Y. Saporito , Extensions of the deep Galerkin method , Applied Mathematics and Computation, 430 (2022), p. 127287

work page 2022

[4] [4]

Solving Nonlinear and High-Dimensional Partial Differential Equations via Deep Learning

A. Al-Aradi, A. Correia, D. Naiff, G. Jardim, and Y. Saporito , Solving nonlinear and high-dimensional partial diﬀerential equations via deep learning , Report for the Financial Mathe- matics Team Challenge FTMC Brazil, 2018, available at arXiv:1811.08782, (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018

[5] [5]

Angeris, A

G. Angeris, A. Agrawal, A. Evans, T. Chitra, and S. Boyd , Constant function market makers: Multi-asset trades via convex optimization , in Handbook on Blockchain, Springer, 2022, pp. 415–444

work page 2022

[6] [6]

Angeris, A

G. Angeris, A. Evans, T. Chitra, and S. Boyd, Optimal routing for constant function market makers, in Proceedings of the 23rd ACM Conference on Economics and Computation, 2022, pp. 115– 128

work page 2022

[7] [7]

Bergault, L

P. Bergault, L. Bertucci, D. Bouba, and O. Gu ´eant, Automated market makers: Mean- variance analysis of LPs payoﬀs and design of pricing functions , arXiv preprint arXiv:2212.00336, (2022)

work page arXiv 2022

[8] [8]

Cartea, F

´A. Cartea, F. Drissi, and M. Monga , Decentralised ﬁnance and automated market making: Execution and speculation, Available at SSRN, (2022)

work page 2022

[9] [9]

Cartea, F

´A. Cartea, F. Drissi, and M. Monga , Decentralised ﬁnance and automated market making: Predictable loss and optimal liquidity provision , Available at SSRN 4273989, (2022). 11

work page 2022

[10] [10]

Cartea, F

´A. Cartea, F. Drissi, and M. Monga, Execution and statistical arbitrage with signals in multiple automated market makers , Available at SSRN, (2023)

work page 2023

[11] [11]

Coache and S

A. Coache and S. Jaimungal, Reinforcement learning with dynamic convex risk measures, arXiv preprint arXiv:2112.13414, (2021)

work page arXiv 2021

[12] [12]

Coache, S

A. Coache, S. Jaimungal, and ´A. Cartea, Conditionally elicitable dynamic risk measures for deep reinforcement learning, arXiv preprint arXiv:2206.14666, (2022)

work page arXiv 2022

[13] [13]

Z. Fan, F. J. Marmolejo-Coss ´ıo, B. Altschuler, H. Sun, X. Wang, and D. Parkes , Diﬀerential liquidity provision in uniswap v3 and implications for contract design , in Proceedings of the Third ACM International Conference on AI in Finance, 2022, pp. 9–17

work page 2022

[14] [14]

Fissler and J

T. Fissler and J. F. Ziegel , Higher order elicitability and osband’s principle , (2016)

work page 2016

[15] [15]

Neuder, R

M. Neuder, R. Rao, D. J. Moroz, and D. C. Parkes , Strategic liquidity provision in uniswap v3, arXiv preprint arXiv:2106.12033, (2021)

work page arXiv 2021

[16] [16]

B. K. Øksendal and A. Sulem , Applied stochastic control of jump diﬀusions , vol. 498, Springer, 2007

work page 2007

[17] [17]

Sirignano and K

J. Sirignano and K. Spiliopoulos , DGM: A deep learning algorithm for solving partial diﬀer- ential equations, Journal of computational physics, 375 (2018), pp. 1339–1364. A Additional Empirical Results Figure A.1: Repeated estimates of the expected value of returns conditioned on spread. 12 Figure A.2: Assessing the uncertainty in the estimation of Figur...

work page 2018