pith. sign in

arxiv: 2304.02180 · v2 · submitted 2023-04-05 · 💱 q-fin.TR

Optimal Trading in Automated Market Makers with Deep Learning

Pith reviewed 2026-05-24 09:42 UTC · model grok-4.3

classification 💱 q-fin.TR
keywords optimal executionconstant function market makersconditional elicitabilitydeep Galerkin methodprice slippageautomated market makersdeep learningtrading strategies
0
0 comments X

The pith

Traders can hide large orders in constant function market makers by controlling execution speed to eliminate price slippage.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a joint model of constant function market makers and centralized exchanges that uses conditional elicitability to capture exact dependence between the two venues without approximation. It formulates an optimal execution problem in which an agent conceals intent by varying the rate of trade over time. The resulting dynamic programming equation lacks a closed-form solution, so the authors apply the deep Galerkin method to compute the value function and policy numerically. Experiments show the computed policy produces no observable slippage and beats several naive benchmarks.

Core claim

By capturing the conditional dependence between CFMM and CEX variables exactly through conditional elicitability and solving the associated dynamic programming equation with the deep Galerkin method, an optimal trading policy exists that controls the speed of execution to hide orders, thereby removing price slippage while outperforming naive execution rules.

What carries the argument

The dynamic programming equation for the optimal execution problem, solved numerically by the deep Galerkin method after dependence is modeled via conditional elicitability.

If this is right

  • The optimal policy varies the trading rate over time to conceal large positions.
  • The resulting execution exhibits no price slippage under the modeled dynamics.
  • The strategy outperforms constant-rate and other simple execution rules in the numerical tests.
  • The solution requires no approximation of the joint market process.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same elicitability-based dependence modeling could be reused for other pairs of decentralized and centralized venues.
  • Adding inventory penalties or transaction fees to the objective would produce a more realistic policy.
  • The deep Galerkin solver could be replaced by other neural PDE methods to test sensitivity of the no-slippage result.
  • Live deployment would require frequent re-estimation of the conditional elicitability parameters as market regimes shift.

Load-bearing premise

Market dynamics between the CFMM and CEX can be captured exactly by conditional elicitability without any approximation.

What would settle it

Apply the computed policy to live or historical order-book data from both venues and check whether measurable slippage appears or whether performance falls to the level of naive strategies.

Figures

Figures reproduced from arXiv: 2304.02180 by Max O. Souza, Sebastian Jaimungal, Yuri F. Saporito, Yuri Thamsten.

Figure 2.1
Figure 2.1. Figure 2.1: Histograms of the log-sizes of swaps occurring in the pool. [PITH_FULL_IMAGE:figures/full_fig_p002_2_1.png] view at source ↗
Figure 2.2
Figure 2.2. Figure 2.2: Inter-arrival times between buy and sell swap events. [PITH_FULL_IMAGE:figures/full_fig_p003_2_2.png] view at source ↗
Figure 2.3
Figure 2.3. Figure 2.3: Conditional expected return in each venue conditioned on a certain price differential value. [PITH_FULL_IMAGE:figures/full_fig_p004_2_3.png] view at source ↗
Figure 2.4
Figure 2.4. Figure 2.4: Conditional probability that an AMM swap is a buy conditional on the spread. P(buy | Ref trade, ∆ = P − S) [PITH_FULL_IMAGE:figures/full_fig_p004_2_4.png] view at source ↗
Figure 2
Figure 2. Figure 2: shows the analogous estimates for the centralised exchange. In particular, it shows the [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3.1
Figure 3.1. Figure 3.1: Sample paths of cumulative returns of the uncontrolled market dynamics over 15 minutes [PITH_FULL_IMAGE:figures/full_fig_p006_3_1.png] view at source ↗
Figure 4
Figure 4. Figure 4: portrays a few sample paths of the [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
Figure 4.2
Figure 4.2. Figure 4.2: Histogram of terminal amount of token-y received. The amount received if all were swapped at time 0 is shown by the dotted line. In [PITH_FULL_IMAGE:figures/full_fig_p010_4_2.png] view at source ↗
Figure 4.3
Figure 4.3. Figure 4.3: Optimal intensity of trades computed using the DGM approach varying in the vertical axis [PITH_FULL_IMAGE:figures/full_fig_p011_4_3.png] view at source ↗
read the original abstract

This article explores the optimisation of trading strategies in Constant Function Market Makers (CFMMs) and centralised exchanges. We develop a model that accounts for the interaction between these two markets, estimating the conditional dependence between variables using the concept of conditional elicitability. Furthermore, we pose an optimal execution problem where the agent hides their orders by controlling the rate at which they trade. We do so without approximating the market dynamics. The resulting dynamic programming equation is not analytically tractable, therefore, we employ the deep Galerkin method to solve it. Finally, we conduct numerical experiments and illustrate that the optimal strategy is not prone to price slippage and outperforms na\"ive strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims to model interactions between CFMMs and CEXs by estimating conditional dependence via conditional elicitability, formulate an optimal execution problem in which the agent hides orders by controlling trade rate, solve the resulting DP equation exactly (without approximating market dynamics) via the deep Galerkin method, and demonstrate in numerical experiments that the resulting optimal strategy is free of price slippage and outperforms naïve strategies.

Significance. If the central claims hold after proper validation, the work would offer a timely contribution to optimal execution in hybrid DeFi/centralized markets by combining elicitability-based dependence modeling with deep-learning solution of high-dimensional DP problems; the absence of approximation in the dynamics and the reported slippage-free property would be notable strengths if substantiated.

major comments (2)
  1. [Abstract] Abstract: the assertion that the approach proceeds 'without approximating the market dynamics' is not reconciled with the reliance on conditional elicitability for estimation; conditional elicitability supplies a consistent scoring rule but still requires finite-sample estimation, parametric assumptions on the joint law, and discretization in the subsequent deep Galerkin solve, so that any deviation of the estimated conditional law from the true law renders the derived policy optimality and slippage-free property valid only under the wrong measure.
  2. [Abstract] Abstract: the statement that 'numerical experiments ... illustrate that the optimal strategy is not prone to price slippage and outperforms naïve strategies' provides no error bars, baseline details, or description of how post-training performance was measured, leaving the central empirical support for the no-slippage and outperformance claims unassessable.
minor comments (1)
  1. [Abstract] The abstract contains the typographical rendering 'naïve' (with escaped quote) that should be corrected for readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below and will revise the abstract to improve clarity and transparency.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the assertion that the approach proceeds 'without approximating the market dynamics' is not reconciled with the reliance on conditional elicitability for estimation; conditional elicitability supplies a consistent scoring rule but still requires finite-sample estimation, parametric assumptions on the joint law, and discretization in the subsequent deep Galerkin solve, so that any deviation of the estimated conditional law from the true law renders the derived policy optimality and slippage-free property valid only under the wrong measure.

    Authors: The phrasing 'without approximating the market dynamics' is meant to convey that the dynamic programming equation is derived directly from the CFMM-CEX interaction model once the conditional dependence is specified via elicitability, without introducing further modeling approximations (e.g., no assumed functional forms for price impact or liquidity beyond the elicitability-based conditional law). We acknowledge that finite-sample estimation, any parametric choices in the joint law, and the numerical discretization inherent to the deep Galerkin solver mean the resulting policy is optimal only with respect to the estimated measure. We will revise the abstract to state explicitly that the DP formulation introduces no additional approximations to the dynamics beyond the data-driven estimation step, and that optimality and the slippage-free property hold under the estimated conditional law. revision: yes

  2. Referee: [Abstract] Abstract: the statement that 'numerical experiments ... illustrate that the optimal strategy is not prone to price slippage and outperforms naïve strategies' provides no error bars, baseline details, or description of how post-training performance was measured, leaving the central empirical support for the no-slippage and outperformance claims unassessable.

    Authors: The abstract is intentionally concise, but the full manuscript contains the experimental details, including simulation setup, comparison baselines (constant-rate and other naive policies), performance metrics for slippage and execution cost, and averaging over multiple runs. To address the concern, we will expand the abstract with a brief clause noting that results are obtained from extensive Monte Carlo simulations with reported performance metrics and variability. This will make the empirical claims more assessable while preserving brevity. revision: yes

Circularity Check

0 steps flagged

No circularity: model formulation and numerical solve remain independent of fitted outputs

full rationale

The derivation begins with a conditional-elicitability model for CFMM-CEX dependence, states an optimal-execution dynamic program whose dynamics are taken as given (not derived from the policy), and applies deep Galerkin as a numerical solver. No equation equates a reported performance quantity to a re-expression of the elicitability scoring rule or to any fitted parameter; the claim of operating 'without approximating the market dynamics' refers to the model statement rather than to the solution method. No self-citation is invoked as a uniqueness theorem or load-bearing premise, and no ansatz is smuggled via prior work. The reported outperformance is therefore an output of the solved control problem rather than a tautological restatement of its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; ledger left empty pending full text.

pith-pipeline@v0.9.0 · 5642 in / 1008 out tokens · 19171 ms · 2026-05-24T09:42:19.636983+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 1 internal anchor

  1. [1]

    Adams, N

    H. Adams, N. Zinsmeister, and D. Robinson , Uniswap v2 core, 2020 , URL: https://uniswap. org/whitepaper. pdf, (2020)

  2. [2]

    Adams, N

    H. Adams, N. Zinsmeister, M. Salem, R. Keefer, and D. Robinson, Uniswap v3 core, Tech. rep., Uniswap, Tech. Rep., (2021)

  3. [3]

    Al-Aradi, A

    A. Al-Aradi, A. Correia, G. Jardim, D. de Freitas Naiff, and Y. Saporito , Extensions of the deep Galerkin method , Applied Mathematics and Computation, 430 (2022), p. 127287

  4. [4]

    Solving Nonlinear and High-Dimensional Partial Differential Equations via Deep Learning

    A. Al-Aradi, A. Correia, D. Naiff, G. Jardim, and Y. Saporito , Solving nonlinear and high-dimensional partial differential equations via deep learning , Report for the Financial Mathe- matics Team Challenge FTMC Brazil, 2018, available at arXiv:1811.08782, (2018)

  5. [5]

    Angeris, A

    G. Angeris, A. Agrawal, A. Evans, T. Chitra, and S. Boyd , Constant function market makers: Multi-asset trades via convex optimization , in Handbook on Blockchain, Springer, 2022, pp. 415–444

  6. [6]

    Angeris, A

    G. Angeris, A. Evans, T. Chitra, and S. Boyd, Optimal routing for constant function market makers, in Proceedings of the 23rd ACM Conference on Economics and Computation, 2022, pp. 115– 128

  7. [7]

    Bergault, L

    P. Bergault, L. Bertucci, D. Bouba, and O. Gu ´eant, Automated market makers: Mean- variance analysis of LPs payoffs and design of pricing functions , arXiv preprint arXiv:2212.00336, (2022)

  8. [8]

    Cartea, F

    ´A. Cartea, F. Drissi, and M. Monga , Decentralised finance and automated market making: Execution and speculation, Available at SSRN, (2022)

  9. [9]

    Cartea, F

    ´A. Cartea, F. Drissi, and M. Monga , Decentralised finance and automated market making: Predictable loss and optimal liquidity provision , Available at SSRN 4273989, (2022). 11

  10. [10]

    Cartea, F

    ´A. Cartea, F. Drissi, and M. Monga, Execution and statistical arbitrage with signals in multiple automated market makers , Available at SSRN, (2023)

  11. [11]

    Coache and S

    A. Coache and S. Jaimungal, Reinforcement learning with dynamic convex risk measures, arXiv preprint arXiv:2112.13414, (2021)

  12. [12]

    Coache, S

    A. Coache, S. Jaimungal, and ´A. Cartea, Conditionally elicitable dynamic risk measures for deep reinforcement learning, arXiv preprint arXiv:2206.14666, (2022)

  13. [13]

    Z. Fan, F. J. Marmolejo-Coss ´ıo, B. Altschuler, H. Sun, X. Wang, and D. Parkes , Differential liquidity provision in uniswap v3 and implications for contract design , in Proceedings of the Third ACM International Conference on AI in Finance, 2022, pp. 9–17

  14. [14]

    Fissler and J

    T. Fissler and J. F. Ziegel , Higher order elicitability and osband’s principle , (2016)

  15. [15]

    Neuder, R

    M. Neuder, R. Rao, D. J. Moroz, and D. C. Parkes , Strategic liquidity provision in uniswap v3, arXiv preprint arXiv:2106.12033, (2021)

  16. [16]

    B. K. Øksendal and A. Sulem , Applied stochastic control of jump diffusions , vol. 498, Springer, 2007

  17. [17]

    Sirignano and K

    J. Sirignano and K. Spiliopoulos , DGM: A deep learning algorithm for solving partial differ- ential equations, Journal of computational physics, 375 (2018), pp. 1339–1364. A Additional Empirical Results Figure A.1: Repeated estimates of the expected value of returns conditioned on spread. 12 Figure A.2: Assessing the uncertainty in the estimation of Figur...