How to Evaluate Trading Strategies: Single Agent Market Replay or Multiple Agent Interactive Simulation?

David Byrd; Joshua Lockhart; Mahmoud Mahfouz; Maria Hybinette; Tucker Hybinette Balch

arxiv: 1906.12010 · v1 · pith:XBF3FTHOnew · submitted 2019-06-28 · 💱 q-fin.TR · cs.GT


Pith reviewed 2026-05-25 13:39 UTC · model grok-4.3


keywords trading strategies · market replay · interactive agent-based simulation · market impact · multi-agent simulator · strategy evaluation · background agents


The pith

Multi-agent simulators support both non-adaptive market replay and responsive interactive simulation to evaluate trading strategies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a multi-agent simulator can run two distinct evaluation methods for trading strategies. Market Replay replays historical data without the market adapting to the tested strategy, which can conceal certain flaws. Interactive Agent-Based Simulation uses background agents that react to prices and conditions, making the market responsive and able to expose different weaknesses. The distinction is illustrated by measuring market impact for orders of different sizes, where each method can highlight issues the other misses.
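The market-impact measurement mentioned above can be illustrated with a minimal sketch. This is not the paper's simulator: it is a toy ask-side book that a market buy order walks through, reporting the immediate shift in mid price for orders sized as multiples of the best ask size (the multiples echo some of those named in the Figure 4 caption). The book contents, the fixed best bid, and the helper names `consume_asks` and `mid_price_impact` are all hypothetical.

```python
from typing import List, Tuple

Level = Tuple[float, int]  # (price, shares resting at that price)


def consume_asks(asks: List[Level], qty: int) -> List[Level]:
    """Return the ask side after a market buy of `qty` shares walks the book."""
    remaining = []
    for price, size in asks:  # asks are sorted from best (lowest) price upward
        take = min(size, qty)
        qty -= take
        if size - take > 0:
            remaining.append((price, size - take))
    return remaining


def mid_price_impact(best_bid: float, asks: List[Level], qty: int) -> float:
    """Change in mid price right after the market buy, before anyone reacts."""
    mid_before = (best_bid + asks[0][0]) / 2
    asks_after = consume_asks(asks, qty)
    if not asks_after:
        raise ValueError("order exhausts the visible book")
    mid_after = (best_bid + asks_after[0][0]) / 2
    return mid_after - mid_before


# Hypothetical book (not taken from the paper): one bid price, three ask levels.
BEST_BID = 99.99
ASKS = [(100.01, 100), (100.02, 150), (100.03, 400)]

if __name__ == "__main__":
    best_ask_size = ASKS[0][1]
    for multiple in (0.5, 2.0, 3.0):
        qty = int(multiple * best_ask_size)
        impact = mid_price_impact(BEST_BID, ASKS, qty)
        print(f"{multiple}x best ask size -> mid moves {impact:+.4f}")
```

The sketch captures only the instantaneous mechanical impact. What happens afterwards, and whether other participants react, is exactly where the two evaluation methods differ.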

Core claim

A multi-agent simulator supports Market Replay, in which the simulated market does not substantially adapt to or respond to the presence of the experimental strategy, and Interactive Agent-Based Simulation, in which a population of background trading agents attend to market conditions and current price, making the overall market responsive to the experimental strategy.

What carries the argument

Multi-agent simulator enabling both Market Replay and Interactive Agent-Based Simulation (IABS) with responsive background trading agents.

Load-bearing premise

Background agents in the interactive simulation attend to market conditions and current price as part of their strategy.

What would settle it

An experiment that runs the same strategy in both methods and then in live trading, checking whether IABS outcomes align more closely with observed real-market effects than replay outcomes do.
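One way such a check might be scored is sketched below, assuming each method yields a list of impact measurements (for example, one per order size) that can be paired with live observations. The mean-absolute-gap metric and the names `mean_abs_gap` and `closer_method` are illustrative assumptions, not something the paper specifies.

```python
from statistics import mean
from typing import Dict, Sequence


def mean_abs_gap(simulated: Sequence[float], live: Sequence[float]) -> float:
    """Mean absolute difference between paired simulated and live measurements."""
    if len(simulated) != len(live):
        raise ValueError("simulated and live series must have the same length")
    return mean(abs(s - l) for s, l in zip(simulated, live))


def closer_method(results: Dict[str, Sequence[float]], live: Sequence[float]) -> str:
    """Name of the evaluation method whose measurements sit closest to live data."""
    return min(results, key=lambda name: mean_abs_gap(results[name], live))
```

Calling `closer_method` with a dictionary keyed by method name (say, `"replay"` and `"iabs"`) and a matching list of live measurements returns the name of the method with the smaller gap. Any real comparison would also need a decision on which impact horizon and order sizes to pair.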

Figures

Figures reproduced from arXiv: 1906.12010 by David Byrd, Joshua Lockhart, Mahmoud Mahfouz, Maria Hybinette, Tucker Hybinette Balch.

**Figure 1.** An example limit order book.

**Figure 2.** Price-level volume plot. Black line represents the mid price. Each point is the price at different price levels, with the colour scheme indicating the size (log scale) present at each level.

**Figure 3.** Observed impact on the mid price by the experimental agent placing market orders at twice the best bid or ask size.

**Figure 4.** Observed impact on the mid price by the experimental agent placing market orders at 50%, 200%, 300% and 1000% of the best bid or ask size.

**Figure 5.** Observed impact on the mid price by the experimental agent placing market orders with greed = 1.0.

**Figure 6.** Observed impact on the mid price by the experimental agent placing market orders with varying greed.

Original abstract

We show how a multi-agent simulator can support two important but distinct methods for assessing a trading strategy: Market Replay and Interactive Agent-Based Simulation (IABS). Our solution is important because each method offers strengths and weaknesses that expose or conceal flaws in the subject strategy. A key weakness of Market Replay is that the simulated market does not substantially adapt to or respond to the presence of the experimental strategy. IABS methods provide an artificial market for the experimental strategy using a population of background trading agents. Because the background agents attend to market conditions and current price as part of their strategy, the overall market is responsive to the presence of the experimental strategy. Even so, IABS methods have their own weaknesses, primarily that it is unclear if the market environment they provide is realistic. We describe our approach in detail, and illustrate its use in an example application: The evaluation of market impact for various size orders.
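The responsiveness distinction in the abstract can be shown with a toy sketch under strong simplifying assumptions. A replay source emits a fixed recorded order stream regardless of price, while a background agent decides each order from the current price. The toy has no order book, and each unit order simply moves the price by one tick, so only the order streams are compared and the price paths carry no meaning. The names, the made-up `HISTORY` flow, and the threshold rule for the background agent are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Optional

OrderSource = Callable[[int], int]  # maps current price (in ticks) to a signed unit order


def replay_source(history: List[int]) -> OrderSource:
    """Market Replay: emit recorded orders in sequence, ignoring the current price."""
    recorded = iter(history)
    return lambda price: next(recorded)


def value_agent_source(value_estimate: int) -> OrderSource:
    """IABS-style background agent: buy (+1) below its value estimate, else sell (-1)."""
    return lambda price: 1 if price < value_estimate else -1


def simulate(source: OrderSource, steps: int, start_price: int,
             shock_step: Optional[int] = None, shock: int = 0) -> List[int]:
    """Return the background order stream; each unit order moves the price one tick."""
    price, orders = start_price, []
    for t in range(steps):
        if t == shock_step:
            price += shock  # stand-in for the experimental strategy's market order
        order = source(price)
        orders.append(order)
        price += order
    return orders


HISTORY = [1, -1, 1, 1, -1, -1, 1, -1]  # made-up "recorded" order flow
START = 10_000                          # price in ticks (hypothetical)

replay_base = simulate(replay_source(HISTORY), len(HISTORY), START)
replay_test = simulate(replay_source(HISTORY), len(HISTORY), START, shock_step=2, shock=5)
iabs_base = simulate(value_agent_source(START), len(HISTORY), START)
iabs_test = simulate(value_agent_source(START), len(HISTORY), START, shock_step=2, shock=5)

print("replay flow changed by the strategy:", replay_base != replay_test)  # False
print("IABS flow changed by the strategy:", iabs_base != iabs_test)        # True
```

In the replay case the background flow is identical with and without the experimental order. In the agent case the flow shifts toward selling once the price has been pushed above the agent's value estimate. Whether that reaction resembles a real market is the realism question the abstract leaves open.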

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Replay misses how markets adapt to your strategy while IABS lets background agents react, but the paper stays conceptual with no numbers or code to show the practical difference.


The main thing to know is that this paper draws a clean line between market replay, where the simulated tape ignores your trades, and interactive agent-based simulation, where background traders condition on price and state so the market can push back. That distinction is real and directly relevant to anyone testing execution algorithms that might move the book. They illustrate it with an order-size market-impact example, which helps make the point concrete rather than purely abstract. The write-up of the two methods is straightforward and matches the mechanics people actually use in simulators like theirs. Credit for being explicit that IABS realism is still an open question instead of claiming it solves everything. The soft spot is the lack of any side-by-side runs, error metrics, or even a small table showing how much the two approaches diverge on the same strategy. Everything rests on the logical description rather than evidence that the responsiveness actually changes decisions or performance numbers. No code or data release is mentioned either. This is useful for quant teams that already run multi-agent environments and want a reminder about evaluation blind spots, or for people designing new simulators. A methods-focused reader will get value from the framing. It is worth sending to peer review because the core distinction holds up on its own terms and the practical stakes in trading strategy testing are high, even if the paper would need added experiments to carry more weight.

Referee Report

0 major / 1 minor

Summary. The manuscript claims that a multi-agent simulator enables two distinct evaluation methods for trading strategies: Market Replay (single-agent, non-adaptive) and Interactive Agent-Based Simulation (IABS, multi-agent). It argues that Market Replay fails to capture market adaptation to the experimental strategy, while IABS yields a responsive market because background agents condition on price and market state; realism of the resulting environment remains unclear. The approach is illustrated via an example evaluating market impact of orders of varying sizes.

Significance. If the stated mechanism holds, the work provides a clear conceptual distinction that could inform simulation choices in quantitative trading research, particularly by surfacing the non-responsiveness limitation of replay methods. The explicit caveat on IABS realism is a strength. However, the absence of quantitative comparisons, error analysis, or formal validation of the responsiveness claim limits the result to methodological clarification rather than demonstrated superiority.

minor comments (1)

Abstract: the market-impact illustration is referenced but not described; the main text should include at least one concrete numerical example or table to ground the claimed distinction.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment, recognition of the conceptual distinction, and recommendation for minor revision. The manuscript is framed as a methodological clarification rather than an empirical demonstration of superiority, and we address the concern about validation below.


Referee: However, the absence of quantitative comparisons, error analysis, or formal validation of the responsiveness claim limits the result to methodological clarification rather than demonstrated superiority.

Authors: We agree that the paper provides methodological clarification rather than a quantitative demonstration of superiority. The core contribution is identifying the non-responsiveness limitation inherent to replay methods (where the market does not adapt to the experimental strategy) and showing how IABS can produce responsiveness because background agents condition on price and market state. We already include an explicit caveat that IABS realism is unclear. Adding quantitative comparisons, error analysis, or formal validation would require a substantially different study (e.g., calibration to real-market data or controlled experiments), which lies outside the stated scope. We will revise the introduction and conclusion to state the scope more explicitly and to note that empirical validation of IABS responsiveness remains an open direction for future work.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity


The paper presents a descriptive comparison of two simulation methods (Market Replay vs. IABS) for evaluating trading strategies. It contains no equations, fitted parameters, predictions, or derivation chain that could reduce to its own inputs. The central distinction—that background agents in IABS condition on price and market state, making the market responsive—follows directly from the stated mechanism without self-reference or circular reduction. Self-citations, if present, are not load-bearing for any claimed result. The paper explicitly flags unresolved realism issues rather than asserting them. This is a self-contained methodological discussion with no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is methodological and relies on domain assumptions about agent behavior rather than introducing fitted parameters, new mathematical axioms, or invented entities.

axioms (1)

Domain assumption: Background agents in IABS attend to market conditions and current price as part of their strategy.
Invoked to explain market responsiveness in IABS (abstract).



Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 1 internal anchor

[2] Alfonsi, A., Fruth, A., and Schied, A. Optimal execution strategies in limit order books with general shape functions. Quantitative Finance, 10(2):143–157, 2010.

[3] Bollerslev, T. and Domowitz, I. Some effects of restricting the electronic order book in an automated trade execution system. The Double Auction Market: Institutions, Theories and Evidence, 14:221–252, 1993.

[4] Bouchaud, J.-P. Price impact. Encyclopedia of Quantitative Finance, 2010.

[5] Byrd, D., Hybinette, M., and Balch, T. H. ABIDES: Towards high-fidelity market simulation for AI research. arXiv preprint arXiv:1904.12066, 2019.

[6] Cliff, D. and Bruten, J. Minimal-intelligence agents for bargaining behaviors in market-based environments. Hewlett Packard Laboratories Technical Report, 91, 1997.

[7] Cont, R. Empirical properties of asset returns: stylized facts and statistical issues. 2001.

[8] Cont, R., Stoikov, S., and Talreja, R. A stochastic model for order book dynamics. Operations Research, 58(3):549–563, 2010.

[9] Cont, R., Kukanov, A., and Stoikov, S. The price impact of order book events. Journal of Financial Econometrics, 12(1):47–88, 2014.

[10] MacKinlay, A. C. Event studies in economics and finance. Journal of Economic Literature, 35:13–39, 1997.

[11] Duffy, J. and Ünver, M. U. Asset price bubbles and crashes with near-zero-intelligence traders. Economic Theory, 27(3):537–563, 2006.

[12] Gatheral, J. No-dynamic-arbitrage and market impact. Quantitative Finance, 10(7):749–759, 2010.

[13] Gode, D. K. and Sunder, S. Allocative efficiency of markets with zero-intelligence traders: Market as a partial substitute for individual rationality. Journal of Political Economy, 101(1):119–137, 1993.

[14] Gould, M. D., Porter, M. A., Williams, S., McDonald, M., Fenn, D. J., and Howison, S. D. Limit order books. Quantitative Finance, 13(11):1709–1742, 2013.

[15] Grinold, R. C. and Kahn, R. N. Active portfolio management: Quantitative theory and applications. Probus, 1995.

[16] Huang, R. and Polak, T. LOBSTER: Limit order book reconstruction system, technical documentation, 2011. URL https://lobsterdata.com/info/DataSamples.php

[17] Kyle, A. S. Continuous auctions and insider trading. Econometrica: Journal of the Econometric Society, pp. 1315–1335, 1985.

[18] Ladley, D. Zero intelligence in economics and finance. The Knowledge Engineering Review, 27(2):273–286, 2012. doi:10.1017/S0269888912000173

[19] LeBaron, B. Agent-based computational finance. Handbook of Computational Economics, 2:1187–1233, 2006.

[20] Lehalle, C.-A. and Laruelle, S. Market microstructure in practice. World Scientific, 2018.

[21] Nasdaq. The Nasdaq stock market (Nasdaq), 2019. URL https://www.nasdaqtrader.com/Trader.aspx?id=TradingUSEquities

[22] Paddrik, M., Hayes, R., Todd, A., Yang, S., Beling, P., and Scherer, W. An agent based model of the E-Mini S&P 500 applied to flash crash analysis. In 2012 IEEE Conference on Computational Intelligence for Financial Engineering & Economics (CIFEr), pp. 1–8. IEEE, 2012.

[23] Preis, T., Golke, S., Paul, W., and Schneider, J. J. Multi-agent-based order book model of financial markets. EPL (Europhysics Letters), 75(3):510, 2006.

[24] Toke, I. M. "Market making" in an order book model and its impact on the spread. In Econophysics of Order-Driven Markets, pp. 49–64. Springer, 2011.

[25] Wang, X. and Wellman, M. P. Spoofing the limit order book: An agent-based model. In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, pp. 651–659. International Foundation for Autonomous Agents and Multiagent Systems, 2017.
