The Anatomy of a Decentralized Prediction Market: Microstructure Evidence from the Polymarket Order Book
Pith reviewed 2026-05-15 06:34 UTC · model grok-4.3
The pith
Trade direction inferred from Polymarket's public order-book feed matches on-chain records only about 59 percent of the time.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Trade direction inferred from Polymarket's public order-book feed agrees with on-chain ground truth on only about 59 percent of buckets (panel mean 0.615, 95% CI [0.58, 0.65]). The effective half-spread changes sign between feed-inferred and on-chain directions on 67 percent and 50 percent of markets across two 7-day windows, and Kyle's lambda on 60 percent and 43 percent. Microstructure work on Polymarket must therefore source trade direction from on-chain OrderFilled events.
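The claim quotes both a share of buckets and a panel mean. This page does not say how the paper computes either figure, so the sketch below is only an illustration of how a pooled bucket share and an equal-weighted mean across markets can differ. The function name `agreement_rates` and the toy data are hypothetical.

```python
from collections import defaultdict

def agreement_rates(buckets):
    """buckets: iterable of (market_id, feed_sign, chain_sign), signs in {+1, -1}.

    Returns the pooled share of agreeing buckets, the equal-weighted mean of
    per-market agreement rates, and the per-market rates themselves.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for market_id, feed_sign, chain_sign in buckets:
        totals[market_id] += 1
        hits[market_id] += int(feed_sign == chain_sign)
    pooled = sum(hits.values()) / sum(totals.values())
    per_market = {m: hits[m] / totals[m] for m in totals}
    panel_mean = sum(per_market.values()) / len(per_market)
    return pooled, panel_mean, per_market

# Toy data (hypothetical): market A agrees on 2 of 4 buckets, market B on 2 of 2.
toy = [
    ("A", +1, +1), ("A", -1, +1), ("A", +1, -1), ("A", -1, -1),
    ("B", +1, +1), ("B", -1, -1),
]
pooled, panel_mean, per_market = agreement_rates(toy)
print(pooled, panel_mean)
```

Markets with more buckets dominate the pooled share, while the panel mean weights every market equally, so the two numbers need not coincide.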
What carries the argument
The continuous tick-level public order-book feed joined to the authoritative on-chain trade record, using OrderFilled events as the source of true buy-sell labels.
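A minimal sketch of such a join, assuming a nearest-in-time match within a fixed tolerance. The actual matching rule lives in the replication package and is not described on this page; the function name, the tolerance value, and the timestamps below are hypothetical, and a real join would also match on market and token identifiers.

```python
from bisect import bisect_left

def match_fills_to_feed(fill_times, feed_times, tolerance):
    """For each on-chain fill time, return the index of the nearest feed event
    within `tolerance` seconds, or None if no feed event is close enough.
    `feed_times` must be sorted in ascending order."""
    matches = []
    for t in fill_times:
        i = bisect_left(feed_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(feed_times)]
        best = min(candidates, key=lambda j: abs(feed_times[j] - t), default=None)
        if best is not None and abs(feed_times[best] - t) <= tolerance:
            matches.append(best)
        else:
            matches.append(None)
    return matches

feed_times = [10.00, 10.40, 12.00, 15.50]   # feed event timestamps (seconds)
fill_times = [10.05, 11.90, 14.00]          # OrderFilled timestamps (seconds)
matches = match_fills_to_feed(fill_times, feed_times, tolerance=0.25)
print(matches)
```

Unmatched fills are kept as `None` rather than dropped, so the share of unmatched records can be reported alongside any agreement statistic.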
If this is right
- Effective half-spreads and Kyle's lambda estimates change sign on a large share of markets when trade direction switches from feed-inferred to on-chain.
- Depth is spread closer to uniformly across the book than concentrated at the top.
- Maker wallets show broad participation with a concentrated tail, and the self-counterparty wash share is low (median 1%, with a 22% upper tail).
- Cross-sectional depth is explained by market duration, price level, and volume, without residual time-to-close effects.
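The first implication can be illustrated with textbook definitions: the effective half-spread as the mean of d(p - m), with d the trade sign, p the trade price and m the prevailing midpoint, and Kyle's lambda as the OLS slope of price changes on signed order flow. The paper's exact estimators are not given on this page, so the sketch below is an assumption-laden toy, with hypothetical data, showing that both estimates can change sign even when labels agree on 3 of 5 observations.

```python
def effective_half_spread(trades, signs):
    """Mean of d * (price - mid), where d is +1 for a buy and -1 for a sell."""
    return sum(d * (p - m) for (p, m), d in zip(trades, signs)) / len(trades)

def kyle_lambda(price_changes, signed_flow):
    """OLS slope (with intercept) of bucket price change on signed order flow."""
    n = len(price_changes)
    mean_x = sum(signed_flow) / n
    mean_y = sum(price_changes) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(signed_flow, price_changes))
    sxx = sum((x - mean_x) ** 2 for x in signed_flow)
    return sxy / sxx

# Toy data (hypothetical): (trade price, prevailing midpoint) and bucket volumes.
trades = [(0.51, 0.50), (0.49, 0.50), (0.51, 0.50), (0.54, 0.50), (0.46, 0.50)]
price_changes = [0.01, -0.01, 0.01, 0.03, -0.03]
volumes = [100, 100, 100, 300, 300]

chain_signs = [+1, -1, +1, +1, -1]  # on-chain labels, treated as ground truth
feed_signs = [+1, -1, +1, -1, +1]   # agrees on 3 of 5, wrong on the two largest

ehs_chain = effective_half_spread(trades, chain_signs)
ehs_feed = effective_half_spread(trades, feed_signs)
lam_chain = kyle_lambda(price_changes, [d * v for d, v in zip(chain_signs, volumes)])
lam_feed = kyle_lambda(price_changes, [d * v for d, v in zip(feed_signs, volumes)])
print(ehs_chain, ehs_feed, lam_chain, lam_feed)
```

In this toy the mislabeled observations are the ones with the largest price impact, which is enough to turn both estimates negative under feed labels while they stay positive under on-chain labels.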
Where Pith is reading between the lines
- Other decentralized prediction markets may exhibit similar gaps between public feed and on-chain accuracy, requiring parallel data joins.
- The low observed wash-trading share provides a benchmark for assessing integrity on comparable on-chain venues.
- Researchers can now test whether the documented stylized facts hold in out-of-sample periods after the 52-day archive.
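For readers who want to compute a comparable benchmark, a self-counterparty wash share can be sketched as the fraction of a market's volume in which the maker and taker are the same wallet. This is an assumed definition: the paper's exact rule (for example, whether related wallets are clustered) is not stated on this page, and the wallet addresses and sizes below are hypothetical.

```python
from statistics import median

def wash_share(fills):
    """fills: list of (maker, taker, size).
    Returns the share of volume where maker and taker are the same wallet."""
    total = sum(size for _, _, size in fills)
    if total == 0:
        return 0.0
    self_volume = sum(size for maker, taker, size in fills if maker == taker)
    return self_volume / total

# Toy data (hypothetical wallets and sizes), one list of fills per market.
markets = {
    "m1": [("0xaa", "0xbb", 90), ("0xaa", "0xaa", 10)],
    "m2": [("0xcc", "0xdd", 100)],
    "m3": [("0xee", "0xee", 25), ("0xee", "0xff", 75)],
}
shares = {market: wash_share(fills) for market, fills in markets.items()}
median_share = median(shares.values())
print(shares, median_share)
```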
Load-bearing premise
The pre-registered panel of 600 markets represents Polymarket activity overall and the matching process between public feed and on-chain records introduces no systematic bias.
What would settle it
A new sample of markets or time window in which public-feed trade-direction labels agree with on-chain records at rates consistently above 75 percent.
Original abstract
We study the microstructure of Polymarket, the largest on-chain prediction market, using a continuous tick-level archive of the public order-book feed (30 billion events over 52 days) joined to the authoritative on-chain trade record. On a pre-registered stratified panel of 600 markets we report eight stylized facts: a longshot spread premium; a depth profile closer to uniform than to top-of-book; a null block-clock alignment effect; broad maker-wallet diversity with a concentrated tail; category-conditional effective-spread differences; a sub-50 ms median archive-ingestion delay with a multi-second tail; a self-counterparty wash share with median 1% and a 22% upper tail (well below Cong et al. 2023's 25-70% for unregulated crypto venues -- a sanity bound, not an apples-to-apples reference); and a cross-sectional depth profile explained by market duration, price level, and volume, with no residual time-to-close effect. The paper also contributes a measurement result: trade direction inferred from Polymarket's public order-book feed agrees with on-chain ground truth on only ~59% of buckets (panel mean 0.615, 95% CI [0.58, 0.65]), well below the ~80% Lee-Ready accuracy on Nasdaq. The effective half-spread changes sign between feed- and on-chain trade directions on 67%/50% of markets across two 7-day windows; Kyle's lambda on 60%/43%. Microstructure work on Polymarket therefore needs to source trade direction from on-chain OrderFilled events; we release a replication package that performs the join.
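The abstract benchmarks feed-based inference against Lee-Ready accuracy. For reference, a simplified Lee-Ready style classifier can be sketched as a quote rule (compare the trade price to the prevailing midpoint) with a tick-test fallback for trades at the midpoint. This is a generic textbook sketch that omits the quote-lag adjustment of the original rule; it is not confirmed to be the inference algorithm the paper applies to the Polymarket feed, and the toy prices are hypothetical.

```python
def lee_ready(trades):
    """trades: list of (price, mid) in time order.
    Returns +1 (buy), -1 (sell), or 0 (unclassifiable) for each trade."""
    signs = []
    last_price = None
    last_tick_sign = 0
    for price, mid in trades:
        # Tick-test state: direction of the most recent non-zero price change.
        if last_price is not None and price != last_price:
            last_tick_sign = 1 if price > last_price else -1
        if price > mid:
            sign = 1                 # quote rule: above the midpoint is a buy
        elif price < mid:
            sign = -1                # quote rule: below the midpoint is a sell
        else:
            sign = last_tick_sign    # at the midpoint: fall back to the tick test
        signs.append(sign)
        last_price = price
    return signs

# Toy data (hypothetical): (trade price, prevailing midpoint).
toy = [(0.52, 0.50), (0.50, 0.50), (0.49, 0.50), (0.50, 0.50), (0.50, 0.50)]
signs = lee_ready(toy)
print(signs)
```

Both branches depend on observing the correct prevailing midpoint and trade sequence, which is where a delayed or incomplete public feed could plausibly degrade accuracy.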
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes the microstructure of Polymarket using a 30-billion-event tick-level archive of the public order-book feed joined to on-chain OrderFilled records over 52 days. On a pre-registered stratified panel of 600 markets, it documents eight stylized facts including a longshot spread premium, near-uniform depth profiles, maker-wallet concentration, category differences in effective spreads, sub-50 ms median ingestion delay with multi-second tail, low wash-trading shares, and cross-sectional depth drivers. The central measurement result is that public-feed trade-direction inference agrees with on-chain ground truth on only ~59% of buckets (panel mean 0.615, 95% CI [0.58, 0.65]), far below Lee-Ready benchmarks, implying that future microstructure work must use on-chain direction; a replication package is released.
Significance. If the measurement holds, the paper supplies the first large-scale evidence that standard trade-signing methods fail in this on-chain setting, with direct implications for effective-spread, lambda, and informed-trading studies on prediction markets. The scale, pre-registration, confidence intervals, and replication package are strengths that support the empirical claims.
major comments (2)
- [Data and Methodology] Data and Methodology section (description of archive join): the multi-second tail in ingestion delay is reported alongside the sub-50 ms median, yet no robustness table restricts the agreement-rate calculation to low-delay matches or realigns using purely on-chain timestamps. Because the 0.615 panel-mean agreement is the load-bearing result, any systematic mismatch during rapid price moves could downward-bias the statistic without demonstrating weakness in the inference algorithm itself.
- [Results] Results on trade-direction accuracy (panel of 600 markets): the claim that feed-inferred direction changes the sign of effective half-spread on 67%/50% of markets and Kyle's lambda on 60%/43% across two 7-day windows lacks detail on bucket construction, weighting, or statistical significance of the sign flips. Without these, it is unclear whether the reported fractions reflect economically meaningful reversals or noise from the join.
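The robustness check requested in the first major comment can be sketched as follows: recompute the agreement rate after dropping matches with ingestion delay above a cutoff. The 90th-percentile cutoff is the one the authors propose in their response; the nearest-rank percentile rule, the function names, and the toy rows are assumptions made for illustration.

```python
import math

def percentile_nearest_rank(values, q):
    """Nearest-rank percentile for 0 < q <= 100."""
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[rank - 1]

def agreement(rows):
    """Share of rows where the feed-inferred sign equals the on-chain sign."""
    return sum(feed == chain for _, feed, chain in rows) / len(rows)

# Toy rows (hypothetical): (ingestion delay in seconds, feed sign, chain sign).
rows = [
    (0.02, 1, 1), (0.03, -1, -1), (0.03, 1, -1), (0.04, -1, -1), (0.04, 1, 1),
    (0.05, -1, 1), (0.05, 1, 1), (0.06, -1, -1), (0.08, 1, 1), (3.50, 1, -1),
]
cutoff = percentile_nearest_rank([delay for delay, _, _ in rows], 90)
low_delay = [row for row in rows if row[0] <= cutoff]

full_rate = agreement(rows)
low_delay_rate = agreement(low_delay)
print(cutoff, full_rate, low_delay_rate)
```

If the low-delay rate stayed close to the full-sample rate on the real data, ingestion delay would be an unlikely explanation for the low agreement; a large gap would point the other way.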
minor comments (2)
- [Stylized facts] The reference to Cong et al. (2023) wash-trading shares (25-70%) is presented as a sanity bound rather than a direct comparator; a brief note on venue and regulatory differences would clarify the contrast.
- [Stylized facts] Figure or table showing the depth profile by price level and duration would benefit from explicit confidence bands or standard errors to match the precision used elsewhere.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and will revise the manuscript to incorporate additional robustness checks and methodological clarifications that strengthen the presentation of our results.
Point-by-point responses
- Referee: [Data and Methodology] Data and Methodology section (description of archive join): the multi-second tail in ingestion delay is reported alongside the sub-50 ms median, yet no robustness table restricts the agreement-rate calculation to low-delay matches or realigns using purely on-chain timestamps. Because the 0.615 panel-mean agreement is the load-bearing result, any systematic mismatch during rapid price moves could downward-bias the statistic without demonstrating weakness in the inference algorithm itself.
  Authors: We agree that a robustness check is warranted given the reported delay distribution. The current join already anchors on on-chain OrderFilled timestamps as ground truth, with public-feed events matched within a narrow window. We will add a new appendix table recomputing the panel-mean agreement rate (and its confidence interval) after restricting to matches with ingestion delay below the 90th percentile. We will also report a version that realigns exclusively on on-chain timestamps. This will directly test whether the low agreement rate is an artifact of high-delay observations.
  Revision: yes
- Referee: [Results] Results on trade-direction accuracy (panel of 600 markets): the claim that feed-inferred direction changes the sign of effective half-spread on 67%/50% of markets and Kyle's lambda on 60%/43% across two 7-day windows lacks detail on bucket construction, weighting, or statistical significance of the sign flips. Without these, it is unclear whether the reported fractions reflect economically meaningful reversals or noise from the join.
  Authors: We will expand the relevant section to specify that buckets are 5-minute intervals per market-day, that sign changes are evaluated at the market level with equal weighting across the pre-registered panel of 600 markets, and that the two 7-day windows are non-overlapping. We will also add a brief description of how bootstrap standard errors are used to assess the reliability of the sign flips. These clarifications will be placed in the main text and the replication package will be updated to include the exact bucket-level code.
  Revision: partial
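The bucket construction and bootstrap described in the second response can be sketched as below. The 5-minute bucket width and the equal weighting across markets come from the authors' response; resampling whole markets with replacement is an assumption about how the bootstrap is set up, and the function names and toy estimates are hypothetical.

```python
import random

BUCKET_SECONDS = 300  # 5-minute buckets, as stated in the authors' response

def bucket_index(ts_seconds):
    """Index of the 5-minute bucket containing a timestamp (in seconds)."""
    return int(ts_seconds // BUCKET_SECONDS)

def sign_flip_share(pairs):
    """pairs: one (estimate_with_feed_signs, estimate_with_chain_signs) per market.
    Equal-weighted share of markets where the two estimates differ in sign;
    an estimate of exactly zero is treated as non-positive."""
    return sum((a > 0) != (b > 0) for a, b in pairs) / len(pairs)

def bootstrap_se(pairs, n_boot=2000, seed=0):
    """Standard error of the sign-flip share, resampling markets with replacement."""
    rng = random.Random(seed)
    n = len(pairs)
    draws = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        draws.append(sign_flip_share(sample))
    mean = sum(draws) / n_boot
    return (sum((d - mean) ** 2 for d in draws) / (n_boot - 1)) ** 0.5

# Toy market-level estimates (hypothetical): 4 of 6 markets change sign.
pairs = [(-0.01, 0.02), (0.01, 0.01), (-0.02, 0.03),
         (0.02, -0.01), (0.015, 0.02), (-0.005, 0.01)]
share = sign_flip_share(pairs)
se = bootstrap_se(pairs)
print(bucket_index(299.9), bucket_index(300), share, se)
```

With a panel of 600 markets rather than six, the same resampling would give a much tighter standard error, which is what would let a reader judge whether fractions such as 67% and 50% differ meaningfully across windows.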
Circularity Check
No circularity: purely empirical data joins and statistical summaries
Full rationale
The paper reports direct measurements from joining a public order-book feed archive to on-chain OrderFilled events on a pre-registered panel of markets. All eight stylized facts and the central agreement-rate result (panel mean 0.615) are computed as straightforward statistical summaries of the joined data, with no equations, fitted parameters, or derivations that reduce to their own inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes; the work contains no derivation chain at all.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The public order-book feed and on-chain trade records can be joined accurately, without material matching errors.
Reference graph
Works this paper leans on
[1] Jonathan Brogaard, Terrence Hendershott, and Ryan Riordan. High-frequency trading and price discovery. Review of Financial Studies, 27(8): 2267–2306, 2014.
[2] Lin William Cong, Xi Li, Ke Tang, and Yang Yang. Crypto wash trading. Management Science, 69(11): 6427–6454, 2023.
[3] Philipp D. Dubach. Replication package: The anatomy of a decentralized prediction market, 2026. URL https://doi.org/10.5281/zenodo.19811426
[4] Katrina Ellis, Roni Michaely, and Maureen O'Hara. The accuracy of trade classification rules: Evidence from Nasdaq. Journal of Financial and Quantitative Analysis, 35(4): 529–551, 2000.
[5] Thierry Foucault, Marco Pagano, and Ailsa Röell. Market Liquidity: Theory, Evidence, and Policy. Oxford University Press, 2013.
[6] Lawrence R. Glosten and Lawrence E. Harris. Estimating the components of the bid/ask spread. Journal of Financial Economics, 21(1): 123–142, 1988.
[7] Robin Hanson. Logarithmic market scoring rules for modular combinatorial information aggregation. Journal of Prediction Markets, 1(1): 3–15, 2007.
[8] Joel Hasbrouck. Empirical Market Microstructure: The Institutions, Economics, and Econometrics of Securities Trading. Oxford University Press, 2007.
[9] Roger D. Huang and Hans R. Stoll. The components of the bid-ask spread: A general approach. Review of Financial Studies, 10(4): 995–1034, 1997.
[10] Charles M. C. Lee and Mark J. Ready. Inferring trade direction from intraday data. Journal of Finance, 46(2): 733–746, 1991.
[11] Ananth Madhavan, Matthew Richardson, and Mark Roomans. Why do security prices change? A transaction-level analysis of NYSE stocks. Review of Financial Studies, 10(4): 1035–1064, 1997.
[12] Charles F. Manski. Interpreting the predictions of prediction markets. Economics Letters, 91(3): 425–429, 2006.
[13] Maureen O'Hara. Market Microstructure Theory. Blackwell Publishing, 1995.
[14] Lionel Page and Robert T. Clemen. Do prediction markets produce well-calibrated probability forecasts? The Economic Journal, 123(568): 491–513, 2013.
[15] Nahid Rahman, Joseph Al-Chami, and Jeremy Clark. SoK: Market microstructure for decentralized prediction markets (DePMs). arXiv preprint arXiv:2510.15612, 2025. URL https://arxiv.org/abs/2510.15612
[16] Erik Snowberg and Justin Wolfers. Explaining the favorite-long shot bias: Is it risk-love or misperceptions? Journal of Political Economy, 118(4): 723–746, 2010.
[17] Richard H. Thaler and William T. Ziemba. Anomalies: Parimutuel betting markets: Racetracks and lotteries. Journal of Economic Perspectives, 2(2): 161–174, 1988.
[18] Kwok Ping Tsang and Zichao Yang. The anatomy of Polymarket: Evidence from the 2024 presidential election (indexed as: The Anatomy of a Blockchain Prediction Market: Polymarket in the 2024 U.S. Presidential Election). arXiv preprint arXiv:2603.03136, 2026. URL https://arxiv.org/abs/2603.03136
[19] Justin Wolfers and Eric Zitzewitz. Prediction markets. Journal of Economic Perspectives, 18(2): 107–126, 2004.