arXiv: 2604.24366 · v2 · submitted 2026-04-27 · 💱 q-fin.TR · cs.GT · q-fin.GN

The Anatomy of a Decentralized Prediction Market: Microstructure Evidence from the Polymarket Order Book

Pith reviewed 2026-05-15 06:34 UTC · model grok-4.3

classification 💱 q-fin.TR · cs.GT · q-fin.GN
keywords prediction market · order book · trade direction · on-chain data · microstructure · effective spread · Polymarket

The pith

Trade direction inferred from Polymarket's public order-book feed matches on-chain records only about 59 percent of the time.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper joins a 30-billion-event archive of Polymarket's public order-book feed to the authoritative on-chain trade record across a pre-registered panel of 600 markets. It shows that conventional methods for labeling buys and sells from the public feed succeed in only about 59 percent of cases, well below the roughly 80 percent Lee-Ready accuracy reported for Nasdaq. The authors also lay out eight stylized facts covering spreads, depth, maker concentration, and timing delays. Because the public feed alone cannot reliably identify trade direction, microstructure studies of the platform require direct use of on-chain OrderFilled events.

Core claim

Trade direction inferred from Polymarket's public order-book feed agrees with on-chain ground truth on only about 59 percent of buckets (panel mean 0.615, 95% CI [0.58, 0.65]). Across two 7-day windows, the effective half-spread changes sign between feed- and on-chain directions on 67 and 50 percent of markets, and Kyle's lambda on 60 and 43 percent. Microstructure work on Polymarket must therefore source trade direction from on-chain OrderFilled events.
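
To make the agreement metric concrete, here is a minimal sketch. It assumes a simplified Lee-Ready-style classifier (quote rule with a one-lag tick-rule fallback) on the feed side; this summary does not specify the paper's exact inference procedure or bucket definition, so the sketch scores individual trades rather than buckets. The names `sign_trade` and `agreement_rate` and the toy data are hypothetical.

```python
from typing import Optional, Sequence


def sign_trade(price: float, mid: float, prev_price: Optional[float]) -> int:
    """Classify one trade: +1 buy, -1 sell, 0 unclassified."""
    # Quote rule: compare the trade price with the prevailing midpoint.
    if price > mid:
        return 1
    if price < mid:
        return -1
    # Tick-rule fallback for trades at the midpoint (simplified: one lag only).
    if prev_price is None or price == prev_price:
        return 0
    return 1 if price > prev_price else -1


def agreement_rate(inferred: Sequence[int], truth: Sequence[int]) -> float:
    """Share of classified trades whose inferred sign matches the true sign."""
    pairs = [(i, t) for i, t in zip(inferred, truth) if i != 0]
    if not pairs:
        raise ValueError("no classified trades")
    return sum(i == t for i, t in pairs) / len(pairs)


# Toy data (assumed): (trade price, midpoint, previous trade price).
feed = [(0.62, 0.61, None), (0.60, 0.61, 0.62),
        (0.61, 0.61, 0.60), (0.61, 0.61, 0.61)]
on_chain = [1, 1, 1, -1]  # hypothetical ground-truth taker sides

inferred = [sign_trade(p, m, prev) for p, m, prev in feed]
rate = agreement_rate(inferred, on_chain)
```

In this toy example one of the three classified trades is mis-signed. Excluding unclassified trades from the denominator is a design choice of the sketch, not something the paper is known to do.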

What carries the argument

The continuous tick-level public order-book feed joined to the authoritative on-chain trade record, using OrderFilled events as the source of true buy-sell labels.

If this is right

  • Effective half-spreads and Kyle's lambda estimates change sign on a large share of markets when trade direction switches from feed-inferred to on-chain (67%/50% and 60%/43% of markets, respectively, across two 7-day windows).
  • Depth is distributed more uniformly than concentrated at the top of the book.
  • Maker wallets show broad participation with a concentrated tail and low self-counterparty wash trading.
  • Cross-sectional depth is explained by market duration, price level, and volume, without residual time-to-close effects.
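
The first bullet can be illustrated with the textbook definitions. This is a sketch under stated assumptions: the effective half-spread is taken as the mean of d·(p − m) over trades, and Kyle's lambda as the OLS slope of price changes on signed order flow. The paper's exact estimators are not given in this summary, and the function names and toy numbers are hypothetical.

```python
from typing import Sequence


def effective_half_spread(prices: Sequence[float], mids: Sequence[float],
                          directions: Sequence[int]) -> float:
    """Mean of d * (p - m), where d is +1 for buys and -1 for sells."""
    terms = [d * (p - m) for p, m, d in zip(prices, mids, directions)]
    return sum(terms) / len(terms)


def kyle_lambda(price_changes: Sequence[float],
                signed_flows: Sequence[float]) -> float:
    """OLS slope (with intercept) of price change on signed order flow."""
    n = len(signed_flows)
    mean_x = sum(signed_flows) / n
    mean_y = sum(price_changes) / n
    sxx = sum((x - mean_x) ** 2 for x in signed_flows)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(signed_flows, price_changes))
    return sxy / sxx


# Toy data (assumed): three trades with true and fully inverted labels.
prices, mids = [0.52, 0.48, 0.53], [0.50, 0.50, 0.51]
true_dir, wrong_dir = [1, -1, 1], [-1, 1, -1]
sizes, dp = [100.0, 100.0, 200.0], [0.01, -0.01, 0.02]

spread_true = effective_half_spread(prices, mids, true_dir)
spread_wrong = effective_half_spread(prices, mids, wrong_dir)
lam_true = kyle_lambda(dp, [d * s for d, s in zip(true_dir, sizes)])
lam_wrong = kyle_lambda(dp, [d * s for d, s in zip(wrong_dir, sizes)])
```

Fully inverted labels are an extreme case chosen for clarity. With the roughly 59 percent agreement the paper reports, whether a market's estimate changes sign depends on which trades are mis-signed.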

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Other decentralized prediction markets may exhibit similar gaps between public feed and on-chain accuracy, requiring parallel data joins.
  • The low observed wash-trading share provides a benchmark for assessing integrity on comparable on-chain venues.
  • Researchers can now test whether the documented stylized facts hold in out-of-sample periods after the 52-day archive.

Load-bearing premise

The pre-registered panel of 600 markets represents Polymarket activity overall, and the matching process between the public feed and on-chain records introduces no systematic bias.

What would settle it

A new sample of markets or time window in which public-feed trade-direction labels agree with on-chain records at rates consistently above 75 percent.

Figures

Figures reproduced from arXiv: 2604.24366 by Philipp D. Dubach.

Figure 2
SF2 panel: histogram of L1/L10 depth-concentration ratio across 546 panel markets. Vertical lines mark the uniform-grid benchmark (green, 0.10) and the fully top-of-book limit (red, 1.0).
Adjacent paper text (Section 5.3, SF3: Polygon block-clock alignment): We test whether price_change events cluster near Polygon block boundaries by computing, per market, the share of events that fall within ±100 ms of the nearest 2,000 ms grid point. …
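
The block-clock alignment statistic quoted above can be sketched as follows. This is a minimal illustration, assuming the 2,000 ms grid is anchored at timestamp zero; how the paper anchors the grid to actual Polygon block times is not stated in the excerpt. The function name and inputs are hypothetical.

```python
from typing import Sequence


def block_alignment_share(timestamps_ms: Sequence[int],
                          grid_ms: int = 2000, tol_ms: int = 100) -> float:
    """Share of events within tol_ms of the nearest multiple of grid_ms."""
    if not timestamps_ms:
        raise ValueError("no events")
    hits = 0
    for t in timestamps_ms:
        offset = t % grid_ms
        # Distance to the nearest grid point, before or after the event.
        distance = min(offset, grid_ms - offset)
        if distance <= tol_ms:
            hits += 1
    return hits / len(timestamps_ms)


# Events spread evenly over one grid period land near the chance level
# of 2 * tol_ms / grid_ms = 0.10.
uniform_share = block_alignment_share(list(range(0, 2000)))
```

A ±100 ms window around each 2,000 ms grid point covers 200 of every 2,000 ms, which is where the 0.10 chance-level null comes from.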
Figure 1
SF1 panel: median quoted spread (bps) per mid-price decile, 600 panel markets. Shaded band is interquartile range.
Adjacent paper text (Section 5.2, SF2: Depth concentration): We summarize the L2 depth profile by the ratio $\text{depth}_{L=1} / \text{depth}_{L=10}$, the share of cumulative top-10 depth held at the top-of-book. A value of 1.0 means the entire top-10 depth sits at level 1 (a thin, top-heavy book); 0.1 matches a uniform grid where each level carr…
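
The depth-concentration ratio described above reduces to a one-line computation. The sketch below assumes one side of the book is given as a list of resting sizes ordered from best price outward; using fewer than ten levels when the book is shallow is an assumption of the sketch, not a documented choice of the paper.

```python
from typing import Sequence


def depth_concentration(level_sizes: Sequence[float]) -> float:
    """Level-1 depth as a share of cumulative depth over the top 10 levels."""
    top10 = list(level_sizes[:10])
    total = sum(top10)
    if total <= 0:
        raise ValueError("empty book side")
    return top10[0] / total


uniform_book = [50.0] * 10            # every level carries the same size
top_heavy_book = [500.0] + [0.0] * 9  # all depth sits at the best price
```

The two toy books reproduce the benchmarks in the Figure 2 caption: 0.10 for the uniform grid and 1.0 for the fully top-of-book limit.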
Figure 3
SF3 panel: distribution of per-market block-alignment shares. The red dashed line marks the chance-level null (0.10).
Adjacent paper text: … HHI = 1; a uniform distribution across n makers yields 1/n. Across 600 markets and 6.4 M trades, the median HHI is 0.031 (∼ 32 effective makers). The distribution is right-skewed: p90 = 0.119 (∼ 8 effective makers) and a maximum of 0.40 (roughly 3 effective makers). Maker liquidity is decen…
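
The HHI figures above map to "effective makers" through the reciprocal, which a short sketch makes explicit. It assumes maker shares are measured by traded volume; the excerpt does not say whether the paper uses volume or trade counts. The function names are hypothetical.

```python
from typing import Mapping


def hhi(maker_volumes: Mapping[str, float]) -> float:
    """Herfindahl-Hirschman index: sum of squared maker shares."""
    total = sum(maker_volumes.values())
    if total <= 0:
        raise ValueError("no maker volume")
    return sum((v / total) ** 2 for v in maker_volumes.values())


def effective_makers(index: float) -> float:
    """Number of equal-sized makers that would produce the same HHI."""
    return 1.0 / index


four_equal = hhi({"a": 10.0, "b": 10.0, "c": 10.0, "d": 10.0})
```

Four equal makers give an HHI of 0.25, and the reported median of 0.031 corresponds to about 32 effective makers.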
Figure 5
SF5 panel: median effective half-spread by category, with interquartile-range error bars. Categories are derived from keyword classification of CLOB REST question text.
Adjacent paper text: Each archive row carries two timestamps: timestamp_received (exchange side) and timestamp_created_at (collector side). Their difference is a per-event ingestion delay. Across 547 markets with non-empty windows, the median per-market p50 de…
Figure 6
SF6 panel: per-market percentile distributions of archive-ingestion latency (log scale).
Adjacent paper text (Section 5.7, SF7: Self-counterparty wash share): We flag a trade as wash-suspect under a two-tier rule: (a) maker == taker (direct self-match), or (b) a flipped pair (maker_a, taker_a) ↔ (taker_a, maker_a) within 128 blocks (Polygon finality buffer) on the same market. This is an explicit lower bound: it captures only direct and imme…
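
The two-tier wash rule quoted above can be sketched as below. Assumptions of the sketch: each trade is a dict with hypothetical keys `market`, `block`, `maker`, and `taker`; the 128-block window is treated as inclusive; and both legs of a flipped pair are flagged. The excerpt does not settle any of these details.

```python
from collections import defaultdict
from typing import Dict, List


def flag_wash(trades: List[Dict], window_blocks: int = 128) -> List[bool]:
    """Flag trades that are self-matches or one leg of a flipped pair."""
    flags = [False] * len(trades)
    by_pair = defaultdict(list)  # (market, maker, taker) -> block numbers
    for i, t in enumerate(trades):
        if t["maker"] == t["taker"]:
            flags[i] = True  # tier (a): direct self-match
        by_pair[(t["market"], t["maker"], t["taker"])].append(t["block"])
    for i, t in enumerate(trades):
        if flags[i]:
            continue
        # Tier (b): same market, roles reversed, within the block window.
        mirrored = by_pair.get((t["market"], t["taker"], t["maker"]), [])
        if any(abs(b - t["block"]) <= window_blocks for b in mirrored):
            flags[i] = True
    return flags


trades = [
    {"market": "m1", "block": 100, "maker": "A", "taker": "A"},
    {"market": "m1", "block": 200, "maker": "A", "taker": "B"},
    {"market": "m1", "block": 300, "maker": "B", "taker": "A"},
    {"market": "m1", "block": 1000, "maker": "A", "taker": "C"},
    {"market": "m1", "block": 1200, "maker": "C", "taker": "A"},
    {"market": "m2", "block": 210, "maker": "B", "taker": "A"},
]
flags = flag_wash(trades)
wash_share = sum(flags) / len(flags)  # share by trade count
```

The last three toy trades stay unflagged: one flipped pair is 200 blocks apart, and the other mirror trade sits on a different market.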
Figure 7
SF7 panel: distribution of self-counterparty wash share by market. Red dashed line marks a 25% reference, the lower bound of the wash-share range documented by Cong et al. [2023] on unregulated cryptocurrency venues.
Adjacent paper text: … midpoint (2026-03-13), restricted to 322 markets with positive seconds-to-close and non-zero summary depth
Figure 8
SF8: cross-sectional fit of log mean depth on log seconds-to-close at the panel midpoint.
Adjacent paper text (Section 6, Spread Decomposition): Following Glosten and Harris [1988] and the modern restatements in Huang and Stoll [1997], Madhavan et al. [1997], we decompose the per-market effective half-spread into two components: $S^{\mathrm{eff}}_{1/2} = c + \varphi$ (1), where c is a transitory order-processing / inventory component, recovered as the reali…
Figure 9
Glosten-Harris decomposition across the top-100 stratum: distribution of transitory component c (left) and adverse-selection component φ (right), both in probability points.
Adjacent paper text: … component (0.0). This near-null pattern lines up with the calibration in Section 7: once sign errors are removed, the dollar-weighted “adverse selection” that orderbook-only inference produces collapses, leaving the typical top-100 ma…
Original abstract

We study the microstructure of Polymarket, the largest on-chain prediction market, using a continuous tick-level archive of the public order-book feed (30 billion events over 52 days) joined to the authoritative on-chain trade record. On a pre-registered stratified panel of 600 markets we report eight stylized facts: a longshot spread premium; a depth profile closer to uniform than to top-of-book; a null block-clock alignment effect; broad maker-wallet diversity with a concentrated tail; category-conditional effective-spread differences; a sub-50 ms median archive-ingestion delay with a multi-second tail; a self-counterparty wash share with median 1% and a 22% upper tail (well below Cong et al. 2023's 25-70% for unregulated crypto venues -- a sanity bound, not an apples-to-apples reference); and a cross-sectional depth profile explained by market duration, price level, and volume, with no residual time-to-close effect. The paper also contributes a measurement result: trade direction inferred from Polymarket's public order-book feed agrees with on-chain ground truth on only ~59% of buckets (panel mean 0.615, 95% CI [0.58, 0.65]), well below the ~80% Lee-Ready accuracy on Nasdaq. The effective half-spread changes sign between feed- and on-chain trade directions on 67%/50% of markets across two 7-day windows; Kyle's lambda on 60%/43%. Microstructure work on Polymarket therefore needs to source trade direction from on-chain OrderFilled events; we release a replication package that performs the join.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper analyzes the microstructure of Polymarket using a 30-billion-event tick-level archive of the public order-book feed joined to on-chain OrderFilled records over 52 days. On a pre-registered stratified panel of 600 markets, it documents eight stylized facts including a longshot spread premium, near-uniform depth profiles, maker-wallet concentration, category differences in effective spreads, sub-50 ms median ingestion delay with multi-second tail, low wash-trading shares, and cross-sectional depth drivers. The central measurement result is that public-feed trade-direction inference agrees with on-chain ground truth on only ~59% of buckets (panel mean 0.615, 95% CI [0.58, 0.65]), far below Lee-Ready benchmarks, implying that future microstructure work must use on-chain direction; a replication package is released.

Significance. If the measurement holds, the paper supplies the first large-scale evidence that standard trade-signing methods fail in this on-chain setting, with direct implications for effective-spread, lambda, and informed-trading studies on prediction markets. The scale, pre-registration, confidence intervals, and replication package are strengths that support the empirical claims.

major comments (2)
  1. [Data and Methodology] Data and Methodology section (description of archive join): the multi-second tail in ingestion delay is reported alongside the sub-50 ms median, yet no robustness table restricts the agreement-rate calculation to low-delay matches or realigns using purely on-chain timestamps. Because the 0.615 panel-mean agreement is the load-bearing result, any systematic mismatch during rapid price moves could downward-bias the statistic without demonstrating weakness in the inference algorithm itself.
  2. [Results] Results on trade-direction accuracy (panel of 600 markets): the claim that feed-inferred direction changes the sign of effective half-spread on 67%/50% of markets and Kyle's lambda on 60%/43% across two 7-day windows lacks detail on bucket construction, weighting, or statistical significance of the sign flips. Without these, it is unclear whether the reported fractions reflect economically meaningful reversals or noise from the join.
minor comments (2)
  1. [Stylized facts] The reference to Cong et al. (2023) wash-trading shares (25-70%) is presented as a sanity bound rather than a direct comparator; a brief note on venue and regulatory differences would clarify the contrast.
  2. [Stylized facts] Figure or table showing the depth profile by price level and duration would benefit from explicit confidence bands or standard errors to match the precision used elsewhere.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and will revise the manuscript to incorporate additional robustness checks and methodological clarifications that strengthen the presentation of our results.

Point-by-point responses
  1. Referee: [Data and Methodology] Data and Methodology section (description of archive join): the multi-second tail in ingestion delay is reported alongside the sub-50 ms median, yet no robustness table restricts the agreement-rate calculation to low-delay matches or realigns using purely on-chain timestamps. Because the 0.615 panel-mean agreement is the load-bearing result, any systematic mismatch during rapid price moves could downward-bias the statistic without demonstrating weakness in the inference algorithm itself.

    Authors: We agree that a robustness check is warranted given the reported delay distribution. The current join already anchors on on-chain OrderFilled timestamps as ground truth, with public-feed events matched within a narrow window. We will add a new appendix table recomputing the panel-mean agreement rate (and its confidence interval) after restricting to matches with ingestion delay below the 90th percentile. We will also report a version that realigns exclusively on on-chain timestamps. This will directly test whether the low agreement rate is an artifact of high-delay observations. revision: yes
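
    The proposed low-delay robustness check could look like the sketch below. It assumes each matched observation carries an ingestion delay, an inferred sign, and an on-chain sign, and it uses a nearest-rank percentile; the tuple layout, function names, and toy data are hypothetical and not taken from the replication package.

```python
import math
from typing import List, Sequence, Tuple

# (ingestion delay in ms, feed-inferred sign, on-chain sign)
Match = Tuple[float, int, int]


def nearest_rank_percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def agreement(matches: Sequence[Match]) -> float:
    """Share of matches where the inferred sign equals the on-chain sign."""
    if not matches:
        raise ValueError("no matches")
    return sum(inferred == truth for _, inferred, truth in matches) / len(matches)


def low_delay_agreement(matches: Sequence[Match], q: float = 90.0) -> float:
    """Agreement restricted to matches with delay strictly below percentile q."""
    cutoff = nearest_rank_percentile([delay for delay, _, _ in matches], q)
    return agreement([m for m in matches if m[0] < cutoff])


delays = [10, 20, 30, 40, 50, 60, 70, 80, 90, 5000]
truth = [1, 1, 1, 1, 1, -1, -1, -1, 1, -1]
matches: List[Match] = [(float(d), 1, t) for d, t in zip(delays, truth)]

full_rate = agreement(matches)
restricted_rate = low_delay_agreement(matches)
```

    Comparing `full_rate` with `restricted_rate` is the point of the check: a large gap would suggest that high-delay matches drive the low agreement, while similar values would point to the inference rule itself.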

  2. Referee: [Results] Results on trade-direction accuracy (panel of 600 markets): the claim that feed-inferred direction changes the sign of effective half-spread on 67%/50% of markets and Kyle's lambda on 60%/43% across two 7-day windows lacks detail on bucket construction, weighting, or statistical significance of the sign flips. Without these, it is unclear whether the reported fractions reflect economically meaningful reversals or noise from the join.

    Authors: We will expand the relevant section to specify that buckets are 5-minute intervals per market-day, that sign changes are evaluated at the market level with equal weighting across the pre-registered panel of 600 markets, and that the two 7-day windows are non-overlapping. We will also add a brief description of how bootstrap standard errors are used to assess the reliability of the sign flips. These clarifications will be placed in the main text and the replication package will be updated to include the exact bucket-level code. revision: partial
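
    One way to attach a bootstrap standard error to a market-level sign-flip fraction is sketched below. It assumes equal weighting and resampling of markets with replacement; the rebuttal does not describe the actual bootstrap design, so the function names and toy indicators are hypothetical.

```python
import random
import statistics
from typing import Sequence


def flip_fraction(flips: Sequence[int]) -> float:
    """Equal-weighted share of markets whose estimate changes sign."""
    return sum(flips) / len(flips)


def bootstrap_se(flips: Sequence[int], n_boot: int = 2000, seed: int = 0) -> float:
    """Standard deviation of the flip fraction across market resamples."""
    rng = random.Random(seed)
    n = len(flips)
    draws = [sum(rng.choices(flips, k=n)) / n for _ in range(n_boot)]
    return statistics.stdev(draws)


# Toy panel (assumed): 100 markets, 67 of which show a sign flip.
flips = [1] * 67 + [0] * 33
fraction = flip_fraction(flips)
se = bootstrap_se(flips)
```

    For a binary indicator the result should sit close to the analytic value sqrt(p(1 − p)/n), about 0.047 in this toy panel. A market-level bootstrap like this ignores within-market uncertainty in each sign estimate.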

Circularity Check

0 steps flagged

No circularity: purely empirical data joins and statistical summaries

Full rationale

The paper reports direct measurements from joining a public order-book feed archive to on-chain OrderFilled events on a pre-registered panel of markets. All eight stylized facts and the central agreement-rate result (panel mean 0.615) are computed as straightforward statistical summaries of the joined data, with no equations, fitted parameters, or derivations that reduce to their own inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes; the work contains no derivation chain at all.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is empirical and relies on standard market microstructure assumptions rather than new theoretical constructs, fitted parameters, or invented entities.

axioms (1)
  • Domain assumption: the public order-book feed and on-chain trade records can be joined accurately, without material matching errors.
    Invoked in the measurement of trade direction agreement and all stylized facts derived from the joined dataset.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages · 1 internal anchor

  1. Jonathan Brogaard, Terrence Hendershott, and Ryan Riordan. High-frequency trading and price discovery. Review of Financial Studies, 27(8): 2267-2306, 2014.
  2. Lin William Cong, Xi Li, Ke Tang, and Yang Yang. Crypto wash trading. Management Science, 69(11): 6427-6454, 2023.
  3. Philipp D. Dubach. Replication package: The anatomy of a decentralized prediction market, 2026. URL https://doi.org/10.5281/zenodo.19811426
  4. Katrina Ellis, Roni Michaely, and Maureen O'Hara. The accuracy of trade classification rules: Evidence from Nasdaq. Journal of Financial and Quantitative Analysis, 35(4): 529-551, 2000.
  5. Thierry Foucault, Marco Pagano, and Ailsa Röell. Market Liquidity: Theory, Evidence, and Policy. Oxford University Press, 2013.
  6. Lawrence R. Glosten and Lawrence E. Harris. Estimating the components of the bid/ask spread. Journal of Financial Economics, 21(1): 123-142, 1988.
  7. Robin Hanson. Logarithmic market scoring rules for modular combinatorial information aggregation. Journal of Prediction Markets, 1(1): 3-15, 2007.
  8. Joel Hasbrouck. Empirical Market Microstructure: The Institutions, Economics, and Econometrics of Securities Trading. Oxford University Press, 2007.
  9. Roger D. Huang and Hans R. Stoll. The components of the bid-ask spread: A general approach. Review of Financial Studies, 10(4): 995-1034, 1997.
  10. Charles M. C. Lee and Mark J. Ready. Inferring trade direction from intraday data. Journal of Finance, 46(2): 733-746, 1991.
  11. Ananth Madhavan, Matthew Richardson, and Mark Roomans. Why do security prices change? A transaction-level analysis of NYSE stocks. Review of Financial Studies, 10(4): 1035-1064, 1997.
  12. Charles F. Manski. Interpreting the predictions of prediction markets. Economics Letters, 91(3): 425-429, 2006.
  13. Maureen O'Hara. Market Microstructure Theory. Blackwell Publishing, 1995.
  14. Lionel Page and Robert T. Clemen. Do prediction markets produce well-calibrated probability forecasts? The Economic Journal, 123(568): 491-513, 2013.
  15. Nahid Rahman, Joseph Al-Chami, and Jeremy Clark. SoK: Market microstructure for decentralized prediction markets (DePMs). arXiv preprint arXiv:2510.15612, 2025. URL https://arxiv.org/abs/2510.15612
  16. Erik Snowberg and Justin Wolfers. Explaining the favorite-long shot bias: Is it risk-love or misperceptions? Journal of Political Economy, 118(4): 723-746, 2010.
  17. Richard H. Thaler and William T. Ziemba. Anomalies: Parimutuel betting markets: Racetracks and lotteries. Journal of Economic Perspectives, 2(2): 161-174, 1988.
  18. Kwok Ping Tsang and Zichao Yang. The anatomy of Polymarket: Evidence from the 2024 presidential election. arXiv preprint arXiv:2603.03136, 2026. URL https://arxiv.org/abs/2603.03136
  19. Justin Wolfers and Eric Zitzewitz. Prediction markets. Journal of Economic Perspectives, 18(2): 107-126, 2004.