Polymarket-v1 Database

Boka Qin; Rui Yang

arXiv: 2606.04217 · v2 · pith:BIAG7H2Lnew · submitted 2026-06-02 · 💻 cs.CE · q-fin.ST · q-fin.TR

Pith reviewed 2026-06-28 07:44 UTC · model grok-4.3

keywords: prediction markets, microstructure, aggressor classification, VPIN, Brier score, on-chain data, Gibbs spread, trade direction

The pith

Ground-truth aggressor direction from the blockchain settlement layer shows that true VPIN predicts Brier scores, while classified proxies recover that relation only in attenuated form.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper assembles the full on-chain trade record for Polymarket's first exchange, yielding 1.2 billion records with exact aggressor flags taken directly from settlement rather than inferred. Standard classifiers such as the tick rule and bulk volume classification achieve only near-random aggregate accuracy (49.83 and 50.51 percent) because prediction markets exhibit persistent direction autocorrelation and concentrated market-making, violating the mean-reversion premise those tools assume. These label errors distort VPIN and order-flow imbalance (OFI) enough to weaken their statistical links to market calibration. With the accurate labels the authors recover a positive relation between true VPIN and Brier scores and a negative relation between Gibbs spread and Brier scores; both relations are materially attenuated when the same metrics are computed from classified data instead.

Core claim

The complete on-chain archive supplies 100 percent ground-truth aggressor direction unavailable in prior prediction-market data sets. Tick-rule and bulk-volume classifiers achieve only 49.83 percent and 50.51 percent aggregate accuracy, with systematic price-level bias arising from positive trade-direction autocorrelation and concentrated market-making. These errors cause inferred VPIN to diverge from ground-truth VPIN and bias OFI estimates. Ground-truth VPIN positively predicts Brier scores while Gibbs spread negatively predicts them, yet the same relationships are materially attenuated when ground-truth metrics are replaced by classified proxies.
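To make the benchmarking step concrete, here is a minimal sketch of a tick-rule classifier scored against known aggressor labels. The prices and labels are invented toy values, and the function names `tick_rule` and `accuracy` are illustrative; nothing here reproduces the paper's own pipeline.

```python
from typing import List, Optional


def tick_rule(prices: List[float]) -> List[Optional[int]]:
    """Label trades +1 (buyer-initiated) or -1 (seller-initiated).

    An uptick is labeled +1, a downtick -1, and a zero tick inherits the
    most recent label. Trades before the first price change stay None.
    """
    labels: List[Optional[int]] = []
    last: Optional[int] = None
    for i, price in enumerate(prices):
        if i > 0:
            if price > prices[i - 1]:
                last = 1
            elif price < prices[i - 1]:
                last = -1
        labels.append(last)
    return labels


def accuracy(predicted: List[Optional[int]], truth: List[int]) -> float:
    """Share of classified trades whose label matches the ground truth."""
    pairs = [(p, t) for p, t in zip(predicted, truth) if p is not None]
    if not pairs:
        return float("nan")
    return sum(p == t for p, t in pairs) / len(pairs)


# Toy data, invented for illustration (not drawn from the database).
prices = [0.50, 0.51, 0.51, 0.50, 0.50, 0.52]
true_dir = [1, 1, -1, -1, 1, 1]

pred = tick_rule(prices)
print(pred)                      # [None, 1, 1, -1, -1, 1]
print(accuracy(pred, true_dir))  # 0.6
```

In this toy sequence the two misses are both zero-tick trades, where the rule can only repeat its previous guess.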

What carries the argument

The 100 percent ground-truth aggressor direction extracted from the blockchain settlement layer, used both to benchmark classical classifiers and to validate microstructure metrics against subsequent forecast accuracy.

If this is right

Classification errors propagate directly into VPIN and OFI, producing biased transaction-cost estimates.
True VPIN rises with worse Brier scores, consistent with informed volume coinciding with poorer calibration.
Gibbs spread falls with worse Brier scores, consistent with high-spread markets drawing informed specialists rather than noise traders.
Any study that substitutes classified proxies for ground-truth metrics will understate the strength of microstructure-forecast linkages.
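The first consequence above, label errors propagating into VPIN, can be illustrated with a small sketch. It assumes the common volume-bucket form of VPIN (mean absolute buy/sell imbalance over equal-volume buckets); the bucket size, window, and toy trades are assumptions chosen for readability, not the paper's parameters.

```python
from typing import List, Tuple


def volume_buckets(
    trades: List[Tuple[float, int]], bucket_size: float
) -> List[Tuple[float, float]]:
    """Split signed trades into equal-volume buckets.

    trades: (volume, direction) pairs, direction +1 for buys, -1 for sells.
    Returns (buy_volume, sell_volume) for each completed bucket. A trade
    that crosses a bucket boundary is split across the two buckets.
    """
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    buckets: List[Tuple[float, float]] = []
    buy = sell = 0.0
    for volume, direction in trades:
        remaining = volume
        while remaining > 0:
            room = bucket_size - (buy + sell)
            take = min(room, remaining)
            if direction > 0:
                buy += take
            else:
                sell += take
            remaining -= take
            if buy + sell >= bucket_size - 1e-12:
                buckets.append((buy, sell))
                buy = sell = 0.0
    return buckets


def vpin(buckets: List[Tuple[float, float]], window: int) -> float:
    """Mean absolute imbalance over the last `window` buckets."""
    recent = buckets[-window:]
    if not recent:
        return float("nan")
    total = sum(b + s for b, s in recent)
    return sum(abs(b - s) for b, s in recent) / total


# Toy trades: identical volumes, two different sets of direction labels.
volumes = [60.0, 40.0, 30.0, 70.0, 50.0, 50.0]
true_dirs = [1, -1, 1, 1, -1, -1]
classified_dirs = [1, 1, 1, -1, -1, 1]  # hypothetical classifier output

true_vpin = vpin(volume_buckets(list(zip(volumes, true_dirs)), 100.0), 3)
proxy_vpin = vpin(volume_buckets(list(zip(volumes, classified_dirs)), 100.0), 3)
print(round(true_vpin, 4), round(proxy_vpin, 4))
```

The volumes are the same in both runs, so the gap between the two values comes entirely from the direction labels.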

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Prediction-market platforms may need classifiers explicitly adjusted for persistent direction autocorrelation rather than relying on equity-market defaults.
The same ground-truth labels could be used to train market-specific classifiers that recover accurate VPIN at scale.
On-chain settlement data from other decentralized prediction or betting venues would allow direct tests of whether the same classification failures appear outside Polymarket.

Load-bearing premise

The on-chain settlement layer supplies 100 percent accurate aggressor direction for every trade record without extraction errors or ambiguities.

What would settle it

Re-estimating the VPIN-Brier and Gibbs-Brier regressions on the same markets after replacing ground-truth labels with tick-rule labels at the observed error rate and finding that the slope coefficients remain statistically indistinguishable from the ground-truth results.
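The mechanics of such a relabeling test can be sketched on synthetic data. Everything below is an assumption made for illustration: the per-market imbalance measure stands in for VPIN, the outcome variable is generated from a made-up linear rule, and labels are flipped independently at a 50 percent rate, which ignores the price-level gradient the paper reports.

```python
import random
from math import sqrt
from typing import List, Tuple


def ols_slope(xs: List[float], ys: List[float]) -> float:
    """Slope of a one-regressor least-squares fit of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def imbalance(directions: List[int]) -> float:
    """Absolute net direction per trade, a crude stand-in for VPIN."""
    return abs(sum(directions)) / len(directions)


def simulate(
    n_markets: int = 300, n_trades: int = 200, flip_prob: float = 0.5, seed: int = 7
) -> Tuple[List[float], List[float], List[float]]:
    """Return (true metric, relabeled metric, synthetic outcome) per market."""
    rng = random.Random(seed)
    true_x, noisy_x, y = [], [], []
    for _ in range(n_markets):
        p_buy = rng.uniform(0.2, 0.8)  # market-specific buy propensity
        dirs = [1 if rng.random() < p_buy else -1 for _ in range(n_trades)]
        flipped = [-d if rng.random() < flip_prob else d for d in dirs]
        x = imbalance(dirs)
        true_x.append(x)
        noisy_x.append(imbalance(flipped))
        # Made-up outcome rule: intercept 0.10, slope 0.20, Gaussian noise.
        y.append(0.10 + 0.20 * x + rng.gauss(0.0, 0.02))
    return true_x, noisy_x, y


true_x, noisy_x, y = simulate()
print("slope on true metric:     ", round(ols_slope(true_x, y), 3))
print("slope on relabeled metric:", round(ols_slope(noisy_x, y), 3))
print("correlations:", round(pearson(true_x, y), 3), round(pearson(noisy_x, y), 3))
```

In this toy setup the relabeled metric carries almost no information about the outcome. The test described above asks whether the real data behave the same way or whether the relabeled coefficients stay close to the ground-truth ones.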

Figures

Figures reproduced from arXiv: 2606.04217 by Boka Qin, Rui Yang.

**Figure 8.** Direction Autocorrelation by Price Bin (captioned Figure 10 in the source). x-axis: price bin, 0.00-0.01 through 0.95-0.96; y-axis: direction autocorrelation ρ, −1.00 to 1.00. Trade direction autocorrelation ρ = Corr(D_t, D_{t−1}) by price bin, with m…
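The statistic in this figure, ρ = Corr(D_t, D_{t−1}) computed within price bins, can be sketched as follows. Assigning each consecutive pair to the bin of the later trade's price and using a 0.10-wide bin are assumptions made for a compact example (the figure's labels suggest 0.01-wide bins); the trades are invented.

```python
from collections import defaultdict
from math import floor, sqrt
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; NaN when either series has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def autocorr_by_bin(
    prices: List[float], directions: List[int], width: float = 0.10
) -> Dict[int, float]:
    """Lag-1 direction autocorrelation, grouped by price-bin index."""
    current: Dict[int, List[float]] = defaultdict(list)
    previous: Dict[int, List[float]] = defaultdict(list)
    for t in range(1, len(directions)):
        # Small epsilon guards against float error at bin edges.
        bin_index = floor(prices[t] / width + 1e-9)
        current[bin_index].append(directions[t])
        previous[bin_index].append(directions[t - 1])
    return {b: pearson(current[b], previous[b]) for b in current}


# Toy trades: a persistent run near 0.10 and an alternating run near 0.55.
prices = [0.10, 0.10, 0.11, 0.12, 0.12, 0.55, 0.56, 0.55, 0.57, 0.56]
directions = [1, 1, 1, -1, -1, 1, -1, 1, -1, 1]

rho = autocorr_by_bin(prices, directions)
for b in sorted(rho):
    print(f"bin {b}: rho = {rho[b]:.3f}")
```

A positive ρ means a trade tends to repeat the previous trade's direction, which is the persistence the paper identifies as a problem for the tick rule.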

Original abstract

We introduce the Polymarket-v1 Database: the complete on-chain trade archive of Polymarket's first-generation CTF Exchange on Polygon, spanning 2022-11-21 to 2026-04-28 and covering the full contract lifecycle from first settlement to natural termination. The dataset comprises 1.20 billion trade records across 1.30 million markets with $61 billion in nominal volume. Its defining feature is 100% ground-truth aggressor direction derived from the blockchain settlement layer, a property unavailable in existing prediction market archives, which rely on heuristic inference. We use this truth-aligned archive to benchmark standard microstructure tools and document three findings. First, the tick rule and bulk volume classification achieve near-random aggregate accuracy (49.83% and 50.51%), but this masks a systematic, correctable price-level gradient driven by positive trade direction autocorrelation and concentrated market-making -- two structural features of prediction markets that violate the mean-reversion assumption embedded in classical classifiers. Second, these classification errors propagate into downstream metrics: inferred VPIN diverges substantially from ground-truth VPIN, and OFI estimates are directionally biased, with material consequences for Transaction Cost Analysis. Third, ground-truth microstructure quality predicts forecasting performance in ways that classification-based proxies cannot recover: True VPIN positively predicts Brier scores, while Gibbs spread negatively predicts them -- a selection effect reflecting that high-spread niche markets attract informed specialists rather than noise traders. Replacing ground-truth metrics with classified proxies attenuates both relationships, illustrating that measurement accuracy at the transaction level is a prerequisite for reliable inference about prediction market design and probability calibration.
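For readers less familiar with the calibration measure used in the third finding, a Brier score is the mean squared difference between forecast probabilities and realized binary outcomes, so lower is better. The sketch below uses the standard definition; which price the paper treats as the forecast, and at what horizon, is not stated in the abstract, so the inputs are purely illustrative.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Illustrative forecasts (for example, market prices) and resolved outcomes.
print(brier_score([0.9, 0.2, 0.6], [1, 0, 0]))
# An uninformative 50/50 forecast always scores 0.25.
print(brier_score([0.5, 0.5], [1, 0]))  # 0.25
```

Under this convention, "VPIN positively predicts Brier scores" means that higher VPIN goes with worse calibration.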

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is mainly a data release of 1.2 billion Polymarket trades with blockchain-derived aggressor flags, plus evidence that tick rule and bulk volume classifiers are near-random here.

The paper's core offering is the Polymarket-v1 Database: the full on-chain archive from the CTF exchange on Polygon, with 1.2 billion records and claimed 100% ground-truth aggressor direction pulled directly from the settlement layer. That feature is new relative to prior prediction-market datasets that rely on heuristics.

What stands out is the benchmarking. The tick rule hits 49.83% accuracy and bulk volume classification 50.51%, with the errors tied to positive autocorrelation and concentrated market-making that break the usual mean-reversion assumptions. Those misclassifications then show up in VPIN divergence and biased OFI estimates, which matters for any transaction-cost work. The third part links ground-truth VPIN positively to Brier scores and Gibbs spread negatively to them, while proxies weaken the signals.
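For reference, bulk volume classification does not label individual trades. It splits each bar's volume into buy and sell shares using the standardized price change. The sketch below uses the standard normal CDF variant and the population standard deviation of the price changes in the sample; both choices, and the toy bars, are assumptions rather than details taken from the paper.

```python
from math import erf, sqrt
from statistics import pstdev
from typing import List, Tuple


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def bulk_volume_classify(
    bar_prices: List[float], bar_volumes: List[float]
) -> List[Tuple[float, float]]:
    """Split each bar's volume into (buy, sell) shares.

    bar_prices holds n + 1 closing prices, bar_volumes the volume traded in
    each of the n bars between consecutive closes. The buy share of a bar is
    the normal CDF of its standardized price change.
    """
    changes = [b - a for a, b in zip(bar_prices, bar_prices[1:])]
    sigma = pstdev(changes)
    splits = []
    for change, volume in zip(changes, bar_volumes):
        frac = 0.5 if sigma == 0 else normal_cdf(change / sigma)
        splits.append((volume * frac, volume * (1.0 - frac)))
    return splits


# Toy bars: an up move, a flat bar, and a down move.
splits = bulk_volume_classify([0.50, 0.52, 0.52, 0.49], [100.0, 80.0, 120.0])
for buy, sell in splits:
    print(round(buy, 2), round(sell, 2))
```

A flat bar is split evenly, so when price changes carry little information about direction the method drifts toward a 50/50 guess.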

The ground-truth claim is the load-bearing piece. The abstract presents it as error-free because it comes from the settlement layer, but the stress-test point about atomic multi-leg trades, partial fills, or decoding mismatches is worth checking in the methods section. If the extraction logic is laid out clearly and handles those cases, the accuracy numbers hold; if not, the downstream gaps shrink. The analysis steps for the Brier relationships are only summarized in the abstract, so the full text needs to show the exact regressions and robustness checks.
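The edge cases named here can be made concrete with a sketch of how fill records might be collapsed into trades. The `Fill` schema, its field names, and the hash strings are entirely hypothetical; they do not reflect the CTF Exchange event layout or the authors' extraction code, which this page does not describe.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Fill:
    """One fill record (hypothetical schema, for illustration only)."""
    tx_hash: str     # settlement transaction containing the fill
    order_hash: str  # order that was (partially) filled
    is_taker: bool   # True for the order that crossed the book
    side: int        # +1 buy, -1 sell
    amount: float    # filled size


def trades_from_fills(fills: List[Fill]) -> Dict[str, dict]:
    """Collapse all fills in one transaction into a single trade record."""
    by_tx: Dict[str, List[Fill]] = defaultdict(list)
    for f in fills:
        by_tx[f.tx_hash].append(f)
    trades = {}
    for tx, group in by_tx.items():
        takers = [f for f in group if f.is_taker]
        if len({f.side for f in takers}) != 1:
            continue  # no taker or conflicting taker sides: leave unlabeled
        makers = [f for f in group if not f.is_taker]
        trades[tx] = {
            "direction": takers[0].side,
            "volume": sum(f.amount for f in makers),
            "legs": len(makers),
        }
    return trades


def cumulative_filled(fills: List[Fill]) -> Dict[str, float]:
    """Total filled amount per order, which exposes partial fills."""
    totals: Dict[str, float] = defaultdict(float)
    for f in fills:
        totals[f.order_hash] += f.amount
    return dict(totals)


fills = [
    Fill("0xaaa", "T1", True, 1, 100.0),   # taker buy matched against two makers
    Fill("0xaaa", "M1", False, -1, 60.0),
    Fill("0xaaa", "M2", False, -1, 40.0),  # M2 partially filled here
    Fill("0xbbb", "T2", True, -1, 30.0),
    Fill("0xbbb", "M3", False, 1, 30.0),
    Fill("0xccc", "T3", True, 1, 25.0),
    Fill("0xccc", "M2", False, -1, 25.0),  # and filled further here
]

print(trades_from_fills(fills))
print(cumulative_filled(fills)["M2"])  # 65.0
```

The `continue` branch is where the "100% ground truth" claim would be tested: any transaction that lands there needs an explicit rule in the methods section.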

This is for people doing microstructure or calibration work on prediction markets who want real trade-level data instead of inferred labels. The dataset scale and the concrete classifier failures give it a clear audience. It deserves peer review because the data contribution is concrete and the classifier results are falsifiable against the released archive, even if the later metric relationships need tighter documentation.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces the Polymarket-v1 Database: 1.20 billion on-chain trade records from Polymarket's CTF Exchange on Polygon (2022-11-21 to 2026-04-28) across 1.30 million markets and $61 billion nominal volume. Its core contribution is the provision of 100% ground-truth aggressor direction extracted from the blockchain settlement layer. Using this archive the authors benchmark the tick rule (49.83% accuracy) and bulk volume classification (50.51% accuracy), attribute the near-random performance to positive trade-direction autocorrelation and concentrated market-making, document propagation of classification errors into VPIN and OFI, and report that ground-truth VPIN positively and Gibbs spread negatively predict Brier scores while classification-based proxies attenuate both relationships.

Significance. If the ground-truth aggressor flags are verifiably error-free, the database supplies a large-scale, externally validated resource for prediction-market microstructure that is unavailable in existing archives. The documented divergences between inferred and true VPIN/OFI, together with the differential predictive power for Brier scores, would constitute concrete evidence that transaction-level direction accuracy is a prerequisite for reliable inference on forecasting performance and market design.
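The OFI divergence mentioned here can be illustrated with a trade-based imbalance measure. Defining OFI as signed volume over total volume is an assumption made for this sketch (order-book-based definitions also exist, and the page does not say which one the paper uses); the volumes and labels are invented.

```python
from typing import Sequence


def order_flow_imbalance(volumes: Sequence[float], directions: Sequence[int]) -> float:
    """Signed volume divided by total volume, in [-1, 1]."""
    total = sum(volumes)
    if total == 0:
        return 0.0
    signed = sum(v * d for v, d in zip(volumes, directions))
    return signed / total


volumes = [10.0, 20.0, 30.0, 40.0]
true_dirs = [1, 1, 1, -1]         # ground-truth aggressor side
classified_dirs = [1, -1, -1, -1]  # hypothetical classifier output

print(order_flow_imbalance(volumes, true_dirs))        # 0.2
print(order_flow_imbalance(volumes, classified_dirs))  # -0.8
```

Two mislabeled trades are enough to flip the sign of the measure in this example, the kind of directional bias that would matter for transaction cost analysis.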

major comments (2)

[Abstract] The claim that the dataset supplies '100% ground-truth aggressor direction derived from the blockchain settlement layer' is load-bearing for every accuracy number, divergence result, and Brier-score relationship, yet the manuscript supplies no description of the extraction algorithm, handling of atomic multi-leg settlements, partial fills, or contract-event decoding ambiguities.
[Abstract] The statements that 'True VPIN positively predicts Brier scores, while Gibbs spread negatively predicts them' and that 'Replacing ground-truth metrics with classified proxies attenuates both relationships' are presented without reference to the underlying statistical specifications, sample construction, or robustness checks, preventing assessment of whether these selection-effect interpretations are supported by the data.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed reading and constructive comments on the abstract. Both points identify areas where additional clarity would strengthen the manuscript. We address each below and will revise accordingly.

Referee: [Abstract] The claim that the dataset supplies '100% ground-truth aggressor direction derived from the blockchain settlement layer' is load-bearing for every accuracy number, divergence result, and Brier-score relationship, yet the manuscript supplies no description of the extraction algorithm, handling of atomic multi-leg settlements, partial fills, or contract-event decoding ambiguities.

Authors: We agree that the abstract does not describe the extraction procedure. The full manuscript contains a methods section that specifies the on-chain event decoding logic, the treatment of atomic multi-leg settlements as single transactions, the identification of partial fills via cumulative fill events, and the resolution of contract-event ambiguities through the CTF settlement contract ABI. To make this transparent at the point of first reading, we will add a single sentence to the abstract summarizing the extraction approach and will include an explicit cross-reference to the methods section. Revision: yes.

Referee: [Abstract] The statements that 'True VPIN positively predicts Brier scores, while Gibbs spread negatively predicts them' and that 'Replacing ground-truth metrics with classified proxies attenuates both relationships' are presented without reference to the underlying statistical specifications, sample construction, or robustness checks, preventing assessment of whether these selection-effect interpretations are supported by the data.

Authors: The abstract condenses results that are fully specified in the empirical section: market-day panel regressions of Brier score on VPIN and Gibbs spread with market-type fixed effects, volume controls, and robustness to alternative sample windows and winsorization. The attenuation result is shown via side-by-side coefficient comparisons. We will revise the abstract to include brief parenthetical references to the regression specification and sample definition, thereby directing readers to the supporting details without lengthening the abstract excessively. Revision: yes.
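The regression design named in the second response can be sketched with a within (fixed-effects) estimator: demean every variable by market type, then run least squares on the demeaned data. The sketch handles two regressors only, omits the volume controls and standard errors, and uses noise-free synthetic data built from made-up coefficients (0.3 on VPIN, -0.5 on spread) so that the recovered values can be checked.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple


def demean_by_group(groups: Sequence[str], values: Sequence[float]) -> List[float]:
    """Subtract each group's mean from its members (within transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]


def within_ols(
    groups: Sequence[str], y: Sequence[float], x1: Sequence[float], x2: Sequence[float]
) -> Tuple[float, float]:
    """Two-regressor least squares with group fixed effects absorbed."""
    yd = demean_by_group(groups, y)
    ad = demean_by_group(groups, x1)
    bd = demean_by_group(groups, x2)
    saa = sum(a * a for a in ad)
    sbb = sum(b * b for b in bd)
    sab = sum(a * b for a, b in zip(ad, bd))
    say = sum(a * v for a, v in zip(ad, yd))
    sby = sum(b * v for b, v in zip(bd, yd))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("regressors are collinear within groups")
    # Solve the 2x2 normal equations by Cramer's rule.
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det


# Synthetic rows: (market type, VPIN, spread). All values are invented.
rows = [
    ("politics", 0.2, 0.04), ("politics", 0.5, 0.02), ("politics", 0.8, 0.06),
    ("sports", 0.1, 0.10), ("sports", 0.4, 0.05), ("sports", 0.6, 0.08),
]
alpha = {"politics": 0.10, "sports": 0.20}  # made-up fixed effects
groups = [g for g, _, _ in rows]
vpin = [v for _, v, _ in rows]
spread = [s for _, _, s in rows]
brier = [alpha[g] + 0.3 * v - 0.5 * s for g, v, s in rows]

b_vpin, b_spread = within_ols(groups, brier, vpin, spread)
print(round(b_vpin, 6), round(b_spread, 6))
```

Running the same estimator twice, once with ground-truth metrics and once with classified proxies, would give the side-by-side coefficient comparison the authors describe.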

Circularity Check

0 steps flagged

No circularity; empirical database and benchmarks are self-contained

The paper introduces an on-chain trade archive and uses its claimed ground-truth aggressor flags to benchmark tick-rule and bulk-volume classifiers, then reports divergences in VPIN/OFI and correlations between true microstructure metrics and Brier scores. No derivation chain reduces a claimed prediction or result to a fitted parameter or self-citation by construction; the central findings are direct empirical comparisons against an external data source rather than self-referential equations or renamed inputs. The work is a standard empirical contribution whose validity hinges on the accuracy of the blockchain extraction, not on internal definitional loops.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper's claims rest on the assumption that the extracted on-chain records are complete and correctly label aggressor direction for the entire period.

axioms (1)

domain assumption: Blockchain settlement layer supplies 100% accurate aggressor direction for every trade.
This property is stated as the defining feature that distinguishes the archive from heuristic-based datasets.

pith-pipeline@v0.9.1-grok · 5815 in / 1074 out tokens · 20765 ms · 2026-06-28T07:44:14.484163+00:00 · methodology
