arxiv: 2604.17183 · v1 · submitted 2026-04-19 · 💻 cs.CE · cs.LG · econ.EM

A Model and Estimation of the Bitcoin Transaction Fee

Pith reviewed 2026-05-10 06:17 UTC · model grok-4.3

classification 💻 cs.CE · cs.LG · econ.EM
keywords Bitcoin · transaction fees · mempool · fee market · Vickrey-Clarke-Groves mechanism · confirmation delay · structural estimation · congestion

The pith

Bitcoin fees price the marginal value of priority as congestion shapes expected confirmation delays in the mempool.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a structural model of Bitcoin fee choice by treating the mempool as a market for scarce blockspace and using high-frequency panel data from a running node that tracks arrivals, exits, inclusions, fee bumps, and congestion. It first estimates a monotone delay technology that connects fee-rate priority and network state to expected confirmation times, then models how fees respond to that technology along with transaction features. The results show congestion as the dominant driver of delay, fees incorporating the rising marginal value of moving up the priority queue, and replace-by-fee, child-pays-for-parent, and block conditions exerting sizable effects. This matters because blockchain data alone cannot observe the queueing environment that determines fees, especially as block subsidies decline.

Core claim

The paper characterizes the mempool as a Vickrey-Clarke-Groves mechanism for allocating blockspace and derives an equation to estimate fees from it. In the first stage a monotone delay technology is estimated that links fee-rate priority and network state to expected confirmation delay. Fees are then shown to respond to this technology and to transaction characteristics, producing the findings that congestion is the main determinant of delay, that the marginal value of priority is priced in fees and increases in the gradient of confirmation time reduction per movement up the fee queue, and that transactor choices of RBF, CPFP, and block conditions have economically important effects on fees.
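
The first-stage idea can be illustrated with a minimal sketch. It is not the paper's estimator: the figure captions indicate that the Stage 1 delay model is a random forest, whereas the sketch below swaps in pool-adjacent-violators isotonic regression as a simple monotone fit. The function names (`pava_decreasing`, `delay_gradient`) and the toy delays are hypothetical.

```python
def pava_decreasing(delays):
    """Least-squares non-increasing fit (pool adjacent violators).

    `delays` must be ordered from lowest to highest fee-rate priority,
    so a monotone delay technology means fitted delay never rises.
    """
    blocks = []  # each block is [sum_of_values, count]
    for value in delays:
        blocks.append([value, 1])
        # Merge while an earlier block's mean is below the later block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def delay_gradient(fitted):
    """Change in fitted delay per one-step move up the priority ordering."""
    return [fitted[i + 1] - fitted[i] for i in range(len(fitted) - 1)]


# Hypothetical observed delays (in blocks), lowest priority first.
observed = [10, 12, 7, 8, 3, 1]
fitted = pava_decreasing(observed)
gradient = delay_gradient(fitted)
```

In a two-stage setup of this kind, the fitted gradient (not the raw delay) would then enter the second-stage fee equation as a regressor.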

What carries the argument

The mempool treated as a Vickrey-Clarke-Groves mechanism, which supplies the structural equation for estimating fees from the estimated delay technology.
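
A textbook illustration of VCG pricing in a priority queue may help fix ideas. It is not the paper's Equation 5, which this page does not reproduce. The sketch assumes, purely for illustration, that one transaction confirms per block, that transaction i bears a constant cost `delay_costs[i]` for each block it waits, and that the queue is ordered by that cost. Under those assumptions each transaction's VCG payment is the extra waiting cost its presence imposes on everyone behind it.

```python
def vcg_queue_payments(delay_costs):
    """VCG payment per transaction, keyed by its index in `delay_costs`.

    Assumes unit-size transactions and one confirmation per block, so
    removing a transaction moves everyone behind it up by exactly one block.
    """
    # Efficient ordering: highest per-block delay cost goes first.
    order = sorted(range(len(delay_costs)),
                   key=lambda i: delay_costs[i], reverse=True)
    payments = {}
    cost_behind = 0.0
    # Walk from the back of the queue, accumulating the costs of those behind.
    for i in reversed(order):
        payments[i] = cost_behind
        cost_behind += delay_costs[i]
    return payments


# Hypothetical per-block delay costs for three transactions.
payments = vcg_queue_payments([5.0, 1.0, 3.0])
```

In this toy setting, the payment difference between adjacent queue positions equals the delay cost of the transaction that gets displaced, which is one way to read the statement that fees price the marginal value of priority.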

If this is right

  • Congestion levels primarily determine transaction confirmation delays.
  • The marginal value of priority increases with greater reductions in confirmation time per movement up the fee queue.
  • Transactor choices of replace-by-fee, child-pays-for-parent, and block conditions exert economically significant effects on observed fees.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Wallet fee estimators could incorporate real-time congestion snapshots to improve predictions of required fees.
  • Changes to network rules that alter bumping options would directly shift equilibrium fee levels under the model.
  • The same structural approach could be applied to other proof-of-work chains to compare their blockspace markets.
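
The first extension above can be sketched as follows. This is a hypothetical estimator, not something the paper implements. It assumes miners fill blocks greedily by fee rate, approximates block capacity as 1,000,000 virtual bytes (the 4,000,000 weight-unit limit divided by 4), and ignores both the new transaction's own size and ancestor packages. The snapshot format, the function name, and the `floor_rate` default are all assumptions.

```python
BLOCK_VBYTES = 1_000_000  # 4,000,000 weight units / 4


def fee_rate_for_target(snapshot, target_blocks, floor_rate=1.0):
    """Fee rate (sat/vB) of the first queued transaction that would miss
    the next `target_blocks` blocks; a new arrival would need to outbid it.

    `snapshot` is a list of (fee_rate_sat_per_vb, vsize_vbytes) pairs.
    Returns `floor_rate` when the whole snapshot fits within the target.
    """
    capacity = target_blocks * BLOCK_VBYTES
    used = 0
    for fee_rate, vsize in sorted(snapshot, key=lambda tx: tx[0], reverse=True):
        if used + vsize > capacity:
            return max(fee_rate, floor_rate)
        used += vsize
    return floor_rate


# Hypothetical congestion snapshot: three large fee-rate buckets.
snapshot = [(50.0, 600_000), (20.0, 500_000), (5.0, 400_000)]
next_block_rate = fee_rate_for_target(snapshot, target_blocks=1)
two_block_rate = fee_rate_for_target(snapshot, target_blocks=2)
```

A version informed by the paper's model would presumably condition on the estimated delay gradient as well, rather than on queue depth alone.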

Load-bearing premise

Users truthfully reveal their value for priority through fee choice in the modeled mechanism, without strategic manipulation beyond the included bumping options.

What would settle it

High-frequency mempool data showing that fees fail to rise with steeper gradients of confirmation-time reduction from higher priority would falsify the claim that marginal priority value is priced in fees.
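
A stripped-down version of that check might look like the sketch below. It assumes each observation pairs a log fee rate with a "steepness" measure, defined here as the absolute confirmation-time reduction per step up the fee queue; the paper's own regressor and sign convention may differ, and its fixed effects, controls, and clustered standard errors are omitted. The data are invented for illustration.

```python
def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical data: steeper delay gradients paired with higher log fee rates.
steepness = [0.0, 1.0, 2.0, 3.0]
log_fee_rate = [1.0, 1.5, 2.0, 2.5]

slope = ols_slope(steepness, log_fee_rate)
# A slope at or below zero in real data would count against the claim.
claim_survives = slope > 0
```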

Figures

Figures reproduced from arXiv: 2604.17183 by Armin Sabouri, Daniel Aronoff, Kristian Praizner.

Figure 1: Source: Cambridge Centre for Alternative Energy
Figure 2: A chart of the fee distribution in our data, sorted by fee order. On visual inspection it has a convex shape that tends to increase in the order (from left to right). This motivates our choice of a log form of the VCG equation (Equation 5) as the basis of our estimating equation.
Figure 3: Actual vs. predicted log(waittime) for a random 10,000-observation subsample; the dashed line is the 45-degree reference. Feature importance. Blockspace utilization dominates, accounting for 52.0% of total feature importance in the fitted forest, followed by mempool transaction count (24.6%), mempool size (13.7%), and priority percentile (9.7%). This ordering confirms that the level of network congestion…
Figure 4: Random Forest feature importances for the Stage 1 delay technology model.
Figure 5: Delay–priority relationship stratified by mempool congestion regime. The
Figure 6: Left: Breakdown of epochs by delay gradient regime. Trivial epochs (flat delay technology, Ŵ′ ≈ 0) account for 79.8% of the sample; the structural priority channel is active in only 20.2% of epochs. Right: Log fee-rate distributions by regime. The two distributions largely overlap, with the non-trivial regime exhibiting a slightly heavier right tail consistent with congestion-driven fee differentiation. …
Figure 7: Actual vs. predicted log fee rate with a 10k subsample plotted.
Figure 8: Left: Residual density with a fitted normal overlay. Right: Residuals vs. fitted values
Figure 9: displays the base-feature coefficient estimates with 95% epoch-clustered confidence intervals
Figure 10: Intraclass correlation coefficients for all model features.
Figure 11: Cumulative precision curve for α̂₁: the observed SE (red) tracks the theoretical 1/√k rate (green dashed). Rolling-window stability. We re-estimate the full model on five non-overlapping temporal windows of approximately 400 epochs each. The structural coefficient α̂₁ ranges from −0.054 to −0.039 across windows (
Figure 12: Rolling-window FWL coefficient estimates (
Figure 13: Out-of-sample R² learning curves under three evaluation protocols; the shaded wedge shows the 7–9 pp ΔR² contributed by the delay gradient.
Figure 14: Variance decomposition of the 80/20 temporal split: the model absorbs within-epoch variance but increases between-epoch variance, confirming cross-sectional rather than temporal explanatory power. Epoch fixed-effect persistence. The autocorrelation of the estimated epoch fixed effects (
Figure 15: Autocorrelogram of epoch fixed effects. Implications. The structural coefficients drift across congestion regimes (α̂₁ ranges from −0.054 to −0.039 across five temporal windows), yet the model generalises well out-of-sample: the VCG channel adds 7–9 percentage points of R² within epochs even when trained on disjoint time periods. The variance decomposition confirms that the model’s strength is cross-sectiona…
Figure 16: Second-stage coefficient estimates with 95% confidence intervals (with spline). Blue bars indicate positive effects on log fee rate; red bars indicate negative effects. Impatience has little to no effect. This is nearly identical to the baseline linear specification, indicating that the core VCG channel is not an artefact of imposing a linear impatience term or omitting impatience entirely. Spline fit and…
Figure 17: Aggregate impatience effect. Left graph is the sum of each of the 7 spline
Original abstract

Bitcoin transaction fees will become more important as the block subsidy declines, but fee formation is hard to study with blockchain data alone because the relevant queueing environment is unobserved. We develop and estimate a structural model of Bitcoin fee choice that treats the mempool as a market for scarce blockspace. We assemble a novel, high-frequency mempool panel from a self-run Bitcoin node that records transaction arrivals, exits, block inclusion, fee-bumping events, and congestion snapshots. We characterize the fee market as a Vickrey-Clarke-Groves mechanism and derive an equation to estimate fees. In the first stage we estimate a monotone delay technology linking fee-rate priority and network state to expected confirmation delay. We then estimate how fees respond to that delay technology and to transaction characteristics. We find that congestion is the main determinant of delay; that the marginal value of priority is priced in fees and is increasing in the gradient of confirmation-time reduction per movement up the fee queue; and that transactor choices of RBF and CPFP, together with block conditions, have economically important effects on fees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops a structural model of Bitcoin transaction fee formation, characterizing the mempool as a Vickrey-Clarke-Groves mechanism for allocating scarce blockspace. Using a novel high-frequency mempool panel assembled from a self-run Bitcoin node (capturing arrivals, exits, inclusions, fee-bumping events, and congestion snapshots), it estimates a two-stage model: first, a monotone delay technology mapping fee-rate priority and network state to expected confirmation delay; second, the response of observed fees to this technology and transaction characteristics (including RBF, CPFP, and block conditions). The central findings are that congestion is the main determinant of delay, that fees price the marginal value of priority (increasing in the gradient of confirmation-time reduction per queue position), and that the modeled strategic options have economically important effects.

Significance. If the results hold, the work offers a valuable structural framework for understanding fee markets in Bitcoin and similar blockchains, especially as the block subsidy declines. The novel high-frequency dataset from a self-run node is a clear strength, enabling direct observation of the queueing environment that blockchain data alone cannot provide. The two-stage approach with an explicit delay technology provides a falsifiable link between congestion, priority gradients, and fees, which could inform both empirical studies and protocol design.

major comments (2)
  1. [Abstract / Model Description] Abstract and model description: The structural interpretation that 'the marginal value of priority is priced in fees' and that this value increases with the confirmation-time gradient rests directly on the VCG truthful-revelation assumption (with only the modeled RBF/CPFP and block-condition options treated as strategic deviations). The paper should provide either (a) a formal test or robustness check for unmodeled strategies (e.g., preemptive overbidding to deter future RBF or coordination across related transactions) or (b) an explicit statement of how the identifying variation in the second-stage fee equation remains valid if the VCG mapping is only approximate.
  2. [First-stage estimation] First-stage delay technology estimation: The claim that 'congestion is the main determinant of delay' requires reported standard errors, robustness to alternative functional forms for the monotone technology, and checks that the estimated gradient is not mechanically driven by the priority ordering itself. Without these, the second-stage finding that fees respond to the gradient cannot be assessed for statistical or economic significance.
minor comments (2)
  1. [Abstract] The abstract would be strengthened by including the key estimating equations or at least the functional form of the delay technology and the second-stage fee equation.
  2. [Data and Variables] Clarify the exact definition of 'network state' variables used in the delay technology and whether they are observed at the moment of transaction arrival or averaged over the confirmation window.
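
The distinction raised in the second minor comment can be made concrete with a small sketch. It assumes a hypothetical layout in which congestion snapshots are stored as parallel, time-sorted lists; the function names and the numbers are illustrative and do not describe the paper's data pipeline.

```python
from bisect import bisect_right


def state_at_arrival(snapshot_times, snapshot_values, arrival):
    """Most recent snapshot taken at or before `arrival` (None if none)."""
    idx = bisect_right(snapshot_times, arrival) - 1
    return snapshot_values[idx] if idx >= 0 else None


def state_over_window(snapshot_times, snapshot_values, arrival, confirmed):
    """Unweighted mean of snapshots taken within [arrival, confirmed]."""
    window = [value for t, value in zip(snapshot_times, snapshot_values)
              if arrival <= t <= confirmed]
    return sum(window) / len(window) if window else None


# Hypothetical snapshots: timestamps in seconds, mempool transaction counts.
times = [0, 10, 20, 30]
tx_counts = [100, 200, 400, 300]

at_arrival = state_at_arrival(times, tx_counts, arrival=12)
over_window = state_over_window(times, tx_counts, arrival=12, confirmed=30)
```

One reason the distinction matters: the arrival-time value uses only information available when the fee was chosen, while the window average includes congestion realized afterwards.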

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which highlight important aspects of our structural approach and estimation. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.

  1. Referee: [Abstract / Model Description] Abstract and model description: The structural interpretation that 'the marginal value of priority is priced in fees' and that this value increases with the confirmation-time gradient rests directly on the VCG truthful-revelation assumption (with only the modeled RBF/CPFP and block-condition options treated as strategic deviations). The paper should provide either (a) a formal test or robustness check for unmodeled strategies (e.g., preemptive overbidding to deter future RBF or coordination across related transactions) or (b) an explicit statement of how the identifying variation in the second-stage fee equation remains valid if the VCG mapping is only approximate.

    Authors: We appreciate the referee's emphasis on the role of the VCG assumption in our structural interpretation. Our model derives the fee equation from the VCG mechanism for blockspace allocation while explicitly incorporating the main observed strategic options (RBF, CPFP, and block conditions) as deviations. Unmodeled behaviors such as preemptive overbidding or cross-transaction coordination are not directly tested in the current version, as our high-frequency mempool panel focuses on capturing the realized queueing environment rather than counterfactual strategy spaces. However, the identifying variation in the second-stage fee equation stems from observed differences in estimated delay gradients across congestion states and transaction types, which are directly measured from the self-run node data. This variation remains informative for the marginal value of priority even under an approximate VCG mapping, as fees are shown to track the expected confirmation-time reduction per queue position. We will add an explicit statement in the revised manuscript clarifying these identifying assumptions and discussing the robustness of the second-stage results to departures from exact VCG revelation. revision: partial

  2. Referee: [First-stage estimation] First-stage delay technology estimation: The claim that 'congestion is the main determinant of delay' requires reported standard errors, robustness to alternative functional forms for the monotone technology, and checks that the estimated gradient is not mechanically driven by the priority ordering itself. Without these, the second-stage finding that fees respond to the gradient cannot be assessed for statistical or economic significance.

    Authors: We agree that standard errors and further robustness checks are essential for supporting the first-stage claims and enabling evaluation of the second-stage results. In the revised version, we will report standard errors for all first-stage delay technology parameters. We will also include robustness checks using alternative monotone functional forms (e.g., different parametric or semi-parametric specifications that preserve monotonicity in fee-rate priority and congestion). On the concern that the gradient could be mechanically driven by priority ordering: the delay technology is estimated from observed confirmation times conditional on both fee-rate rank and exogenous network-state snapshots (congestion levels), with variation arising across different mempool realizations rather than from the ordering alone. We will add explicit checks, including subsample analyses by congestion regime and comparisons of gradients for similar priority positions under varying states, to demonstrate that the estimated effects reflect congestion-driven delay rather than mechanical rank. These additions will allow statistical and economic assessment of the second-stage fee responses. revision: yes
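
The subsample comparison promised in the second response could take roughly the following shape. This is a sketch under stated assumptions, not the authors' procedure: observations are hypothetical `(regime, priority_bin, delay_blocks)` triples, priority bins are integers with higher meaning closer to the front of the queue, and the gradient is a crude end-to-end difference in mean delay per bin step.

```python
from collections import defaultdict


def mean_delay_by_cell(observations):
    """Mean delay for each (regime, priority_bin) cell."""
    sums = defaultdict(lambda: [0.0, 0])
    for regime, priority_bin, delay in observations:
        cell = sums[(regime, priority_bin)]
        cell[0] += delay
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def regime_gradients(cell_means):
    """Per regime: change in mean delay per bin, lowest to highest bin."""
    by_regime = defaultdict(dict)
    for (regime, priority_bin), mean_delay in cell_means.items():
        by_regime[regime][priority_bin] = mean_delay
    gradients = {}
    for regime, bins in by_regime.items():
        lo, hi = min(bins), max(bins)
        if hi > lo:  # a gradient needs at least two distinct bins
            gradients[regime] = (bins[hi] - bins[lo]) / (hi - lo)
    return gradients


# Hypothetical observations: the same two priority bins in two regimes.
observations = [
    ("low", 0, 1.0), ("low", 0, 1.0), ("low", 1, 1.0),
    ("high", 0, 8.0), ("high", 0, 6.0), ("high", 1, 2.0), ("high", 1, 1.0),
]
gradients = regime_gradients(mean_delay_by_cell(observations))
```

If the gradient were mechanically driven by rank, it would look similar across regimes at the same priority bins; a flat gradient under low congestion and a steep one under high congestion points to congestion as the driver.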

Circularity Check

0 steps flagged

No significant circularity

The paper first assembles mempool panel data and estimates a monotone delay technology from observed arrivals, exits, inclusions, and congestion snapshots. It then derives an estimating equation under the VCG mechanism assumption and estimates how fees respond to the fitted delay gradient and transaction characteristics. This two-stage structure uses data-driven first-stage estimates as inputs to the second stage without any reduction of predictions to inputs by construction, self-definition, or self-citation chains. The VCG characterization is an external identifying assumption drawn from standard mechanism design, not a self-referential step. No load-bearing self-citations, ansatzes smuggled via citation, or renaming of known results appear in the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies insufficient detail to enumerate free parameters, axioms, or invented entities; the VCG characterization and monotone delay technology are invoked without stated assumptions or external benchmarks.

pith-pipeline@v0.9.0 · 5489 in / 1162 out tokens · 39375 ms · 2026-05-10T06:17:14.451295+00:00 · methodology

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

  1. [1] Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1): C1–C68, 2018. doi:10.1111/ectj.12097

  2. [2] Edward H. Clarke. Multipart pricing of public goods. Public Choice, 11(1): 17–33, 1971. doi:10.1007/BF01726210

  3. [3] David Easley, Maureen O'Hara, and Soumya Basu. From mining to markets: The evolution of bitcoin transaction fees. Journal of Financial Economics, 134(1): 91–109, 2019. ISSN 0304-405X. doi:10.1016/j.jfineco.2019.03.004. URL https://www.sciencedirect.com/science/article/pii/S0304405X19300583

  4. [4] Theodore Groves. Incentives in teams. Econometrica, 41(4): 617–631, 1973. doi:10.2307/1914085

  5. [5] Gur Huberman, Jacob D. Leshno, and Ciamac Moallemi. Monopoly without a Monopolist: An Economic Analysis of the Bitcoin Payment System. The Review of Economic Studies, 88(6): 3011–3040, March 2021. doi:10.1093/restud/rdab014. URL https://doi.org/10.1093/restud/rdab014

  6. [6] Alfred Lehar and Christine A. Parlour. Miner collusion and the bitcoin protocol. Technical report, SSRN, March 22, 2020. URL https://ssrn.com/abstract=3559894

  7. [7] Malte Möser and Rainer Böhme. Trends, tips, tolls: A longitudinal study of bitcoin transaction fees. In M. Brenner, N. Christin, B. Johnson, and K. Rohloff, editors, Financial Cryptography and Data Security, 2nd Workshop on BITCOIN Research, volume 8976 of Lecture Notes in Computer Science, pages 19–33. Springer, 2015

  8. [8] Adrian Pagan. Econometric issues in the analysis of regressions with generated regressors. International Economic Review, 25(1): 221–247, 1984. doi:10.2307/2526417

  9. [9] William Vickrey. Counterspeculation, auctions, and competitive sealed tenders. The Journal of Finance, 16(1): 8–37, 1961. doi:10.1111/j.1540-6261.1961.tb02789.x