
arxiv: 1907.01538 · v1 · pith:ZDBTTRBRnew · submitted 2019-07-02 · 💻 cs.CR · cs.SI

Taint analysis of the Bitcoin network

Pith reviewed 2026-05-25 10:50 UTC · model grok-4.3

keywords: Bitcoin · taint analysis · wallet scoring · TaintRank · cryptocurrency trust · transaction history · blockchain forensics · exchange security

The pith

TaintRank scores a Bitcoin wallet's taint by aggregating every address it has transacted with in its history.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TaintRank to address the lack of any ratings that tell vendors or exchanges how tainted the Bitcoins they receive might be. It calculates the score for a given wallet by factoring in the full set of addresses that wallet has interacted with over time. This approach would let exchanges evaluate the risk of trading with a particular wallet and reduce their exposure to stolen or ill-gotten funds. Without such a method, exchanges have no systematic way to gauge the trustworthiness of incoming Bitcoins.

Core claim

TaintRank is a Bitcoin address taint score that provides insight into a specific wallet by taking into consideration the addresses it has interacted with throughout its history. This ranking method gives Bitcoin exchange companies insight into whom they are trading with.

What carries the argument

TaintRank, a numerical score that aggregates a wallet's complete historical transaction partners to produce a taint indicator.

If this is right

  • Exchanges can now obtain a concrete numerical signal about the risk level of Bitcoins received from any given address.
  • Vendors gain a way to screen incoming payments before accepting them.
  • The Bitcoin network gains its first systematic ranking of addresses according to their interaction histories.
  • Liability for receiving stolen coins can be reduced by checking the score before a trade completes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If exchanges adopt the score, high-TaintRank addresses could face effective blacklisting even without legal orders.
  • Users might begin routing transactions through more intermediaries to lower their visible score.
  • The same aggregation logic could be applied to other blockchains that record address interactions.
  • Regulators could later require disclosure of TaintRank values for large transfers.

Load-bearing premise

That a wallet's past transaction partners can be aggregated into a reliable numerical indicator of taint without additional validation or external ground truth.

What would settle it

A dataset of known theft cases where wallets with high TaintRank scores show no higher rate of involvement than low-scoring wallets would falsify the method's reliability.
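
One way to make this check concrete is to compare theft-involvement rates between high-scoring and low-scoring wallets. The sketch below is only an illustration of that comparison, not the paper's evaluation: the `scores` and `involved` inputs, the `threshold` cut-off, and the function name `involvement_rates` are all hypothetical.

```python
def involvement_rates(scores, involved, threshold):
    """Return (high_rate, low_rate): the share of theft-involved wallets
    among those scoring >= threshold and among those scoring below it.

    scores:   dict wallet -> TaintRank-style score (hypothetical input)
    involved: set of wallets known to be involved in theft cases
    """
    high = [w for w, s in scores.items() if s >= threshold]
    low = [w for w, s in scores.items() if s < threshold]

    def rate(group):
        # An empty group has no defined rate; report None instead of 0.
        if not group:
            return None
        return sum(1 for w in group if w in involved) / len(group)

    return rate(high), rate(low)


# Toy data, invented purely for illustration.
scores = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.2}
involved = {"a", "c"}
high_rate, low_rate = involvement_rates(scores, involved, threshold=0.5)
```

On real ground truth, a `high_rate` that is not above `low_rate` would be the falsifying outcome described above. A proper study would also need a significance test, which this sketch leaves out.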

Figures

Figures reproduced from arXiv: 1907.01538 by Andraž Povše, Uroš Hercog.

Figure 1
Degree distribution in the created network for both in and out degree.
Surrounding text: For further examination, we look at five addresses with the highest node degree, as seen in Table 2.
Figure 2
Degree distribution in the subgraph created from the network for both in and out degree.
Surrounding text: To reach the last node starting from the tainted one, we have to make 1660 steps. Based on this observation, we have more than enough nodes that require different TaintRank scores, since we know they are not equally tainted. With this, we hope to gain some insight into nodes that are truly tainted, and make sure that …
Figure 4
Using distance to determine TaintRank. Displayed are the top 1000 highest TaintRank scores.
Surrounding text: Our reasoning for this method was that the further a node is from the tainted node, the less tainted it gets. Of course, this kind of spreading leaves us vulnerable when a thief intentionally makes a long chain before spending the stolen Bitcoins. D. Combined approach. With the above, we have seen the shortcomings o…
Figure 3
Using link weight and node value to determine TaintRank. Displayed are the top 1000 highest TaintRank scores for both approaches to determining the value of the node. The y-axis is logarithmic for better separation.
Surrounding text: The benefit of this approach is that nodes are not all tainted equally but rather in proportion to the amount received. Negatives include a dropping TaintRank as we split the stolen funds into lots of…
Figure 5
Comparison of different approaches to calculating TaintRank. Displayed are the top 1000 nodes by highest TaintRank.
Surrounding text: E. PageRank-like propagation. For this method we define our taint propagation similarly to how PageRank works. First let's define $m_i$ as all in-edges for node $n_i$, $m'_i$ as all tainted in-edges for node $n_i$, and $k_i$ and $k'_i$ as the number of tainted and all out-edges for node $n_i$. We define an init…
Figure 6
PageRank-like propagation for non-clustered with one iteration.
Surrounding text: 6. Discussion. We've developed five different methods and all of them give at least slightly different results. This makes the evaluation even more difficult. Knowing whether a specific method's results are relevant, without understanding the underlying socioeconomic mechanics, makes this problem even more difficult to tackle. Results provided b…
Figure 7
PageRank-like propagation for non-clustered with five iterations.
Figure 8
PageRank-like propagation for clustered with one iteration.
Figure 9
PageRank-like propagation for clustered with five iterations.
Surrounding text: … all of the approaches. We have seen the pros and cons of each and tried some combinations of them. The computations are very efficient: the first four iterative approaches have time complexity O(n). The assigned scores give us some insight into which nodes we would like to avoid as a Bitcoin exchange company or a merchant. If …
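
The distance-based idea quoted under Figure 4 (taint fades as a node gets further from the tainted source) can be sketched as a breadth-first search over out-links. This is a minimal illustration under stated assumptions: the excerpts do not give the paper's actual decay formula, so the `1 / (1 + distance)` rule, the adjacency-dict input, and the function name `distance_taint` are placeholders chosen for the example.

```python
from collections import deque


def distance_taint(out_edges, source):
    """Assign each node reachable from `source` a taint that shrinks
    with its hop distance. The 1 / (1 + distance) decay is an assumed
    placeholder, not the formula used in the paper.

    out_edges: dict node -> iterable of nodes it sent funds to
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in out_edges.get(node, ()):
            if neighbour not in distance:  # first visit = shortest path
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return {node: 1.0 / (1 + d) for node, d in distance.items()}


# Toy graph: thief -> a -> b, plus a direct hop thief -> c.
graph = {"thief": ["a", "c"], "a": ["b"]}
taint = distance_taint(graph, "thief")
```

The sketch also exposes the weakness the authors mention: every extra hop a thief inserts lowers the score of the final address, whatever decay rule is chosen.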

Original abstract

Determining the trust of an individual Bitcoin wallet is a difficult problem. There are no ratings, that offer vendors or exchanges meaningful information about the level of the taint of Bitcoins they are receiving. Lack of such information places exchanges liable in an event when the received Bitcoins are stolen or ill-gotten. In this paper, we try to solve this problem by introducing a Bitcoin address taint score called TaintRank. It provides insight into a specific wallet by taking the addresses it interacted with throughout history into consideration. This ranking method provides such Bitcoin exchange companies insight with whom they are trading.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces TaintRank, a Bitcoin address taint score intended to give exchanges insight into wallet trustworthiness by aggregating the addresses with which a given wallet has interacted over its transaction history.

Significance. A reproducible, validated numerical indicator of taint derived solely from the interaction graph could be useful for compliance and risk assessment at exchanges. The manuscript supplies neither the aggregation function nor any empirical validation against known tainted flows, so the practical utility cannot be assessed.

major comments (2)
  1. [Abstract] The central claim that TaintRank 'provides insight into a specific wallet by taking the addresses it interacted with throughout history into consideration' cannot be evaluated because the manuscript contains no algorithm, no equations, no dataset, and no evaluation of the proposed score.
  2. [Abstract] Without a precise definition of the aggregation function or any ground-truth comparison, it is impossible to determine whether the resulting numerical score tracks actual taint or is an untested modeling assumption.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review and constructive feedback on our manuscript introducing TaintRank. We address each major comment below and will revise the manuscript to incorporate the requested details.

  1. Referee: [Abstract] The central claim that TaintRank 'provides insight into a specific wallet by taking the addresses it interacted with throughout history into consideration' cannot be evaluated because the manuscript contains no algorithm, no equations, no dataset, and no evaluation of the proposed score.

    Authors: The referee is correct that the submitted manuscript presents TaintRank only at a high level and does not include an explicit algorithm, equations, dataset description, or empirical evaluation. This omission prevents full assessment of the central claim. We will revise the manuscript to add a precise algorithmic definition of TaintRank, the relevant equations for the aggregation function, a description of the dataset used, and an initial evaluation of the score.
    revision: yes

  2. Referee: [Abstract] Without a precise definition of the aggregation function or any ground-truth comparison, it is impossible to determine whether the resulting numerical score tracks actual taint or is an untested modeling assumption.

    Authors: We agree that the current manuscript lacks both a precise definition of the aggregation function and any ground-truth comparison. Without these elements the practical meaning of the numerical score cannot be verified. In the revision we will supply the exact aggregation function together with, to the extent feasible, comparisons against known tainted transaction flows or other reference data.
    revision: yes

Circularity Check

0 steps flagged

No circularity detected; TaintRank introduced as direct definitional aggregation with no equations or self-referential reductions.

The provided abstract defines TaintRank solely by its construction from historical address interactions and supplies neither equations, fitted parameters, predictions, nor citations. Absent any derivation chain, no step reduces to inputs by construction, self-citation, or renaming. The paper's central claim is therefore a modeling assumption presented without internal circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Abstract-only review supplies no explicit free parameters, axioms, or independent evidence for the invented scoring entity.

invented entities (1)
  • TaintRank (no independent evidence)
    purpose: Bitcoin address taint score derived from interaction history
    Newly named construct introduced to quantify wallet trust

pith-pipeline@v0.9.0 · 5623 in / 994 out tokens · 30712 ms · 2026-05-25T10:50:26.744149+00:00 · methodology


Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages · 2 internal anchors

  1. [1]

    The hacks dispossessed them of hundreds of thousands of bitcoins

    Motivation. Another consequence of the quick growth of interest in cryptocurrencies and the lack of proper system security audits were the many hacks of exchanges. The hacks dispossessed them of hundreds of thousands of bitcoins. Because the Bitcoin network offers pseudo-anonymity to its users, it prevents any linking between addresses and any personall...

  2. [2]

    Related work. The area of cryptocurrency network and transaction analysis has received quite a lot of attention in recent years. Researchers have done extensive research on the analysis of the level of anonymity that the network provides (1) or with the use of different third-party services called mixers or tumblers (2–4). These services promise to obfuscate the rea...

  3. [3]

    We first construct a directed network where each node represents a Bitcoin address and each directed link a weighted transaction between the two addresses

    Methods. The methods we tested are based on node importance. We first construct a directed network where each node represents a Bitcoin address and each directed link a weighted transaction between the two addresses. Each of the nodes in the network receives its own TaintRank based on different parameters such as node degree and edge weight. We use prior knowled...

  4. [4]

    It has millions of addresses and hundreds of millions of transactions

    Selected data. Since its inception, the Bitcoin network has grown in size exponentially. It has millions of addresses and hundreds of millions of transactions. This scale poses a great challenge when trying to analyze it. To make the problem more contained, we select a subset of the network, which proves to be a good decision. We extract 285,591 transactions...

  5. [5]

    In the selected data, we know the address where the initial stolen funds went and we base our propagation on it

    Results. With the methods described in Section 3, we test our approach on the network. In the selected data, we know the address where the initial stolen funds went and we base our propagation on it. Based on that, we start to propagate the taint throughout the network by recursively following all the out links. The number of nodes which could be reached following...

  6. [6]

    This makes the evaluation even more difficult

    Discussion. We’ve developed five different methods and all of them give at least slightly different results. This makes the evaluation even more difficult. Knowing whether a specific method’s results are relevant, without understanding the underlying socioeconomic mechanics, makes this problem even more difficult to tackle. Results provided by the approaches make ...

  7. [7]

    The ever-expanding nature makes it very difficult to process the whole network fast enough for it to stay relevant with every new block

    Conclusion. Analyzing the Bitcoin network is a big challenge just by itself. Its ever-expanding nature makes it very difficult to process the whole network fast enough for it to stay relevant with every new block. With the increasing popularity, the number of unique addresses and transactions between them grows exponentially. But given enough resources and s...

  8. [8]

    An Analysis of Anonymity in the Bitcoin System

    Reid F, Harrigan M (2011) An Analysis of Anonymity in the Bitcoin System. arXiv e-prints p. arXiv:1107.4524

  9. [9]

    de Balthasar T, Hernandez-Castro J (2017) An analysis of bitcoin laundry services in Nordic Conference on Secure IT Systems. (Springer), pp. 297–312

  10. [10]

    van Wegberg R, Oerlemans JJ, van Deventer O (2018) Bitcoin money laundering: mixed results? an explorative study on money laundering of cybercrime proceeds using bitcoin. Journal of Financial Crime 25(2):419–435

  11. [11]

    Möser M, Böhme R, Breuker D (2013) An inquiry into money laundering tools in the bitcoin ecosystem in 2013 APWG eCrime Researchers Summit. (IEEE), pp. 1–14

  12. [12]

    Spagnuolo M, Maggi F, Zanero S (2014) Bitiodine: Extracting intelligence from the bitcoin network in International Conference on Financial Cryptography and Data Security. (Springer), pp. 457–468

  13. [13]

    Malovrh J (2018) Bitcoin anomalies analysis

  14. [14]

    Battiston S, Puliga M, Kaushik R, Tasca P, Caldarelli G (2012) Debtrank: Too central to fail? financial networks, the fed and systemic risk. Scientific Reports 2:541

  15. [15]

    (2013) A fistful of bitcoins: characterizing payments among men with no names in Proceedings of the 2013 conference on Internet measurement conference

    Meiklejohn S, et al. (2013) A fistful of bitcoins: characterizing payments among men with no names in Proceedings of the 2013 conference on Internet measurement conference. (ACM), pp. 127–140

  16. [16]

    Maurer FK, Neudecker T, Florian M (2017) Anonymous coinjoin transactions with arbitrary values in 2017 IEEE Trustcom/BigDataSE/ICESS. (IEEE), pp. 522–529.
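
The Methods and Results excerpts in entries [3] and [5] above describe two steps: building a directed, weighted address network, then following out-links from the known tainted address. The sketch below illustrates both steps under assumptions that are not taken from the paper: the flat `(sender, receiver, amount)` tuple format and the names `build_network` and `reachable_from` are hypothetical, and the traversal uses an explicit stack where the excerpt speaks of recursion.

```python
from collections import defaultdict


def build_network(transactions):
    """Build a directed, weighted address graph.

    transactions: iterable of (sender, receiver, amount) tuples; this
    flat format is an assumption made for the example.
    Returns dict sender -> dict receiver -> total amount sent.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sender, receiver, amount in transactions:
        graph[sender][receiver] += amount  # merge repeated transfers
    return graph


def reachable_from(graph, source):
    """Collect every address reachable from `source` via out-links.

    Uses an explicit stack rather than recursion so that long chains
    do not hit Python's recursion limit.
    """
    seen = {source}
    stack = [source]
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, {}):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


# Toy transactions, invented purely for illustration.
txs = [("thief", "a", 2.0), ("thief", "a", 1.0), ("a", "b", 0.5), ("c", "d", 4.0)]
network = build_network(txs)
tainted = reachable_from(network, "thief")
```

The iterative traversal is a deliberate design choice here: the Figure 2 excerpt reports chains as long as 1660 steps, which a naive recursive version in Python would likely fail on at the default recursion limit.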