arxiv: 1811.01348 · v2 · pith:A4BWXSJG · submitted 2018-11-04 · cs.CY · cs.CR · cs.SI

Structure and Content of the Visible Darknet

classification cs.CY cs.CR cs.SI
keywords content, darknet, websites, services, visible, found, hidden, accessible

In this paper, we analyze the topology and the content found on the "darknet", the set of websites accessible via Tor. We created a darknet spider and crawled the darknet starting from a bootstrap list by recursively following links. We explored the whole connected component of more than 34,000 hidden services, of which we found 10,000 to be online. Contrary to folklore belief, the visible part of the darknet is surprisingly well-connected through hub websites such as wikis and forums. We performed a comprehensive categorization of the content using supervised machine learning. We observe that about half of the visible dark web content is related to apparently licit activities based on our classifier. A significant amount of content pertains to software repositories, blogs, and activism-related websites. Among unlawful hidden services, most pertain to fraudulent websites, services selling counterfeit goods, and drug markets.
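The crawl described in the abstract (start from a bootstrap list, follow links recursively, record which services are online) can be illustrated with a minimal breadth-first sketch. This is not the authors' spider: the function `crawl`, the callback `get_links`, and the toy graph with its `.onion` names are all hypothetical, and the example runs on an in-memory dictionary rather than over Tor.

```python
from collections import deque


def crawl(bootstrap, get_links):
    """Breadth-first traversal of a link graph from a bootstrap list.

    get_links(site) is a hypothetical callback that returns the list of
    outgoing links for a site, or None if the site is offline.
    Returns (seen, online): every address discovered, and the subset
    that actually responded.
    """
    seen = set(bootstrap)
    online = set()
    queue = deque(bootstrap)
    while queue:
        site = queue.popleft()
        links = get_links(site)
        if links is None:
            # Address was discovered through a link but did not respond.
            continue
        online.add(site)
        for target in links:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, online


# Toy graph standing in for the network (illustrative names only).
# "gone.onion" is linked to but has no entry, so it counts as offline.
TOY_GRAPH = {
    "wiki.onion": ["forum.onion", "blog.onion", "gone.onion"],
    "forum.onion": ["wiki.onion", "market.onion"],
    "blog.onion": [],
    "market.onion": ["forum.onion"],
}

seen, online = crawl(["wiki.onion"], TOY_GRAPH.get)
print(len(seen), "discovered,", len(online), "online")
```

The gap between `seen` and `online` mirrors the abstract's distinction between the more than 34,000 hidden services discovered and the roughly 10,000 found to be online.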

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A traffic analysis attack against Introduction Protocol and Onion Services

    cs.CR · 2026-02 · conditional · novelty 6.0

    A practical intersection attack identifies each hop toward a Tor onion service using single-relay observations and repeated probe intersections within INTRODUCE1-RENDEZVOUS2 intervals.

  2. Topical Shifts in the Dark Web: A Longitudinal Analysis of Content from the Cybercrime Ecosystem

    cs.CR · 2026-05 · unverdicted · novelty 5.0

    Longitudinal topic modeling on a large dark web dataset finds 75% of discussion volume in persistent core topics with a median lifespan of 75 months and only 3% in short-lived themes.