Structure and Content of the Visible Darknet
In this paper, we analyze the topology and the content found on the "darknet", the set of websites accessible via Tor. We created a darknet spider and crawled the darknet starting from a bootstrap list by recursively following links. We explored the whole connected component of more than 34,000 hidden services, of which we found 10,000 to be online. Contrary to folklore belief, the visible part of the darknet is surprisingly well-connected through hub websites such as wikis and forums. We performed a comprehensive categorization of the content using supervised machine learning. We observe that about half of the visible dark web content is related to apparently licit activities based on our classifier. A significant amount of content pertains to software repositories, blogs, and activism-related websites. Among unlawful hidden services, most pertain to fraudulent websites, services selling counterfeit goods, and drug markets.
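The crawl described in the abstract (start from a bootstrap list, follow links recursively, and record which services respond) can be illustrated as a breadth-first traversal. The following is only a minimal sketch on an in-memory toy graph, not the authors' spider: the names `crawl_component`, `fetch_links`, and `TOY_WEB`, as well as the example addresses, are hypothetical, and no network access is performed.

```python
from collections import deque


def crawl_component(seeds, fetch_links):
    """Breadth-first traversal starting from a bootstrap list.

    fetch_links(site) is an assumed callback that returns an iterable of
    linked sites, or None when the site is offline.
    Returns (discovered, online) as two sets.
    """
    seen = set(seeds)      # every address we have learned about
    online = set()         # addresses that actually answered
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        links = fetch_links(site)
        if links is None:  # offline: known address, but nothing to follow
            continue
        online.add(site)
        for target in links:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, online


# Hypothetical toy link graph standing in for fetched pages.
# "gone.onion" is linked but has no entry, so it counts as offline.
# "isolated.onion" is online but unreachable from the seed.
TOY_WEB = {
    "wiki.onion": ["forum.onion", "blog.onion", "gone.onion"],
    "forum.onion": ["wiki.onion", "market.onion"],
    "blog.onion": [],
    "market.onion": ["wiki.onion"],
    "isolated.onion": [],
}

discovered, online = crawl_component(["wiki.onion"], TOY_WEB.get)
print(sorted(discovered))
print(sorted(online))
```

In this toy run the traversal discovers more addresses than it finds online, and a service that no reachable page links to is never visited, which is why such a crawl only covers the component connected to the bootstrap list.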
Forward citations
Cited by 2 Pith papers
- A traffic analysis attack against Introduction Protocol and Onion Services
  A practical intersection attack identifies each hop toward a Tor onion service using single-relay observations and repeated probe intersections within INTRODUCE1-RENDEZVOUS2 intervals.
- Topical Shifts in the Dark Web: A Longitudinal Analysis of Content from the Cybercrime Ecosystem
  Longitudinal topic modeling on a large dark web dataset finds 75% of discussion volume in persistent core topics with a median lifespan of 75 months and only 3% in short-lived themes.