arxiv: 1806.01156 · v3 · pith:XUUHSNUDnew · submitted 2018-06-04 · 💻 cs.CR · cs.NI

Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation

classification 💻 cs.CR cs.NI
keywords rankings composition alexa domains list lists manipulate manipulation

In order to evaluate the prevalence of security and privacy practices on a representative sample of the Web, researchers rely on website popularity rankings such as the Alexa list. While the validity and representativeness of these rankings are rarely questioned, our findings show the contrary: we show for four main rankings how their inherent properties (similarity, stability, representativeness, responsiveness and benignness) affect their composition and therefore potentially skew the conclusions made in studies. Moreover, we find that it is trivial for an adversary to manipulate the composition of these lists. We are the first to empirically validate that the ranks of domains in each of the lists are easily altered, in the case of Alexa through as little as a single HTTP request. This allows adversaries to manipulate rankings on a large scale and insert malicious domains into whitelists or bend the outcome of research studies to their will. To overcome the limitations of such rankings, we propose improvements to reduce the fluctuations in list composition and guarantee better defenses against manipulation. To allow the research community to work with reliable and reproducible rankings, we provide Tranco, an improved ranking that we offer through an online service available at https://tranco-list.eu.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Invisible Ink of the Android Malware World: A Longitudinal Study on the Usage of Covert Communication Channels

    cs.CR 2026-06 unverdicted novelty 7.0

    First longitudinal study reports covert channel usage in Android malware grew exponentially from 0.30% in 2012 to 50% in 2025 across 511 families, with some switching channels multiple times.

  2. WebSP-Eval: Evaluating Web Agents on Website Security and Privacy Tasks

    cs.CR 2026-04 unverdicted novelty 7.0

    WebSP-Eval shows that multimodal LLM-based web agents fail more than 45% of the time on security and privacy tasks involving stateful UI elements such as toggles and checkboxes.

  3. BaRA: Budget-constrained and Reliable Web Data Collection Agent

    cs.IR 2026-05 unverdicted novelty 5.0

    BaRA combines bounded BFS traversal with self-reflection to improve site-level web data collection and multimodal extraction for LLMs.

  4. Global Web, Local Privacy? An International Review of Web Tracking

    cs.CR 2026-04 conditional novelty 5.0

    Global top sites show 50.5% fewer tracker connections from EU countries than non-EU ones, and ignoring cookie banners cuts trackers by 48.5% in Germany.

  5. BaRA: Budget-constrained and Reliable Web Data Collection Agent

    cs.IR 2026-05 unverdicted novelty 4.0

    BaRA improves valid link discovery and multimodal artifact extraction in budget-constrained web data collection via BFS liveness checks, rule-based validation, and self-reflection.