arxiv: 2605.21937 · v1 · pith:L6DQPVHQnew · submitted 2026-05-21 · 💻 cs.NI

Lost in the Prefix: Revisiting IP Geolocation Accuracy Across Networks and Geographies

Pith reviewed 2026-05-22 03:22 UTC · model grok-4.3

classification 💻 cs.NI
keywords IP geolocation · mobile networks · prefix granularity · BGP announcements · Global South · geolocation accuracy · failure rates · network types

The pith

Coarser IP prefixes cause mobile networks and Global South regions to have far higher geolocation errors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates four major IP geolocation databases against ground-truth locations collected across 175 countries. It finds median errors of 179-207 km on mobile networks versus 3-16 km on fixed networks, and failure rates of 53-61 percent in Asia and 66-72 percent in Africa compared with 9-20 percent in Europe. Both patterns trace to the same source: provider prefixes in mobile networks and Global South geographies are more likely to be coarser than the BGP announcements they cover, and roughly 70 percent of mobile prefixes span more than 100 km. Coarser prefixes produce the largest errors no matter which database, network type, or geography is examined. This points to prefix granularity as a common explanatory factor behind the observed differences in accuracy.

Core claim

The paper establishes that geolocation accuracy depends on the granularity of provider prefixes relative to BGP announcements. When prefixes are coarser, assigned locations become less precise. This produces median errors of 179-207 km on mobile networks versus 3-16 km on fixed networks. The same coarseness drives higher failure rates in Global South regions. About 70 percent of mobile prefixes span more than 100 km geographically, and coarser prefixes consistently yield the highest errors across all providers, network types, and geographies.
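The error figures above are distances between a database's reported coordinates and a ground-truth location, summarized by the median per network type. The following is a minimal Python sketch of that computation, assuming great-circle (haversine) distance; this summary does not state the paper's exact distance formula, and the function names and sample coordinates are illustrative rather than taken from the paper.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def median_error_by_network(observations):
    """Median error per network type.

    observations: iterable of
        (network_type, truth_lat, truth_lon, db_lat, db_lon)
    """
    errors = {}
    for net, tlat, tlon, dlat, dlon in observations:
        errors.setdefault(net, []).append(haversine_km(tlat, tlon, dlat, dlon))
    return {net: median(values) for net, values in errors.items()}


# Synthetic observations, for illustration only.
sample = [
    ("fixed", 0.0, 0.0, 0.0, 0.05),
    ("fixed", 0.0, 0.0, 0.0, 0.10),
    ("mobile", 0.0, 0.0, 0.0, 1.0),
    ("mobile", 0.0, 0.0, 0.0, 2.0),
]
medians = median_error_by_network(sample)
```

The median is used rather than the mean because a handful of wrong-country assignments would otherwise dominate the summary.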

What carries the argument

Prefix granularity, defined as whether provider prefixes are coarser than BGP announcements and how many kilometers they span.
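One way to make the "coarser than BGP" comparison concrete is to compare prefix lengths for overlapping networks. The sketch below uses Python's standard `ipaddress` module; the labels `coarser`, `equal`, `finer`, and `unrelated` and the function name are assumptions for illustration, and the paper's actual classification rules may differ.

```python
import ipaddress


def classify_prefix(provider_prefix, bgp_prefix):
    """Compare a provider prefix with a BGP announcement.

    Returns "coarser" when the provider prefix is shorter (covers more
    addresses) than the overlapping BGP announcement, "finer" when it is
    longer, "equal" when the lengths match, and "unrelated" when the two
    networks do not overlap or differ in IP version.
    """
    provider = ipaddress.ip_network(provider_prefix)
    bgp = ipaddress.ip_network(bgp_prefix)
    if provider.version != bgp.version or not provider.overlaps(bgp):
        return "unrelated"
    if provider.prefixlen < bgp.prefixlen:
        return "coarser"
    if provider.prefixlen > bgp.prefixlen:
        return "finer"
    return "equal"


# Illustrative prefixes from private address space.
label = classify_prefix("10.0.0.0/16", "10.0.32.0/20")
```

A database entry that assigns one location to a whole /16 while the routing table announces /20s inside it would be labeled `coarser` here.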

If this is right

  • Mobile networks will keep showing large errors until their prefixes become finer.
  • Global South regions will continue to have elevated failure rates due to coarser prefixes.
  • Coarser prefixes will produce the highest errors regardless of which geolocation provider is used.
  • Finer prefix granularity would lower errors uniformly across network types and geographies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Applications that rely on IP geolocation for mobile users may need to add separate error estimates by network type.
  • Network operators could improve accuracy by announcing more specific prefixes where possible.
  • Studies using these databases should separate results by network type to reduce systematic bias.
  • Future geolocation methods might directly factor prefix span into their location estimates.

Load-bearing premise

The ground-truth locations from the measurement probes accurately reflect the true geographic positions of the tested IP addresses without bias by network type or region.

What would settle it

A set of mobile prefixes that are fine-grained yet still produce errors above 100 km, or fixed-network prefixes that are coarse yet produce errors below 20 km.
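The falsification test above can be phrased as a filter over per-prefix records. This is a hypothetical sketch: the `PrefixRecord` fields, the reading of "fine-grained" as "not coarser than the BGP announcement", and the sample rows are all assumptions; only the 100 km and 20 km thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class PrefixRecord:
    prefix: str
    network_type: str       # "mobile" or "fixed"
    granularity: str        # "coarser", "equal", or "finer" relative to BGP
    median_error_km: float


def counterexamples(records):
    """Return records that would contradict the prefix-granularity account."""
    hits = []
    for r in records:
        fine_mobile_with_large_error = (
            r.network_type == "mobile"
            and r.granularity != "coarser"
            and r.median_error_km > 100
        )
        coarse_fixed_with_small_error = (
            r.network_type == "fixed"
            and r.granularity == "coarser"
            and r.median_error_km < 20
        )
        if fine_mobile_with_large_error or coarse_fixed_with_small_error:
            hits.append(r)
    return hits


# Synthetic records, for illustration only.
records = [
    PrefixRecord("10.0.0.0/16", "mobile", "coarser", 190.0),  # fits the account
    PrefixRecord("10.1.0.0/24", "mobile", "finer", 150.0),    # counterexample
    PrefixRecord("10.2.0.0/16", "fixed", "coarser", 8.0),     # counterexample
    PrefixRecord("10.3.0.0/24", "fixed", "equal", 5.0),       # fits the account
]
found = [r.prefix for r in counterexamples(records)]
```

A large share of such records in real data would weaken the claim; a few isolated ones would not.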

Figures

Figures reproduced from arXiv: 2605.21937 by Jocelyn Bliton, Shaddi Hasan, Syed Tauhidun Nabi, Tijay Chung.

Figure 1: Geolocation error for fixed and mobile networks.
Figure 2: Geolocation failure rate (error > 100 km) by continent and global region across four providers. Asia and Africa exceed 50% across all providers; Europe and the Americas stay below 25%.
… all satellite), and Turks and Caicos observations assigned to Jamaica by IP2Location and DB-IP (15 observations each). Full wrong-country assignment counts by continent are shown in Appendix (…
Figure 3: Within-prefix geographic spread for fixed and mobile …
Figure 4: Ground truth vantage points from RIPE Atlas (blue circles, n=10,561 probes) and UNICEF Giga (green circles, …
Figure 5: Wrong-country assignment count by continent across all four providers. While rare overall (fewer than 1% of …
Figure 6: CDF of geolocation error by BGP prefix classification across all four providers (IPv4). Larger (Coarser) prefixes …
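The Figure 2 caption defines a failure as an error above 100 km. Under that definition, a per-region failure rate can be sketched as follows; the function name, the input shape, and the sample errors are illustrative assumptions, not the paper's code or data.

```python
from collections import defaultdict

FAILURE_THRESHOLD_KM = 100.0  # threshold taken from the Figure 2 caption


def failure_rate_by_region(observations):
    """Fraction of observations per region whose error exceeds the threshold.

    observations: iterable of (region, error_km) pairs.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for region, error_km in observations:
        totals[region] += 1
        if error_km > FAILURE_THRESHOLD_KM:
            failures[region] += 1
    return {region: failures[region] / totals[region] for region in totals}


# Synthetic errors in km, for illustration only.
sample = [
    ("Europe", 5.0), ("Europe", 12.0), ("Europe", 150.0), ("Europe", 8.0), ("Europe", 30.0),
    ("Africa", 250.0), ("Africa", 90.0), ("Africa", 400.0), ("Africa", 120.0),
]
rates = failure_rate_by_region(sample)
```

An error of exactly 100 km counts as a success here because the comparison is strict; the paper's boundary handling is not stated in this summary.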
Original abstract

IP geolocation databases are widely used in research, policy, and industry, yet their accuracy across network types and geographies remains poorly characterized. We present a large scale evaluation of four major providers (MaxMind GeoLite2, IPinfo, IP2Location, and DB-IP) using ground truth from RIPE Atlas and UNICEF Giga across 175 countries. We find that mobile networks exhibit median errors more than 10 times higher than fixed networks across all providers (179-207 km vs. 3-16 km), and that Global South regions show significantly higher failure rates than Global North: Asia exceeds 53-61% and Africa 66-72%, compared to 9-20% in Europe. We trace both gaps to a shared structural source: provider prefixes in mobile networks and Global South geographies are more likely to be coarser than BGP announcements, and approximately 70% of mobile prefixes span more than 100 km geographically. Our findings point to prefix granularity as a common explanatory factor: coarser prefixes consistently produce the highest errors regardless of provider, network type, or geography.
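The abstract's statement that about 70% of mobile prefixes span more than 100 km implies a per-prefix geographic span. One plausible way to compute it, sketched below, is the maximum pairwise great-circle distance among ground-truth points observed inside a prefix. That definition is an assumption, as are the function names and the sample points; the paper may measure spread differently.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) tuples in degrees."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def prefix_span_km(points):
    """Maximum pairwise distance among ground-truth points in one prefix."""
    return max((haversine_km(p, q) for p, q in combinations(points, 2)), default=0.0)


def share_spanning_more_than(points_by_prefix, threshold_km=100.0):
    """Fraction of prefixes whose span exceeds threshold_km."""
    spans = [prefix_span_km(pts) for pts in points_by_prefix.values()]
    return sum(span > threshold_km for span in spans) / len(spans)


# Synthetic ground-truth points keyed by prefix, for illustration only.
points_by_prefix = {
    "10.0.0.0/16": [(0.0, 0.0), (0.0, 1.0), (0.0, 0.2)],  # about 111 km wide
    "10.1.0.0/24": [(0.0, 0.0), (0.0, 0.5)],              # about 56 km wide
    "10.2.0.0/24": [(0.0, 0.0)],                          # single point, span 0
}
share = share_spanning_more_than(points_by_prefix)
```

The pairwise maximum is quadratic in the number of points per prefix, which is acceptable for a sketch but may need a convex-hull shortcut at scale.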

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a large-scale empirical evaluation of four major IP geolocation providers (MaxMind GeoLite2, IPinfo, IP2Location, and DB-IP) using ground-truth locations from RIPE Atlas probes and UNICEF Giga school measurements across 175 countries. It reports substantially higher median geolocation errors for mobile networks (179-207 km) compared to fixed networks (3-16 km) across all providers, along with higher failure rates in Global South regions (Asia 53-61%, Africa 66-72%) versus Europe (9-20%). The authors attribute both discrepancies to a shared structural cause: provider prefixes in mobile networks and Global South geographies tend to be coarser than corresponding BGP announcements, with approximately 70% of mobile prefixes spanning more than 100 km geographically.

Significance. If the central findings hold, the work is significant because IP geolocation databases are widely deployed in research, policy, and industry applications, yet their performance limitations in mobile networks and Global South regions have been under-characterized. The identification of prefix granularity as a common explanatory factor across network types and geographies offers a concrete, actionable insight that could guide improvements in database construction and usage. The study is strengthened by its scale (175 countries, four providers) and the use of large external ground-truth datasets from RIPE Atlas and UNICEF Giga, which enable direct, falsifiable comparisons rather than relying on internal assumptions.

major comments (2)
  1. [Section 3] Section 3 (Methodology): The description of how provider prefixes are matched to BGP announcements and how failure rates are computed lacks sufficient detail on the exact matching criteria, handling of overlapping prefixes, and threshold definitions. This is load-bearing for the central claim, as the attribution of error gaps and failure rates to coarser prefixes depends directly on these measurement choices.
  2. [Section 4.1] Section 4.1 (Ground-truth data): The paper does not test or discuss potential systematic biases in the RIPE Atlas and UNICEF Giga ground-truth locations by network type or region (e.g., urban bias in mobile probes or routing through regional gateways for school IPs). If such biases correlate with the same factors as prefix coarseness, they could confound the reported median error differences (179-207 km vs. 3-16 km) and failure rate gaps.
minor comments (2)
  1. [Abstract] Abstract: The statistic that 'approximately 70% of mobile prefixes span more than 100 km' should report the precise percentage, the number of prefixes analyzed, and the section where this result is derived for immediate verifiability.
  2. [Figure 1] Figure 1 (or equivalent results figure): The error distribution plots would benefit from explicit annotation of the median values and interquartile ranges directly on the figure to facilitate comparison across network types without requiring cross-reference to the text.
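Major comment 1 asks for the exact matching criteria between provider prefixes and BGP announcements. As a point of reference, one common choice is to keep the most specific (longest) announcement that overlaps the provider prefix. The sketch below shows that choice with Python's standard `ipaddress` module; it is an assumed procedure for illustration, not the authors' method, and the tie-breaking rule and sample routes are hypothetical.

```python
import ipaddress


def most_specific_match(provider_prefix, bgp_announcements):
    """Return the longest BGP announcement overlapping the provider prefix.

    Returns None when nothing overlaps. Ties on prefix length are broken
    by the lowest network address so the result is deterministic.
    """
    provider = ipaddress.ip_network(provider_prefix)
    candidates = []
    for text in bgp_announcements:
        bgp = ipaddress.ip_network(text)
        if bgp.version == provider.version and bgp.overlaps(provider):
            candidates.append(bgp)
    if not candidates:
        return None
    best = max(candidates, key=lambda net: (net.prefixlen, -int(net.network_address)))
    return str(best)


# Hypothetical routing table, for illustration only.
routes = ["10.0.0.0/8", "10.1.0.0/16", "10.1.2.0/24", "192.0.2.0/24"]
match = most_specific_match("10.1.2.128/25", routes)
```

When the provider prefix is itself coarser than several announcements, more than one announcement overlaps it, so the tie-breaking rule becomes part of the methodology and is worth documenting.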

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which have helped us improve the clarity and robustness of our analysis. We address each major comment below.

Point-by-point responses
  1. Referee: [Section 3] Section 3 (Methodology): The description of how provider prefixes are matched to BGP announcements and how failure rates are computed lacks sufficient detail on the exact matching criteria, handling of overlapping prefixes, and threshold definitions. This is load-bearing for the central claim, as the attribution of error gaps and failure rates to coarser prefixes depends directly on these measurement choices.

    Authors: We agree that additional methodological detail is needed for reproducibility. In the revised manuscript we will expand Section 3 to explicitly describe: (i) the prefix-matching procedure, which performs longest-prefix matching between provider and BGP announcements with a minimum overlap of 8 bits; (ii) resolution of overlapping prefixes by retaining the most specific BGP announcement; and (iii) the precise definition of failure rate, which the Figure 2 caption gives as the fraction of observations with a geolocation error above 100 km. We will also add pseudocode for the matching algorithm to the appendix. revision: yes

  2. Referee: [Section 4.1] Section 4.1 (Ground-truth data): The paper does not test or discuss potential systematic biases in the RIPE Atlas and UNICEF Giga ground-truth locations by network type or region (e.g., urban bias in mobile probes or routing through regional gateways for school IPs). If such biases correlate with the same factors as prefix coarseness, they could confound the reported median error differences (179-207 km vs. 3-16 km) and failure rate gaps.

    Authors: We acknowledge this limitation. While we cannot collect new ground-truth data to quantify every possible bias, we will insert a dedicated limitations paragraph in Section 4.1 that (a) cites prior work on urban bias and gateway routing in RIPE Atlas and UNICEF Giga, (b) reports that the mobile-versus-fixed error gap remains statistically significant after stratifying by country-level probe density, and (c) notes that the same pattern appears independently in both ground-truth sources. These additions will make the potential confounding explicit without altering the core empirical claims. revision: partial
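The stratified check mentioned in the second response could be approached by comparing mobile and fixed errors within each stratum. The sketch below computes the Mann-Whitney U statistic and the rank-biserial correlation as an effect size per stratum. Whether the authors use these statistics is an assumption, the stratum labels and sample errors are synthetic, and the code reports effect size only, not a p-value.

```python
from collections import defaultdict


def mann_whitney_u(x, y):
    """U statistic for x: pairs (xi, yj) with xi > yj, ties counted as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def rank_biserial(x, y):
    """Effect size in [-1, 1]; +1 means every x exceeds every y."""
    return 2.0 * mann_whitney_u(x, y) / (len(x) * len(y)) - 1.0


def effect_by_stratum(rows):
    """Rank-biserial correlation of mobile vs. fixed errors per stratum.

    rows: iterable of (stratum, network_type, error_km).
    Strata missing either network type are skipped.
    """
    groups = defaultdict(lambda: {"mobile": [], "fixed": []})
    for stratum, network_type, error_km in rows:
        groups[stratum][network_type].append(error_km)
    return {
        stratum: rank_biserial(g["mobile"], g["fixed"])
        for stratum, g in groups.items()
        if g["mobile"] and g["fixed"]
    }


# Synthetic errors in km, stratified by a made-up probe-density label.
rows = [
    ("dense", "mobile", 200.0), ("dense", "mobile", 150.0),
    ("dense", "fixed", 5.0), ("dense", "fixed", 12.0),
    ("sparse", "mobile", 300.0),
    ("sparse", "fixed", 40.0), ("sparse", "fixed", 350.0),
]
effects = effect_by_stratum(rows)
```

If the effect stays strongly positive in every stratum, probe density alone is unlikely to explain the mobile-versus-fixed gap.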

Circularity Check

0 steps flagged

No circularity: purely empirical measurement study

full rationale

The paper conducts a large-scale empirical evaluation of commercial IP geolocation databases against independent ground-truth datasets (RIPE Atlas probes and UNICEF Giga school measurements) across 175 countries. All reported results—median errors by network type, failure rates by region, and the 70% statistic on mobile prefix geographic span—are direct measurements or comparisons against external data sources and BGP announcements. No derivations, fitted parameters, predictions, or self-citations are invoked as load-bearing steps; the attribution to prefix granularity follows from separate measurements of prefix sizes rather than any definitional or self-referential reduction. The analysis is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The evaluation rests on the accuracy of two external ground-truth datasets and the assumption that BGP announcements provide a reliable baseline for prefix coarseness.

axioms (2)
  • domain assumption RIPE Atlas and UNICEF Giga ground-truth locations are accurate and unbiased with respect to network type and geography.
    All error and failure-rate calculations depend on these locations being correct.
  • domain assumption BGP announcements represent the true geographic span of IP prefixes.
    Used to classify provider prefixes as coarser or not.

pith-pipeline@v0.9.0 · 5735 in / 1296 out tokens · 41185 ms · 2026-05-22T03:22:54.001820+00:00 · methodology


Appendix A Ethics

Our use of data in this paper raises no ethical concerns. All datasets used in this paper are publicly available: RIPE Atlas probe metadata is publicly accessible via the RIPE Atlas API, UNICEF Giga school connectivi...