
arxiv: 2603.09398 · v2 · submitted 2026-03-10 · 💻 cs.DB

GeoBenchr: An Application-Centric Benchmarking Suite for Spatiotemporal Database Platforms

Pith reviewed 2026-05-15 13:46 UTC · model grok-4.3

classification 💻 cs.DB
keywords spatiotemporal databases · benchmarking suite · application-centric evaluation · PostGIS · MobilityDB · geospatial workloads · database scalability · query performance

The pith

GeoBenchr introduces an application-centric benchmark suite to evaluate spatiotemporal database platforms on realistic workloads.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents GeoBenchr, a new open-source benchmarking suite built to test spatiotemporal database systems using workloads drawn from actual domains such as cycling, aviation, and maritime tracking. It runs these workloads across multiple query types and dataset scales to measure scalability, configuration effects, and cross-system differences among platforms like PostGIS and MobilityDB. A sympathetic reader would care because existing general-purpose benchmarks often miss the specific performance trade-offs that matter when a system must support continuous location tracking or route analysis in one of these domains. If the suite works as intended, it supplies concrete data that lets practitioners choose or tune a database for their particular application rather than relying on abstract metrics.

Core claim

GeoBenchr is an application-centric benchmarking suite that supplies diverse datasets, query types, and workload patterns reflecting realistic use cases from cycling, aviation, and maritime tracking; when applied to several spatiotemporal platforms, it reveals scalability behavior, configuration sensitivity, and relative performance differences that general benchmarks do not capture.

What carries the argument

GeoBenchr, an open-source benchmarking suite that executes application-derived workloads on spatiotemporal database platforms to produce comparable measurements of scalability and configuration impact.

If this is right

  • Database selection for a cycling or maritime application can be based on measured behavior under matching query mixes rather than vendor claims.
  • Configuration tuning can be validated against the same workload patterns that the target application will generate.
  • Developers of new spatiotemporal systems gain a shared test harness that directly exercises domain-specific operations.
  • Scalability limits for large tracking datasets become visible before deployment in production environments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Adding a fourth domain such as urban logistics would test whether the suite's structure generalizes without redesign.
  • Publishing the raw workload traces alongside the suite would let other researchers reproduce or extend the exact scenarios.
  • Periodic re-runs on updated versions of the same platforms could track how performance evolves with software releases.

Load-bearing premise

The chosen datasets, query types, and workload patterns accurately reflect realistic use cases from domains such as cycling, aviation, and maritime tracking.

What would settle it

A controlled run of GeoBenchr on two additional database platforms that produces performance rankings identical to those from existing non-application-centric benchmarks would indicate that the new suite adds no distinguishing insight.
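
The test described above comes down to comparing two rankings of the same systems. A minimal sketch of one way to quantify that agreement, using Kendall's tau computed in plain Python, is shown here. The system names and rank values are hypothetical placeholders, not results from the paper, and the choice of Kendall's tau is an assumption; the paper does not prescribe a metric.

```python
from itertools import combinations


def kendall_tau(rank_a: dict, rank_b: dict) -> float:
    """Kendall rank correlation (tau-a) for two rankings without ties.

    Returns 1.0 for identical orderings and -1.0 for fully reversed ones.
    """
    systems = sorted(rank_a)
    if set(systems) != set(rank_b):
        raise ValueError("both rankings must cover the same systems")
    if len(systems) < 2:
        raise ValueError("need at least two systems to compare rankings")

    concordant = discordant = 0
    for x, y in combinations(systems, 2):
        # Positive product: both rankings order the pair the same way.
        sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical ranks (1 = fastest); placeholders, not measurements.
geobenchr_rank = {"system_a": 1, "system_b": 2, "system_c": 3, "system_d": 4}
generic_rank = {"system_a": 2, "system_b": 1, "system_c": 3, "system_d": 4}

tau = kendall_tau(geobenchr_rank, generic_rank)
print(f"tau = {tau:.3f}")
```

A tau of exactly 1.0 on the additional platforms would correspond to the "identical rankings" outcome; any lower value means the two benchmarks disagree on at least one pair of systems.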

Figures

Figures reproduced from arXiv: 2603.09398 by David Bermbach, Diana Baumann, Natalie Carl, Nils Japke, Tim C. Rese.

Figure 1: GeoBenchr’s modular architecture allows con…
Figure 2: The two real-world MOD datasets we base our application scenarios on, varying heavily in their characteristics such as data distribution and movement patterns. From left to right: AIS data from the published Piraeus AIS dataset [31] and flight data from the Deutsche Flugsicherung (DFS).
Figure 3: Empirical Cumulative Distribution Function of Query Durations across datasets. In some cases, database…
Figure 4: Depending on the query, different SUTs excel. SedonaDB, while having the best overall performance,…
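
Figure 3 reports an empirical cumulative distribution function (ECDF) of query durations. As a reading aid, here is a minimal sketch of how an ECDF is computed from raw measurements. The duration values are hypothetical and the helper name `ecdf` is an illustration, not part of GeoBenchr.

```python
from bisect import bisect_right


def ecdf(durations):
    """Return F where F(t) is the fraction of durations that are <= t."""
    ordered = sorted(durations)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one measurement")

    def F(t):
        # bisect_right counts how many sorted values are <= t.
        return bisect_right(ordered, t) / n

    return F


# Hypothetical query durations in milliseconds; not data from the paper.
durations_ms = [12.0, 15.5, 15.5, 40.0, 250.0]
F = ecdf(durations_ms)

for threshold in (10.0, 15.5, 100.0, 250.0):
    print(f"share of queries finishing within {threshold} ms: {F(threshold):.2f}")
```

Plotting one such curve per system makes tail behavior visible: a curve that rises late or plateaus below 1.0 for a long stretch indicates slow outlier queries that a mean would hide.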
Original abstract

The rapid growth of spatiotemporal data volumes needs to be handled by database systems capable of efficiently managing and querying such data. Existing systems such as PostGIS, SpaceTime, and MobilityDB offer partial solutions but differ widely in scope and performance. Also, first spatiotemporal benchmarks provide valuable insights but are limited in scope and, to our knowledge, no application-centric benchmarking suite exists. In this paper, we propose GeoBenchr, an open-source, application-centric benchmarking suite for spatiotemporal platforms. GeoBenchr enables comprehensive evaluation across diverse datasets, query types, and workload patterns, reflecting realistic use cases from domains such as cycling, aviation, and maritime tracking. We use our GeoBenchr prototype to evaluate several system aspects including scalability, configuration impact, and cross-platform performance comparison. Our results highlight the importance of application-centric benchmarking in selecting suitable spatiotemporal database systems for real-world scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes GeoBenchr, an open-source application-centric benchmarking suite for spatiotemporal database platforms such as PostGIS, SpaceTime, and MobilityDB. It claims to enable comprehensive evaluation across diverse datasets, query types, and workload patterns drawn from domains including cycling, aviation, and maritime tracking, with prototype experiments assessing scalability, configuration impact, and cross-platform performance differences; the central result is that such application-centric benchmarking is important for selecting suitable systems in real-world scenarios.

Significance. If the workloads and datasets are shown to be representative, GeoBenchr could fill a documented gap in existing spatiotemporal benchmarks by providing a reusable, domain-grounded evaluation framework that supports reproducible cross-system comparisons.

major comments (1)
  1. [Evaluation setup / workload patterns] Workload and dataset construction (described in the evaluation setup): the paper states that the chosen datasets, spatial predicates, temporal windows, and update rates reflect realistic use cases from cycling, aviation, and maritime tracking, but reports no trace-driven derivation, statistical comparison to production logs, or expert validation; this assumption is load-bearing for the headline claim that application-centric benchmarking improves real-world system selection.
minor comments (1)
  1. [Abstract] The abstract summarizes goals and high-level results but contains no concrete numbers, error bars, or platform-specific findings; moving at least one quantitative result (e.g., scalability ratio or configuration delta) into the abstract would improve clarity.
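
The major comment asks for a statistical comparison between benchmark workloads and production logs. One way such a check could look, sketched under stated assumptions, is a two-sample Kolmogorov-Smirnov statistic over a single workload property. The property chosen here (temporal window length in minutes) and all sample values are hypothetical; the paper reports no such comparison.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between the two ECDFs.

    0.0 means identical empirical distributions; 1.0 means no overlap.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # Both ECDFs are step functions, so the largest gap occurs at a sample point.
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


# Hypothetical temporal window lengths (minutes) per query.
benchmark_windows = [1, 2, 3, 4]
production_windows = [3, 4, 5, 6]

d = ks_statistic(benchmark_windows, production_windows)
print(f"KS statistic = {d:.2f}")
```

A small statistic would support the representativeness claim for that one property; a full validation would repeat this for each workload dimension (spatial predicates, update rates) and attach a significance test.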

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. The concern about workload and dataset construction is valid, and we address it directly below with a commitment to revise the manuscript for greater transparency and to acknowledge limitations.

Point-by-point responses
  1. Referee: [Evaluation setup / workload patterns] Workload and dataset construction (described in the evaluation setup): the paper states that the chosen datasets, spatial predicates, temporal windows, and update rates reflect realistic use cases from cycling, aviation, and maritime tracking, but reports no trace-driven derivation, statistical comparison to production logs, or expert validation; this assumption is load-bearing for the headline claim that application-centric benchmarking improves real-world system selection.

    Authors: We acknowledge that the manuscript did not report trace-driven derivation, statistical comparisons to production logs, or formal expert validation. The datasets were drawn from publicly available real-world sources commonly used in the literature for these domains (e.g., GPS traces for cycling, ADS-B feeds for aviation, and AIS messages for maritime), with predicates, windows, and update rates selected to match typical query patterns described in domain-specific studies. In the revised version, we will expand the evaluation setup section with a dedicated subsection detailing exact data sources, selection rationale, and references to supporting literature. We will also add an explicit limitations discussion noting that, while grounded in public real-world data, the workloads have not been statistically validated against proprietary production traces. This will provide a more balanced foundation for the claim that application-centric benchmarking aids real-world system selection.

    revision: yes

Circularity Check

0 steps flagged

No circularity in benchmarking suite proposal

Full rationale

The paper proposes GeoBenchr as a new application-centric benchmarking suite motivated by gaps in existing spatiotemporal benchmarks and systems. No equations, derivations, fitted parameters, or self-referential logic appear in the provided text. The central claim—that application-centric benchmarking aids real-world system selection—is presented as an outcome of the tool's design and cross-platform evaluation rather than reducing to its inputs by construction. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked. The work is self-contained as a tool proposal and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on domain assumptions about data growth and benchmark limitations plus the new suite itself; no free parameters or invented physical entities are introduced.

axioms (2)
  • Domain assumption: Existing spatiotemporal databases differ widely in scope and performance.
    Invoked in the abstract to motivate the need for a new benchmark.
  • Domain assumption: First spatiotemporal benchmarks are limited in scope.
    Used to justify creating an application-centric alternative.
invented entities (1)
  • GeoBenchr (no independent evidence)
    Purpose: Open-source application-centric benchmarking suite for spatiotemporal database platforms.
    Newly proposed tool whose value depends on adoption and validation outside this paper.

pith-pipeline@v0.9.0 · 5457 in / 1223 out tokens · 45633 ms · 2026-05-15T13:46:21.761501+00:00 · methodology



Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages

  [1] [n. d.]. SpatialBench (sedona.apache.org). https://sedona.apache.org/spatialbench/. [Accessed 10-11-2025]
  [2] [n. d.]. TLC Trip Record Data (nyc.gov). https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page. [Accessed 10-11-2025]
  [3] Md Mahbub Alam, Luis Torgo, and Albert Bifet. 2022. A survey on spatio-temporal data analytics systems. Comput. Surveys 54, 10s (2022), 1–38
  [4] Andreas Bader, Oliver Kopp, and Michael Falkenthal. 2017. Survey and comparison of open source time series databases. In Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband. Gesellschaft für Informatik eV, 249–268
  [5] David Bermbach, Erik Wittern, and Stefan Tai. 2017. Cloud Service Benchmarking: Measuring Quality of Cloud Services from a Client Perspective. Springer, Cham, Switzerland
  [6] Howard Butler, Martin Daly, Allan Doyle, Sean Gillies, Tim Schaub, and Christopher Schmidt. 2014. GeoJSON. Electronic. URL: http://geojson.org (2014)
  [7] Noel Cressie and Christopher K Wikle. 2011. Statistics for spatio-temporal data. John Wiley & Sons
  [8] Christian Düntgen, Thomas Behr, and Ralf Hartmut Güting. 2009. Berlinmod: a benchmark for moving object databases. The VLDB Journal 18, 6 (2009), 1335–1368
  [9] Ahmed Eldawy and Mohamed F Mokbel. 2015. Spatialhadoop: A mapreduce framework for spatial data. In 2015 IEEE 31st international conference on Data Engineering. IEEE, 1352–1363
  [10] Mohamed Y Eltabakh, Ramy Eltarras, and Walid G Aref. 2006. Space-partitioning trees in postgresql: Realization and performance. In 22nd International Conference on Data Engineering (ICDE’06). IEEE, 100–100
  [11] Fan Gao, Peng Yue, Zhipeng Cao, Shuaifeng Zhao, Boyi Shangguan, Liangcun Jiang, Lei Hu, Zhe Fang, and Zheheng Liang. 2022. A multi-source spatio-temporal data cube for large-scale geospatial analysis. International Journal of Geographical Information Science 36, 9 (2022), 1853–1884
  [12] Leticia Gómez, Alejandro Vaisman, and Esteban Zimányi. 2024. Querying Mobile Pollution Data using MobilityDB. In 2024 25th IEEE International Conference on Mobile Data Management (MDM). IEEE, 227–234
  [13] Ralf Hartmut Güting and Markus Schneider. 2005. Moving objects databases. Academic Press
  [14] Ali Hamdi, Khaled Shaban, Abdelkarim Erradi, Amr Mohamed, Shakila Khan Rumi, and Flora D Salim. 2022. Spatiotemporal data mining: a survey on challenges and open problems. Artificial Intelligence Review 55, 2 (2022), 1441–1488
  [15] Di Hu, Shaosong Ma, Fei Guo, Guonian Lu, and Junzhi Liu. 2015. Describing data formats of geographical models. Environmental Earth Sciences 74, 10 (2015), 7101–7115
  [16] James N Hughes, Andrew Annex, Christopher N Eichelberger, Anthony Fox, Andrew Hulbert, and Michael Ronquest. 2015. Geomesa: a distributed architecture for spatio-temporal fusion. In Geospatial informatics, fusion, and motion video analytics V, Vol. 9473. SPIE, 128–140
  [17] Andrew Hulbert, Thomas Kunicki, James N Hughes, Anthony D Fox, and Christopher N Eichelberger. 2016. An experimental study of big spatial data systems. In 2016 IEEE International Conference on Big Data (Big Data). IEEE, 2664–2671
  [18] Christian S Jensen, Dalia Tiešytė, and Nerius Tradišauskas. 2006. The COST benchmark—comparison and evaluation of spatio-temporal indexes. In International Conference on Database Systems for Advanced Applications. Springer, 125–140
  [19] Ahmet-Serdar Karakaya, Leonard Thomas, Denis Koljada, and David Bermbach. 2023. A Crowdsensing Approach for Deriving Surface Quality of Cycling Infrastructure. In Proceedings of the 11th IEEE International Conference on Cloud Engineering (Boston, MA USA) (IC2E ’23). IEEE, New York, NY, USA, 212–219. https://doi.org/10.1109/IC2E59103.2023.00031
  [20] Suneuy Kim, Yvonne Hoang, Tsz Ting Yu, and Yuvraj Singh Kanwar. 2023. GeoYCSB: a benchmark framework for the performance and scalability evaluation of geospatial NoSQL databases. Big Data Research 31 (2023), 100368
  [21] Levan Natsvlishvili, Nato Jorjiashvili, and Vakhtang Kochoradze. 2022. Development of a PostGIS-based method for creating risk maps of natural disasters using the example of Georgia. Geodesy and Cartography 48, 2 (2022), 70–77
  [22] Nikos Pelekis, Babis Theodoulidis, Ioannis Kopanakis, and Yannis Theodoridis. 2004. Literature review of spatio-temporal database models. The Knowledge Engineering Review 19, 3 (2004), 235–274
  [23] Suprio Ray, Bogdan Simion, and Angela Demke Brown. 2011. Jackpine: A benchmark to evaluate spatial database performance. In 2011 IEEE 27th International Conference on Data Engineering. IEEE, 1139–1150
  [24] Tim C. Rese and David Bermbach. 2025. Evaluating the Impact of Spatial Features of Mobility Data and Index Choice on Database Performance. In Proceedings of the 13th IEEE International Conference on Cloud Engineering (Rennes, France) (IC2E ’25). IEEE, New York, NY, USA, 1–12. https://doi.org/10.1109/IC2E65552.2025.00007
  [25] Tim C. Rese and David Bermbach. 2025. Towards an Application-Centric Benchmark Suite for Spatiotemporal Database Systems. In Proceedings of the 13th IEEE International Conference on Cloud Engineering (Rennes, France) (IC2E ’25). IEEE, New York, NY, USA, 69–70. https://doi.org/10.1109/IC2E65552.2025.00016
  [26] Vladislav Rudakov, Merembayev Timur, and Amirgaliyev Yedilkhan. 2023. Comparison of time series databases. In 2023 17th International Conference on Electronics Computer and Computation (ICECCO). IEEE, 1–4
  [27] Mahmoud Sakr, Esteban Zimányi, Alejandro Vaisman, and Mohamed Bakli. 2023. User-centered road network traffic analysis with MobilityDB. Transactions in GIS 27, 2 (2023), 323–346
  [28] Maxime Schoemans, Walid G Aref, Esteban Zimányi, and Mahmoud Sakr. 2024. Multi-Entry Generalized Search Trees for Indexing Trajectories. In Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems. 421–431
  [29] Shashi Shekhar, Michael R Evans, Viswanath Gunturi, KwangSoo Yang, and Daniel Cintra Cugler. 2012. Benchmarking spatial big data. In Workshop on Big Data Benchmarks. Springer, 81–93
  [30] Samriddhi Singla, Ahmed Eldawy, Tina Diao, Ayan Mukhopadhyay, and Elia Scudiero. 2021. Experimental study of big raster and vector database systems. In 2021 IEEE 37th International Conference on Data Engineering (ICDE). IEEE, 2243–2248
  [31] Andreas Tritsarolis, Yannis Kontoulis, and Yannis Theodoridis. 2022. The Piraeus AIS dataset for large-scale maritime data analytics. Data in brief 40 (2022), 107782
  [32] Aske Wachs and Eleni Tzirita Zacharatou. 2024. Analysis of Geospatial data loading. In Proceedings of the Tenth International Workshop on Testing Database Systems. 36–42
  [33] Jia Yu, Jinxuan Wu, and Mohamed Sarwat. 2015. Geospark: A cluster computing framework for processing large-scale spatial data. In Proceedings of the 23rd SIGSPATIAL international conference on advances in geographic information systems. 1–4
  [34] Esteban Zimányi, Mahmoud Sakr, and Arthur Lesuisse. 2020. MobilityDB: A mobility database based on PostgreSQL and PostGIS. ACM Transactions on Database Systems (TODS) 45, 4 (2020), 1–42