Towards Serverless Processing of Spatiotemporal Big Data Queries
Pith reviewed 2026-05-21 23:48 UTC · model grok-4.3
The pith
Spatiotemporal queries on growing big data volumes can be scaled by decomposing them into independent subqueries run in parallel on serverless function platforms.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose our vision of a native serverless data processing approach for spatiotemporal data: We break down queries into small subqueries which then leverage the near-instant scaling of Function-as-a-Service platforms to execute them in parallel. With this, we partially solve the scalability needs of big spatiotemporal data processing.
What carries the argument
Decomposition of spatiotemporal queries into independent subqueries executed via Function-as-a-Service platforms for parallel scaling.
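The decomposition idea can be made concrete with a minimal sketch. It is not the authors' implementation: the `RangeQuery` type, the time-axis split, and the helper names are assumptions chosen for illustration, and a local thread pool stands in for a Function-as-a-Service platform.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RangeQuery:
    """Half-open box: x_min <= x < x_max, likewise for y and t."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    t_min: float
    t_max: float


def split_query(query, parts):
    """Cut the query into `parts` disjoint slices along the time axis."""
    step = (query.t_max - query.t_min) / parts
    bounds = [query.t_min + i * step for i in range(parts)] + [query.t_max]
    return [replace(query, t_min=lo, t_max=hi)
            for lo, hi in zip(bounds, bounds[1:])]


def run_subquery(points, sub):
    """Hypothetical function body: keep the points inside one slice."""
    return [
        (x, y, t) for (x, y, t) in points
        if sub.x_min <= x < sub.x_max
        and sub.y_min <= y < sub.y_max
        and sub.t_min <= t < sub.t_max
    ]


def run_query(points, query, parts=4):
    """Fan out the subqueries, then concatenate the partial results."""
    subqueries = split_query(query, parts)
    # The thread pool is only a stand-in for parallel function invocations.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = pool.map(lambda sub: run_subquery(points, sub), subqueries)
    return [row for partial in partials for row in partial]
```

Because the slices are half-open and share their boundaries exactly, each point matches at most one subquery, so a plain range query needs no deduplication when the partial results are concatenated.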
Load-bearing premise
Typical spatiotemporal queries can be split into independent subqueries whose coordination, data access, and result aggregation incur only low overhead on current FaaS platforms.
What would settle it
A benchmark showing that a representative set of decomposed spatiotemporal queries runs with higher total latency or monetary cost on FaaS than on a conventional PostGIS-style system would falsify the core proposal.
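One way to read this criterion is as a comparison of two (latency, cost) pairs. The sketch below is a simplified model, not something taken from the paper: it assumes all subqueries start at the same time, that billing is per request plus memory multiplied by runtime, and every number in the usage lines is a placeholder rather than a measurement or a real price.

```python
def faas_estimate(subquery_seconds, memory_gb, price_per_gb_second,
                  price_per_request, overhead_seconds):
    """Return (latency, cost) for one decomposed query.

    Assumes all subqueries start together, so latency is the slowest
    subquery plus a fixed coordination and aggregation overhead, while
    cost is billed per subquery for its full runtime.
    """
    latency = overhead_seconds + max(subquery_seconds)
    cost = sum(
        seconds * memory_gb * price_per_gb_second + price_per_request
        for seconds in subquery_seconds
    )
    return latency, cost


def falsifies_proposal(faas, baseline):
    """True if FaaS is slower or more expensive than the baseline."""
    faas_latency, faas_cost = faas
    base_latency, base_cost = baseline
    return faas_latency > base_latency or faas_cost > base_cost


# Placeholder numbers for illustration only.
faas = faas_estimate([2.0, 3.0, 1.0], memory_gb=1.0,
                     price_per_gb_second=0.5, price_per_request=0.25,
                     overhead_seconds=0.5)
baseline = (6.0, 4.0)  # hypothetical (latency, cost) of a conventional system
print(faas, falsifies_proposal(faas, baseline))
```

The model makes the trade-off visible: latency depends on the slowest subquery, while cost grows with the sum over all subqueries, so finer decomposition can improve one while worsening the other.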
Original abstract
Spatiotemporal data are being produced in continuously growing volumes by a variety of data sources and a variety of application fields rely on rapid analysis of such data. Existing systems such as PostGIS or MobilityDB usually build on relational database systems, thus, inheriting their scale-out characteristics. As a consequence, big spatiotemporal data scenarios still have limited support even though many query types can easily be parallelized. In this paper, we propose our vision of a native serverless data processing approach for spatiotemporal data: We break down queries into small subqueries which then leverage the near-instant scaling of Function-as-a-Service platforms to execute them in parallel. With this, we partially solve the scalability needs of big spatiotemporal data processing.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a vision for a native serverless data processing approach for spatiotemporal big data. It notes that existing systems like PostGIS inherit limited scale-out from relational databases and proposes decomposing queries into small subqueries that leverage the near-instant scaling of Function-as-a-Service (FaaS) platforms for parallel execution, thereby partially solving scalability needs for big spatiotemporal data.
Significance. If the decomposition and execution model can be implemented with low coordination overhead, the vision could enable elastic, cost-effective processing of growing spatiotemporal datasets in domains such as mobility analysis and environmental monitoring. The manuscript offers no implementation, measurements, or quantitative model, so its significance hinges on whether future work can demonstrate that FaaS-based subquery execution outperforms or complements existing parallel spatiotemporal systems.
major comments (2)
- [Abstract] The central claim that breaking queries into subqueries 'partially solve[s] the scalability needs' lacks any supporting derivation, example, or cost model. The text provides no concrete strategy for decomposing typical queries (range, kNN, joins, trajectories) while handling spatial data dependencies such as boundary overlaps or global aggregation.
- [Proposed vision] The assumption that subqueries can execute independently with negligible coordination overhead on stateless FaaS is load-bearing but unexamined. No discussion addresses how data access, state, or result aggregation would occur without routing through external storage (introducing latency and egress costs) or how this compares to existing partitioned spatiotemporal indexes.
minor comments (2)
- [Abstract and throughout] Add references to prior serverless database systems and spatiotemporal query engines to better contextualize the novelty of the vision.
- [General] The manuscript would benefit from a brief section outlining potential challenges (data locality, cold starts, billing implications) even at the vision level.
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments on our vision paper. The feedback correctly identifies areas where the high-level proposal would benefit from additional clarification and illustrative detail. We address each major comment below and describe the planned revisions.
- Referee: [Abstract] The central claim that breaking queries into subqueries 'partially solve[s] the scalability needs' lacks any supporting derivation, example, or cost model. The text provides no concrete strategy for decomposing typical queries (range, kNN, joins, trajectories) while handling spatial data dependencies such as boundary overlaps or global aggregation.
  Authors: We agree that the abstract would be strengthened by concrete examples. As this is a vision paper, we intentionally kept the presentation high-level and did not include a full derivation or cost model. In the revised version we will add brief illustrative decomposition strategies for representative query types (e.g., partitioning a spatial range query across sub-regions with overlap handling via data replication or boundary adjustment) while explicitly stating that a quantitative cost model remains future work.
  revision: partial
- Referee: [Proposed vision] The assumption that subqueries can execute independently with negligible coordination overhead on stateless FaaS is load-bearing but unexamined. No discussion addresses how data access, state, or result aggregation would occur without routing through external storage (introducing latency and egress costs) or how this compares to existing partitioned spatiotemporal indexes.
  Authors: We acknowledge that coordination, data access, and aggregation mechanisms deserve explicit discussion. The vision assumes that spatiotemporal data reside in scalable object stores directly accessible by FaaS functions, enabling largely independent execution, with a lightweight final aggregation step. In the revision we will expand the proposed-vision section to describe these mechanisms, note the potential latency and cost implications of external storage, and provide a qualitative comparison to partitioned indexes used in existing systems such as PostGIS extensions or GeoMesa.
  revision: yes
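The rebuttal mentions overlap handling via data replication and a lightweight final aggregation step. A minimal sketch of how the two could fit together is given below. It is an assumption for illustration, not the authors' design: records are `(id, x)` pairs tiled along one axis, `margin` is a hypothetical replication distance, and the aggregation step removes the duplicates that replication introduces.

```python
def assign_tiles(records, tile_width, margin):
    """Map tile index -> records, replicating records near a tile edge.

    A record is copied into every tile whose range, widened by `margin`
    on both sides, contains its x coordinate.
    """
    tiles = {}
    for rec in records:
        _rec_id, x = rec
        lo = int((x - margin) // tile_width)
        hi = int((x + margin) // tile_width)
        for idx in range(lo, hi + 1):
            tiles.setdefault(idx, []).append(rec)
    return tiles


def aggregate(partials):
    """Merge per-tile results, keeping each record id exactly once."""
    seen = {}
    for partial in partials:
        for rec_id, x in partial:
            seen.setdefault(rec_id, (rec_id, x))
    return sorted(seen.values())


records = [(1, 0.5), (2, 9.8), (3, 10.1), (4, 25.0)]
tiles = assign_tiles(records, tile_width=10.0, margin=0.5)
merged = aggregate(tiles.values())
```

Replication lets each function answer distance-based predicates near its border without contacting a neighbouring tile, at the price of extra stored copies and a deduplication pass during aggregation. Boundary adjustment, the other option named in the rebuttal, would avoid the copies but is not shown here.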
Circularity Check
Vision proposal contains no derivation chain or fitted results
The manuscript is a forward-looking vision paper that proposes decomposing spatiotemporal queries into subqueries for parallel FaaS execution. No equations, parameters, or formal derivations appear in the provided text or abstract. The central claim is presented as a proposal rather than derived from prior results or self-citations within the paper itself. Consequently, there is no load-bearing step that reduces to its own inputs by construction, and the content remains self-contained as an exploratory suggestion without internal circular reasoning.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Spatiotemporal queries can be broken into independent subqueries suitable for parallel execution with low coordination cost.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "We break down queries into small subqueries which then leverage the near-instant scaling of Function-as-a-Service platforms to execute them in parallel."
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "following the MapReduce model"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Md Mahbub Alam, Luis Torgo, and Albert Bifet. 2022. A Survey on Spatio-temporal Data Analytics Systems. ACM Comput. Surv. (2022)
- [2] Mohamed Bakli, Mahmoud Sakr, Esteban Zimányi, Nils Dijk, and Marco Slot. 2025. Distributed MobilityDB: A Scalable Moving Object Database Management System. ACM Trans. Spatial Algorithms Syst. (2025)
- [3] David Bermbach, Abhishek Chandra, Chandra Krintz, Aniruddha Gokhale, Aleksander Slominski, Lauritz Thamsen, Everton Cavalcante, Tian Guo, Ivona Brandic, and Rich Wolski. 2021. On the Future of Cloud Engineering. In Proc. of IC2E 2021. IEEE
- [4]
- [5] Jeffrey Dean and Sanjay Ghemawat. 2008. MapReduce: simplified data processing on large clusters. Commun. ACM (2008)
- [6] Joseph M Hellerstein, Jose Faleiro, Joseph E Gonzalez, Johann Schleier-Smith, Vikram Sreekanti, Alexey Tumanov, and Chenggang Wu. 2019. Serverless computing: One step forward, two steps back. Proc. of CIDR (2019)
- [7] Huanghuang Liang, Zheng Zhang, Chuang Hu, Yili Gong, and Dazhao Cheng. 2024. A Survey on Spatio-Temporal Big Data Analytics Ecosystem: Resource Management, Processing Platform, and Applications. IEEE Transactions on Big Data (2024)
- [8] Elyes Lounissi, Suvam Kumar Das, Ronnit Peter, Xiaozheng Zhang, Suprio Ray, and Lianyin Jia. 2025. FunDa: scalable serverless data analytics and in situ query processing. Journal of Big Data (2025)
- [9] Tim C. Rese, Alexandra Kapp, and David Bermbach. 2025. Evaluating the Impact Of Spatial Features Of Mobility Data and Index Choice On Database Performance. arXiv:2505.14466 [cs.DB] https://arxiv.org/abs/2505.14466