pith. sign in

arxiv: 1907.08038 · v1 · pith:OA2DIFQPnew · submitted 2019-07-18 · 💻 cs.DB · cs.CR

A Differentially Private Algorithm for Range Queries on Trajectories

Pith reviewed 2026-05-24 19:23 UTC · model grok-4.3

classification 💻 cs.DB cs.CR
keywords differential privacytrajectory datarange queriesspatial databasesdata privacyquery processingnoise addition
0
0 comments X

The pith

An algorithm for range queries on trajectories adds noise adaptively based on data partitions and query set to achieve lower error while guaranteeing differential privacy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a method that ensures epsilon-differential privacy when answering range queries on trajectory data by first privately dividing the space into uniform regions and calculating their traffic densities. These regions and densities, together with the specific queries, are then used to estimate how trajectories are distributed across the queried area before adding noise. This data- and query-aware adaptation contrasts with uniform noise addition, which the authors argue produces higher error because trajectories are unevenly spread and have sequential structure. A reader would care because the approach promises more usable answers for applications like traffic analysis without sacrificing the privacy guarantee.

Core claim

The algorithm guarantees epsilon-differential privacy for range queries on trajectories by privately partitioning the data space into uniform regions, computing traffic density per region, and then estimating the distribution of trajectories over the queried space from the regions, densities, and given query set, which yields significantly lower error than prior methods.

What carries the argument

The adaptive noise mechanism that partitions space privately into regions, computes densities, and estimates trajectory distribution from the query set to guide noise placement.

If this is right

  • Range queries on trajectory data can be answered with higher utility at the same privacy level.
  • The method preserves sequential consistency of trajectories better than treating them as unordered sets of points.
  • Privacy mechanisms can improve by incorporating both data distribution and query workload instead of adding noise uniformly.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could extend to other sequential spatial data such as movement paths in indoor environments if the partitioning step generalizes.
  • Accuracy gains may shrink if the query set changes frequently after the initial partitioning, requiring re-estimation.
  • The private partitioning step itself may introduce error that limits gains on very sparse datasets.

Load-bearing premise

That estimating trajectory distribution from the privately computed regions, their densities, and the query set will produce high accuracy for the given queries.

What would settle it

Running the algorithm and a uniform-noise baseline on the same real trajectory dataset and query set at identical epsilon and measuring whether the new method's query error is not lower.

Figures

Figures reproduced from arXiv: 1907.08038 by Kotagiri Ramamohanarao, Lars Kulik, Soheila Ghane.

Figure 1
Figure 1. Figure 1: Overview of DQAM mechanism with an example. pi and bi depict the partition and its density, respectively. anism (DQAM), a mechanism that synthesizes spatial his￾tograms. As [6], DQAM takes a spatial histogram and a set of range queries as input and uses the correlation between the queries (i.e., is query-aware) to synthesize the histogram. In contrast to [6], DQAM identifies the density of different region… view at source ↗
Figure 2
Figure 2. Figure 2: A data set with two trajectories and its corresponding spatial [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Examples of uniform and non-uniform regions. [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The effective area of a range query and the query borders; [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: The effect of data distribution and size on the error. Dashed [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: The effect of query set size on the estimation error. [PITH_FULL_IMAGE:figures/full_fig_p011_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: The utility of output by DQAM (orange), PriSH (green) and MWEM (blue) algorithms. Lower KLD means higher utility. competing algorithms changes in different settings. The high￾est ratio achieved in our experiments was 7.4 for |D| = 1000, |Q| = 8000, resolution 8 and  = 0.01 comparing the average error of DQAM versus PriSH shown in [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Running time of DQAM on small to large datasets. distributions. For clarity in presentation, we focus on DQAM, PriSH and MWEM in this experiment. D. Efficiency Evaluation [PITH_FULL_IMAGE:figures/full_fig_p012_9.png] view at source ↗
read the original abstract

We propose a novel algorithm to ensure $\epsilon$-differential privacy for answering range queries on trajectory data. In order to guarantee privacy, differential privacy mechanisms add noise to either data or query, thus introducing errors to queries made and potentially decreasing the utility of information. In contrast to the state-of-the-art, our method achieves significantly lower error as it is the first data- and query-aware approach for such queries. The key challenge for answering range queries on trajectory data privately is to ensure an accurate count. Simply representing a trajectory as a set instead of \emph{sequence} of points will generally lead to highly inaccurate query answers as it ignores the sequential dependency of location points in trajectories, i.e., will violate the consistency of trajectory data. Furthermore, trajectories are generally unevenly distributed across a city and adding noise uniformly will generally lead to a poor utility. To achieve differential privacy, our algorithm adaptively adds noise to the input data according to the given query set. It first privately partitions the data space into uniform regions and computes the traffic density of each region. The regions and their densities, in addition to the given query set, are then used to estimate the distribution of trajectories over the queried space, which ensures high accuracy for the given query set. We show the accuracy and efficiency of our algorithm using extensive empirical evaluations on real and synthetic data sets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes a novel ε-differentially private algorithm for range queries on trajectory data. It privately partitions the space into uniform regions, computes per-region traffic densities, and then uses these densities together with the given query set to estimate the distribution of trajectories over the queried space. The central claim is that this data- and query-aware approach yields significantly lower error than prior methods while preserving privacy, by avoiding both uniform noise addition and the consistency violations that arise from treating trajectories as unordered sets.

Significance. If the estimation procedure correctly incorporates sequential dependencies, the work would constitute a meaningful advance in utility for private trajectory range queries on unevenly distributed data. The empirical evaluations on real and synthetic datasets are presented as evidence of accuracy and efficiency gains.

major comments (1)
  1. [Abstract (algorithm description paragraph)] Abstract, paragraph on algorithm description: the estimation step that 'uses the regions and their densities, in addition to the given query set, to estimate the distribution of trajectories' is described only at a high level. No mechanism is indicated for enforcing sequential consistency or modeling transitions across regions, which the paper itself identifies as the source of inaccurate counts when trajectories are treated as sets. This modeling choice is load-bearing for the 'significantly lower error' claim.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their review and recommendation. We address the single major comment below, clarifying the role of the abstract versus the full technical description.

read point-by-point responses
  1. Referee: [Abstract (algorithm description paragraph)] Abstract, paragraph on algorithm description: the estimation step that 'uses the regions and their densities, in addition to the given query set, to estimate the distribution of trajectories' is described only at a high level. No mechanism is indicated for enforcing sequential consistency or modeling transitions across regions, which the paper itself identifies as the source of inaccurate counts when trajectories are treated as sets. This modeling choice is load-bearing for the 'significantly lower error' claim.

    Authors: The abstract is written at the conventional high level of detail for that section. The concrete mechanism that enforces sequential consistency—by using the privately computed region densities together with the query set to model transitions and estimate a consistent trajectory distribution rather than an unordered set—is fully specified in the algorithm sections of the manuscript (including the estimation procedure and its analysis). This is the same modeling choice the paper contrasts with set-based approaches, and the empirical results are obtained from the implemented procedure that incorporates these transitions. revision: no

Circularity Check

0 steps flagged

No circularity; accuracy claim rests on empirical evaluation of new procedure

full rationale

The paper introduces a procedural algorithm (private partitioning into uniform regions, per-region density computation, then estimation of trajectory distribution from those densities plus the query set) and supports its accuracy claim solely via 'extensive empirical evaluations on real and synthetic data sets.' No equations, fitted parameters renamed as predictions, or self-citations appear in the provided text. The estimation step is described as an input to the procedure rather than derived from the output by construction, so the derivation chain does not reduce to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard definition of ε-differential privacy and the modeling assumption that trajectory sequences can be accurately reconstructed from region densities plus query-specific estimation; no free parameters or invented entities are named in the abstract.

axioms (2)
  • standard math ε-differential privacy definition
    Invoked as the privacy guarantee the algorithm must satisfy.
  • domain assumption sequential dependency of location points must be preserved for accurate counts
    Stated as the reason why treating trajectories as unordered sets fails.

pith-pipeline@v0.9.0 · 5777 in / 1292 out tokens · 35272 ms · 2026-05-24T19:23:29.018056+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

  1. [1]

    Bonomi and L

    L. Bonomi and L. Xiong. A two-phase algorithm for mining sequential patterns with differential privacy. In Proceedings of the 22nd ACM international conference on Information & Knowledge Management , pages 269–278. ACM, 2013

  2. [2]

    R. Chen, B. Fung, B. C. Desai, and N. M. Sossou. Differentially private transit data publication: a case study on the montreal transportation system. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining , pages 213–221. ACM, 2012

  3. [3]

    Cormode, C

    G. Cormode, C. Procopiuc, D. Srivastava, E. Shen, and T. Yu. Differen- tially private spatial decompositions. In Data engineering (ICDE), 2012 IEEE 28th international conference on , pages 20–31. IEEE, 2012

  4. [4]

    De Montjoye, C

    Y .-A. De Montjoye, C. A. Hidalgo, M. Verleysen, and V . D. Blondel. Unique in the crowd: The privacy bounds of human mobility. Scientific reports, 3:1376, 2013

  5. [5]

    Dwork, F

    C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating Noise to Sensitivity in Private Data Analysis , pages 265–284. Springer Berlin Heidelberg, Berlin, Heidelberg, 2006

  6. [6]

    Ghane, L

    S. Ghane, L. Kulik, and K. Ramamohanarao. Publishing spatial histograms under differential privacy. In Proceedings of the 30th Inter- national Conference on Scientific and Statistical Database Management, Bolzano-Bozen, Italy, July 9-11, 2018 , pages 27:1–27:12. ACM, 2018

  7. [7]

    Hardt, K

    M. Hardt, K. Ligett, and F. McSherry. A simple and practical algorithm for differentially private data release. In Advances in Neural Information Processing Systems, pages 2339–2347, 2012

  8. [8]

    Hardt and G

    M. Hardt and G. N. Rothblum. A multiplicative weights mechanism for privacy-preserving data analysis. In Foundations of Computer Science, 2010 51st Annual IEEE Symposium on , pages 61–70. IEEE, 2010

  9. [9]

    M. Hay, A. Machanavajjhala, G. Miklau, Y . Chen, and D. Zhang. Principled evaluation of differentially private algorithms using dpbench. In Proceedings of the 2016 International Conference on Management of Data, pages 139–154. ACM, 2016

  10. [10]

    X. He, G. Cormode, A. Machanavajjhala, C. M. Procopiuc, and D. Sri- vastava. Dpt: differentially private trajectory synthesis using hierarchical reference systems. Proceedings of the VLDB Endowment , 8(11):1154– 1165, 2015

  11. [11]

    Karmarkar

    N. Karmarkar. A new polynomial-time algorithm for linear program- ming. In Proceedings of the sixteenth annual ACM symposium on Theory of computing, pages 302–311. ACM, 1984

  12. [12]

    Leonardi, S

    L. Leonardi, S. Orlando, A. Raffaet `a, A. Roncato, C. Silvestri, G. An- drienko, and N. Andrienko. A general framework for trajectory data warehousing and visual olap. GeoInformatica, 18(2):273–312, 2014

  13. [13]

    C. Li, M. Hay, G. Miklau, and Y . Wang. A data-and workload-aware algorithm for range queries under differential privacy. Proceedings of the VLDB Endowment , 7(5):341–352, 2014

  14. [14]

    I. V . Lopez, R. T. Snodgrass, and B. Moon. Spatiotemporal aggregate computation: A survey. IEEE Transactions on Knowledge and Data Engineering, 17(2):271–286, 2005

  15. [15]

    McSherry and K

    F. McSherry and K. Talwar. Mechanism design via differential privacy. In Foundations of Computer Science, 2007. FOCS’07. 48th Annual IEEE Symposium on, pages 94–103. IEEE, 2007

  16. [16]

    F. D. McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In Proceedings of the ACM SIGMOD International Conference on Management of data , pages 19–30. ACM, 2009

  17. [17]

    M. F. Mokbel, L. Alarabi, J. Bao, A. Eldawy, A. Magdy, M. Sarwat, E. Waytas, and S. Yackel. Mntg: an extensible web-based traffic gener- ator. In International Symposium on Spatial and Temporal Databases , pages 38–55. Springer, 2013

  18. [18]

    Monreale, W

    A. Monreale, W. H. Wang, F. Pratesi, S. Rinzivillo, D. Pedreschi, G. An- drienko, and N. Andrienko. Privacy-preserving distributed movement data aggregation. In Geographic Information Science at the Heart of Europe, pages 225–245. Springer, 2013

  19. [19]

    Moreira-Matias, J

    L. Moreira-Matias, J. Gama, M. Ferreira, J. Mendes-Moreira, and L. Damas. Predicting taxi–passenger demand using streaming data. IEEE Transactions on Intelligent Transportation Systems , 14(3):1393– 1402, 2013

  20. [20]

    Naghi Zadeh Kakhki

    E. Naghi Zadeh Kakhki. Utility-aware protection of trajectory privacy . PhD thesis, The University of Melbourne, 2016

  21. [21]

    Qardaji, W

    W. Qardaji, W. Yang, and N. Li. Differentially private grids for geospa- tial data. In Data Engineering (ICDE), 2013 IEEE 29th International Conference on, pages 757–768. IEEE, 2013

  22. [22]

    H. Xie, L. Kulik, and E. Tanin. Privacy-aware traffic monitoring. IEEE Transactions on Intelligent Transportation Systems , 11(1):61–70, 2010

  23. [23]

    H. Xie, E. Tanin, and L. Kulik. Distributed histograms for processing aggregate data from moving objects. In Mobile Data Management, 2007 International Conference on , pages 152–157. IEEE, 2007

  24. [24]

    H. Xie, E. Tanin, L. Kulik, P. Scheuermann, G. Trajcevski, and M. Fanaeepour. Euler histogram tree: A spatial data structure for aggregate range queries on vehicle trajectories. In Proceedings of the 7th ACM SIGSPATIAL International Workshop on Computational Transportation Science, pages 18–24. ACM, 2014

  25. [25]

    F. Xu, Z. Tu, Y . Li, P. Zhang, X. Fu, and D. Jin. Trajectory recovery from ash: User privacy is not preserved in aggregated mobility data. In Proceedings of the 26th International Conference on World Wide Web , pages 1241–1250. International World Wide Web Conferences Steering Committee, 2017. APPENDIX A PROOF OF THEOREM III.2 Let T RU be the cost of tru...