pith. sign in

arxiv: 1907.04420 · v2 · pith:5FSUMD37new · submitted 2019-07-09 · 💻 cs.CG

Sublinear data structures for short Fr\'echet queries

Pith reviewed 2026-05-24 23:49 UTC · model grok-4.3

classification 💻 cs.CG
keywords Fréchet distancedata structurescurvesapproximationdoubling spacesnearest neighborsstreaming
0
0 comments X

The pith

Approximate distance oracles for discrete Fréchet distance on curves can have space independent of input curve length m.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies data structures for curves under discrete Fréchet distance in an asymmetric setting: stored curves have complexity m while query curves have complexity k much smaller than m. It shows that (1+ε)-approximate distance oracles exist whose space depends only on k, ε and constant dimension d rather than total input size N. The oracles answer queries deterministically in O(k²) time and can be maintained dynamically under a streaming input model. The techniques also produce an exponential improvement in the dependence on m for approximate nearest-neighbor structures over sets of curves.

Core claim

There exist deterministic (1+ε)-approximate distance oracles for the discrete Fréchet distance between a stored curve of complexity m and a query curve of complexity k that use space O((k log(ε^{-1}) ε^{-d})^k) in Euclidean R^d, answer queries in O(k²) time, and can be maintained in the stream with polylogarithmic extra memory while supporting queries in O(k⁴ log² m) time.

What carries the argument

The asymmetric distance oracle that stores a compressed version of each input curve whose size depends only on the query complexity k.

If this is right

  • Distance queries can be answered in time O(k²) independent of m.
  • The oracle can be maintained dynamically when the input arrives as a stream, using only polylogarithmic extra memory.
  • Approximate nearest-neighbor data structures for sets of curves achieve an exponential improvement in their dependence on input-curve complexity m.
  • All bounds hold for curves embedded in Euclidean R^d with constant d.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The streaming maintenance result implies that similarity queries can be answered against the prefix of a trajectory seen so far without storing the entire prefix.
  • The same compression technique may be combinable with other sketching methods to handle slowly growing k relative to m.
  • The independence from m suggests that trajectory databases could be indexed once for a fixed query length k and reused across many different stored curves.

Load-bearing premise

Query curves must have complexity k much smaller than the stored curves' complexity m, and the ambient space must be doubling such as constant-dimensional Euclidean space.

What would settle it

An explicit family of input curves of length m for which every (1+ε)-approximate oracle for k-length queries requires space that grows with m would falsify the independence from input size.

Figures

Figures reproduced from arXiv: 1907.04420 by Anne Driemel, Ioannis Psarros, Melanie Schmidt.

Figure 1
Figure 1. Figure 1: Exponential grid construction around point [PITH_FULL_IMAGE:figures/full_fig_p037_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Example of a curve X (black), its simplification Xk (orange) and a query curve q (blue) with distance r = ddF (X, q) indicated by the radius of the blue disks. 38 [PITH_FULL_IMAGE:figures/full_fig_p039_2.png] view at source ↗
read the original abstract

We study metric data structures for curves in doubling spaces, such as trajectories of moving objects in Euclidean $\mathbb{R}^d$, where the distance between two curves is measured using the discrete Fr\'echet distance. We design data structures in an \emph{asymmetric} setting where the input is a curve (or a set of $n$ curves) each of complexity $m$ and the queries are with curves of complexity $k\ll m$. We show that there exist approximate data structures that are independent of the input size $N = d \cdot n \cdot m$ and we study how to maintain them dynamically if the input is given in the stream. Concretely, we study two types of data structures: (i) distance oracles, where the task is to store a compressed version of the input curve, which can be used to answer queries for the distance of a query curve to the input curve, and (ii) nearest-neighbor data structures, where the task is to preprocess a set of input curves to answer queries for the input curve closest to the query curve. In both cases we are interested in approximation. For curves embedded in Euclidean $\mathbb{R}^d$ with constant $d$, our distance oracle uses space in $\mathcal{O}((k \log(\epsilon^{-1}) \epsilon^{-d})^k)$ ($\epsilon$ is the precision parameter). The oracle performs $(1+\epsilon)$-approximate queries in time in $\mathcal{O}(k^2)$ and is deterministic. We show how to maintain this distance oracle in the stream using polylogarithmic additional memory. In the stream, we can dynamically answer distance queries to the portion of the stream seen so far in $\mathcal{O}(k^4 \log^2 m)$ time. We apply our techniques to the second problem, approximate near neighbor (ANN) data structures, and achieve an exponential improvement in the dependency on the complexity of the input curves compared to the state of the art.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript claims to construct (1+ε)-approximate distance oracles and nearest-neighbor data structures for the discrete Fréchet distance on curves in Euclidean R^d (d constant). In the asymmetric setting, input curves have complexity m while queries have complexity k ≪ m; the distance oracle uses space O((k log(ε^{-1}) ε^{-d})^k), answers queries in O(k^2) time, is deterministic, and extends to a streaming/dynamic setting with polylog additional memory and O(k^4 log^2 m) query time. The techniques also yield an exponential improvement in the dependence on m for approximate near-neighbor structures relative to prior work.

Significance. If the claimed constructions and bounds hold, the work is significant for metric data structures on trajectories: it achieves space independent of the total input size N = d·n·m, which is essential for large-scale curve datasets. The deterministic guarantee, the polylog-space streaming maintenance, and the exponential improvement for ANN are concrete strengths that advance the state of the art in asymmetric Fréchet query problems.

minor comments (2)
  1. [§1] §1 (Introduction): the definition of the discrete Fréchet distance is referenced but not restated; adding the standard recursive definition would improve readability for readers outside computational geometry.
  2. [Abstract / Theorem 1] The space bound O((k log(ε^{-1}) ε^{-d})^k) is stated for constant d; the dependence on d should be made explicit in the theorem statement even if d is treated as constant.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper presents algorithmic constructions for asymmetric distance oracles and ANN data structures under discrete Fréchet distance in doubling spaces. Space bounds O((k log(ε^{-1}) ε^{-d})^k) and query times are obtained by direct analysis of the proposed compressed representations and dynamic maintenance procedures in the streaming model. These are explicit algorithmic results depending on input parameters k, ε, d and m, with no fitted parameters, self-definitional reductions, or load-bearing self-citations that collapse the claimed bounds back to the inputs by construction. The derivation chain consists of standard algorithmic analysis and is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The abstract relies on standard properties of doubling metric spaces and the definition of discrete Fréchet distance; no explicit free parameters beyond the user-chosen approximation ε are named.

free parameters (1)
  • epsilon
    User-chosen approximation parameter appearing in the space bound.
axioms (1)
  • domain assumption Ambient space is a doubling metric space (explicitly Euclidean R^d with constant d)
    Invoked to obtain the concrete space bound O((k log(ε^{-1}) ε^{-d})^k).

pith-pipeline@v0.9.0 · 5904 in / 1333 out tokens · 18322 ms · 2026-05-24T23:49:50.628684+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 3 internal anchors

  1. [1]

    M. A. Abam, M. de Berg, P. Hachenberger, and A. Zarei. Streaming algorithms for line simplification. Discrete & Computational Geometry , 43(3):497–515, Apr 2010. URL: https://doi.org/10. 1007/s00454-008-9132-4 , doi:10.1007/s00454-008-9132-4

  2. [2]

    A Geometric Heuristic for Rectilinear Crossing Minimization

    Peyman Afshani and Anne Driemel. On the complexity of range searching among curves. In Proceed- ings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018 , pages 898–917, 2018. URL: https://doi.org/10.1137/1. 9781611975031.58, doi:10.1137/1.9781611975031.58

  3. [3]

    Computing the Fréchet distance between two polygonal curves

    Helmut Alt and Michael Godau. Computing the Fréchet distance between two polygonal curves. International Journal of Computational Geometry & Applications, 05:75–91, 1995. doi:10.1142/ S0218195995000064

  4. [4]

    Mount, Antoine Vigneron, and Jian Xia

    Sunil Arya, David M. Mount, Antoine Vigneron, and Jian Xia. Space-time tradeoffs for proximity searching in doubling spaces. In Algorithms - ESA 2008, 16th Annual European Symposium, Karl- sruhe, Germany, September 15-17, 2008. Proceedings, pages 112–123, 2008. URL: https://doi. org/10.1007/978-3-540-87744-8_10 , doi:10.1007/978-3-540-87744-8\_10

  5. [5]

    Multi- resolution sketches and locality sensitive hashing for fast trajectory processing

    Maria Astefanoaei, Paul Cesaretti, Panagiota Katsikouli, Mayank Goswami, and Rik Sarkar. Multi- resolution sketches and locality sensitive hashing for fast trajectory processing. In International Con- ference on Advances in Geographic Information Systems (SIGSPATIAL 2018), volume 10, 2018

  6. [6]

    A fast implementation of near neighbors queries for Fréchet dis- tance (GIS Cup)

    Julian Baldus and Karl Bringmann. A fast implementation of near neighbors queries for Fréchet dis- tance (GIS Cup). In Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, SIGSPATIAL’17, pages 99:1–99:4, 2017

  7. [7]

    Bentley and James B

    Jon L. Bentley and James B. Saxe. Decomposable searching problems i: Static-to-dynamic transfor- mation. Journal of Algorithms, 1(4):301 – 358, 1980

  8. [8]

    Simplifying 3d polygo- nal chains under the discrete fréchet distance

    Sergey Bereg, Minghui Jiang, Wencheng Wang, Boting Yang, and Binhai Zhu. Simplifying 3d polygo- nal chains under the discrete fréchet distance. InProceedings of the 8th Latin American Conference on Theoretical Informatics, LATIN’08, pages 630–641, Berlin, Heidelberg, 2008. Springer-Verlag. URL: http://dl.acm.org/citation.cfm?id=1792918.1792972

  9. [9]

    The One-Way Communication Complexity of Dynamic Time Warping Distance

    Vladimir Braverman, Moses Charikar, William Kuszmaul, David P. Woodruff, and Lin F. Yang. The one-way communication complexity of dynamic time warping distance.CoRR, abs/1903.03520, 2019. arXiv:1903.03520. 32

  10. [10]

    Why walking the dog takes time: Frechet distance has no strongly subquadratic algo- rithms unless seth fails

    Karl Bringmann. Why walking the dog takes time: Frechet distance has no strongly subquadratic algo- rithms unless seth fails. In Proceedings of the 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, FOCS ’14, pages 661–670, Washington, DC, USA, 2014. IEEE Computer Society. URL: http://dx.doi.org/10.1109/FOCS.2014.76, doi:10.1109/FOCS.2014.76

  11. [11]

    Efficient trajectory queries under the Fréchet distance (GIS Cup)

    Kevin Buchin, Yago Diez, Tom van Diggelen, and Wouter Meulemans. Efficient trajectory queries under the Fréchet distance (GIS Cup). In Proc. 25th Int. Conference on Advances in Geographic Information Systems (SIGSPATIAL), pages 101:1–101:4, 2017

  12. [12]

    Incremental clustering and dynamic information retrieval

    Moses Charikar, Chandra Chekuri, Tomás Feder, and Rajeev Motwani. Incremental clustering and dynamic information retrieval. SIAM Journal on Computing, 33(6):1417–1440, 2004

  13. [13]

    Searching dynamic point sets in spaces with bounded doubling dimension

    Richard Cole and Lee-Ad Gottlieb. Searching dynamic point sets in spaces with bounded doubling dimension. In Proceedings of the Thirty-eighth Annual ACM Symposium on Theory of Computing , STOC ’06, pages 574–583, New York, NY , USA, 2006. ACM. URL: http://doi.acm.org/ 10.1145/1132516.1132599, doi:10.1145/1132516.1132599

  14. [14]

    Fast Fréchet queries

    Mark De Berg, Atlas F Cook, and Joachim Gudmundsson. Fast Fréchet queries. Computational Geometry, 46(6):747–755, 2013

  15. [15]

    Mark de Berg and Ali D. Mehrabi. Straight-path queries in trajectory data. In WALCOM: Al- gorithms and Computation - 9th Int. Workshop, WALCOM 2015, Dhaka, Bangladesh, February 26-28, 2015. Proceedings , pages 101–112, 2015. URL: http://dx.doi.org/10.1007/ 978-3-319-15612-5_10 , doi:10.1007/978-3-319-15612-5_10

  16. [16]

    Driemel and S

    A. Driemel and S. Har-Peled. Jaywalking your dog: Computing the Fréchet distance with short- cuts. SIAM Journal on Computing, 42(5):1830–1866, 2013. URL:https://doi.org/10.1137/ 120865112

  17. [17]

    Clustering time series under the Fréchet dis- tance

    Anne Driemel, Amer Krivošija, and Christian Sohler. Clustering time series under the Fréchet dis- tance. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA , pages 766–785, 2016. URL: http://dx.doi.org/10.1137/1.9781611974331.ch55, doi:10.1137/1.9781611974331.ch55

  18. [18]

    Phillips, and Ioannis Psarros

    Anne Driemel, Jeff M. Phillips, and Ioannis Psarros. The VC dimension of metric balls under Fréchet and Hausdorff distances. In Proc. 35st International Symposium on Computational Geometry , pages 28:2–28:16, 2019

  19. [19]

    Locally-sensitive hashing of curves

    Anne Driemel and Francesco Silvestri. Locally-sensitive hashing of curves. In Proc. 33st International Symposium on Computational Geometry, pages 37:1–37:16, 2017

  20. [20]

    A filter-and-refinement- algorithm for range queries based on the Fréchet distance (GIS Cup)

    Fabian Dütsch and Jan Vahrenhold. A filter-and-refinement- algorithm for range queries based on the Fréchet distance (GIS Cup). In Proc. 25th Int. Conference on Advances in Geographic Information Systems (SIGSPATIAL), pages 100:1–100:4, 2017

  21. [21]

    Computing discrete Fréchet distance

    Thomas Eiter and Heikki Mannila. Computing discrete Fréchet distance. Technical Report CD-TR 94/64, Information Systems Department, Technical University of Vienna, 1994

  22. [22]

    Emiris and Ioannis Psarros

    Ioannis Z. Emiris and Ioannis Psarros. Products of Euclidean metrics and applications to proximity questions among curves. In Proc. 34th Int. Symposium on Computational Geometry (SoCG) , vol- ume 99 of LIPIcs, pages 37:1–37:13, 2018. 33

  23. [23]

    Arnold Filtser, Omrit Filtser, and Matthew J. Katz. Approximate nearest neighbor for curves - sim- ple, efficient, and deterministic. CoRR, abs/1902.07562, 2019. URL: http://arxiv.org/abs/ 1902.07562, arXiv:1902.07562

  24. [24]

    A natural metric for curves - computing the distance for polygonal chains and approx- imation algorithms

    Michael Godau. A natural metric for curves - computing the distance for polygonal chains and approx- imation algorithms. In Proceedings of the 8th Annual Symposium on Theoretical Aspects of Computer Science (STACS), pages 127–136, 1991

  25. [25]

    Fast algorithms for approximate Fréchet matching queries in geometric trees

    Joachim Gudmundsson and Michiel Smid. Fast algorithms for approximate Fréchet matching queries in geometric trees. Computational Geometry , 48(6):479 – 494, 2015. URL: http://www. sciencedirect.com/science/article/pii/S0925772115000164, doi:http:// dx.doi.org/10.1016/j.comgeo.2015.02.003

  26. [26]

    Geometric Approximation Algorithms

    Sariel Har-Peled. Geometric Approximation Algorithms . American Mathematical Society, Boston, MA, USA, 2011

  27. [27]

    Approximate nearest neighbor: Towards removing the curse of dimensionality.Theory of Computing, 8(1):321–350, 2012

    Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality.Theory of Computing, 8(1):321–350, 2012. URL: https://doi.org/ 10.4086/toc.2012.v008a014, doi:10.4086/toc.2012.v008a014

  28. [28]

    On coresets for k-means and k-median clustering

    Sariel Har-Peled and Soham Mazumdar. On coresets for k-means and k-median clustering. In Pro- ceedings of the 36th ACM Symposium on Theory of Computing (STOC), pages 291 – 300, 2004

  29. [29]

    Fast construction of nets in low-dimensional metrics and their applications

    Sariel Har-Peled and Manor Mendel. Fast construction of nets in low-dimensional metrics and their applications. SIAM J. Comput. , 35(5):1148–1184, 2006. URL: https://doi.org/10.1137/ S0097539704446281, doi:10.1137/S0097539704446281

  30. [30]

    Polygonal approximations of a curve – formulations and algorithms

    Hiroshi Imai and Masao Iri. Polygonal approximations of a curve – formulations and algorithms. Machine Intelligence and Pattern Recognition, 6:71–86, 12 1988

  31. [31]

    Approximate nearest neighbor algorithms for Fréchet distance via product metrics

    Piotr Indyk. Approximate nearest neighbor algorithms for Fréchet distance via product metrics. In Symposium on Computational Geometry, pages 102–106, 2002

  32. [32]

    Robert Krauthgamer and James R. Lee. Navigating nets: simple algorithms for proximity search. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004, New Orleans, Louisiana, USA, January 11-14, 2004 , pages 798–807, 2004. URL: http://dl. acm.org/citation.cfm?id=982792.982913

  33. [33]

    Coresets-methods and history: A theoreticians design pattern for approximation and streaming algorithms

    Alexander Munteanu and Chris Schwiegelshohn. Coresets-methods and history: A theoreticians design pattern for approximation and streaming algorithms. KI - Künstliche Intelligenz , 32(1):37– 53, Feb 2018. URL: https://doi.org/10.1007/s13218-017-0519-3 , doi:10.1007/ s13218-017-0519-3

  34. [34]

    Muthukrishnan

    S. Muthukrishnan. Data streams: Algorithms and applications. Foundations and Trends in Theoretical Computer Science, 1(2):117 – 236, 2005

  35. [35]

    Coresets and Sketches

    Jeff M Phillips. Coresets and sketches. arXiv preprint arXiv:1601.00617, 2016

  36. [36]

    Nearest-Neighbor Methods in Learning and Vision: Theory and Practice (Neural Information Processing)

    Gregory Shakhnarovich, Trevor Darrell, and Piotr Indyk. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice (Neural Information Processing). The MIT Press, 2006

  37. [37]

    On Optimal Polyline Simplification using the Hausdorff and Fr\'echet Distance

    Marc van Kreveld, Maarten Löffler, and Lionov Wiratma. On optimal polyline simplification using the Hausdorff and Fréchet distance. arXiv preprint arXiv:1803.03550, 2018. 34

  38. [38]

    ACM SIGSPATIAL GIS Cup 2017: Range queries under Fréchet dis- tance

    Martin Werner and Dev Oliver. ACM SIGSPATIAL GIS Cup 2017: Range queries under Fréchet dis- tance. SIGSPATIAL Special, 10(1):24–27, June 2018. URL: http://doi.acm.org/10.1145/ 3231541.3231549, doi:10.1145/3231541.3231549. 35 A Offline Construction of the Distance Oracle (missing proofs) First we show the following lemma for exponential grids for points. Le...