Repair Entropy in Dynamic Geometric Nearest-Neighbour Structures

Bugra Kilictas; Faruk Alpay

arxiv: 2606.18314 · v1 · pith:O6GPIT4Dnew · submitted 2026-06-16 · 💻 cs.CG

Repair Entropy in Dynamic Geometric Nearest-Neighbour Structures

Faruk Alpay , Bugra Kilictas This is my paper

Pith reviewed 2026-06-26 21:55 UTC · model grok-4.3

classification 💻 cs.CG

keywords repair frontiernearest neighbor maintenancedynamic geometric structurescertificate validityentropy workloadevent-driven repairmotion maintenancetriangle inequality bound

0 comments

The pith

A triangle-inequality bound confines nearest-neighbor certificate failures to a repair frontier whose entropy selects the cheapest maintenance strategy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

After each small motion step of maximum size ε, a triangle-inequality argument shows that any certificate whose clearance exceeds 4ε stays valid, so every possible invalid certificate lies inside a repair frontier F_t. The paper defines repair-frontier entropy H(F_t) as the normalized Shannon entropy of those failed certificates across the cells of the index. This entropy value serves as a workload descriptor that decides whether to repair only the frontier, batch the repairs, or rebuild the entire structure. The resulting rule runs in O(|F_t| log N) time when cells have bounded occupancy, while a full rebuild costs Θ(N). Experiments across ten motion families in two and three dimensions confirm that the validity rule detects every failure and that entropy correlates with the observed cost crossover between the three strategies.

Core claim

We prove that certificates with clearance c_i > 4ε remain valid after a displacement of size at most ε. All possible failures are therefore confined to the repair frontier F_t. Repair-frontier entropy H(F_t), the normalized Shannon entropy of failed certificates over index cells, lower-bounds the number of cells touched by event-driven repair and shifts the empirical point at which repair becomes cheaper than rebuild. The maintenance rule that follows from this observation repairs only F_t in O(|F_t| log N) time under bounded cell occupancy.

What carries the argument

repair-frontier entropy H(F_t), the normalized Shannon entropy of failed certificates distributed over index cells, used to choose among event-driven repair, batched repair, and full rebuild.

If this is right

The maintenance rule repairs only the frontier in O(|F_t| log N) time under bounded cell occupancy.
Entropy lower-bounds the number of frontier cells touched by event-driven repair.
Entropy shifts the empirical repair-rebuild crossover observed in the experiments.
Low-pressure frontiers are usually cheaper to repair incrementally than to rebuild.
Diffuse frontiers of equal size cost more for event-driven repair but not for batched repair.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same entropy descriptor could be applied to other dynamic geometric structures whose certificates admit a comparable validity radius.
If cell occupancy is allowed to grow, the repair-time bound would require an additional rebalancing step whose cost must be folded into the entropy calculation.
The released dataset of 2400 labelled transitions supplies an immediate testbed for alternative entropy definitions or for higher-dimensional motion families.
In settings where the motion model itself is uncertain, the frontier entropy could serve as an online signal to adapt the choice of strategy without prior knowledge of the motion family.

Load-bearing premise

The underlying index structure keeps a bounded number of points per cell, which is required for the claimed linear-in-frontier repair cost.

What would settle it

A single observed certificate with clearance greater than 4ε that becomes invalid after a motion step of size ε, or a measured repair time that grows faster than O(|F_t| log N) on an index with bounded cell occupancy.

Figures

Figures reproduced from arXiv: 2606.18314 by Bugra Kilictas, Faruk Alpay.

**Figure 2.** Figure 2: Same failure count, different geometry. Top: spatial maps of two transitions with near [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

read the original abstract

We study dynamic geometric data structures for exact nearest-neighbour maintenance under small motions. For each point we store a certificate consisting of its nearest neighbour and the two smallest neighbour distances, with clearance $c_i=d^i_2-d^i_1$. A triangle-inequality argument gives a sharp validity radius: after a step of maximum displacement $\varepsilon$, every certificate with $c_i>4\varepsilon$ remains valid, so all possible failures are confined to a repair frontier $F_t$. We introduce repair-frontier entropy $H(F_t)$, the normalized Shannon entropy of failed certificates over index cells, as a workload descriptor for choosing between event-driven repair, batched repair, and full rebuild. The resulting maintenance rule repairs only the frontier in $O(|F_t|\log N)$ time under bounded cell occupancy, while a full rebuild costs $\Theta(N)$; moreover, entropy lower-bounds the number of frontier cells touched by event-driven repair and shifts the empirical repair-rebuild crossover. We evaluate ten motion families in $d\in{2,3}$, with $N$ up to $16,000$, using an exact tiled GPU oracle and a GPU grid rebuild as ground truth and competitor. Across $2400$ labelled transitions, the validity rule misses no invalid certificate, low-pressure frontiers are usually cheaper to repair incrementally, and diffuse frontiers of the same size are more expensive for event-driven repair but not for batched repair. The released dataset records frontier geometry, certificate audits, per-strategy times, and best-strategy labels.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces repair-frontier entropy as a workload descriptor for choosing repair versus rebuild in dynamic NN structures, with a clean triangle-inequality validity argument and empirical checks, but the O(|F_t| log N) claim rests on an unproven bounded-occupancy assumption.

read the letter

The main thing to know is that this work defines repair-frontier entropy H(F_t) as a way to describe workload and decide between event-driven repair, batched repair, and full rebuild for exact nearest-neighbor maintenance under small motions.

They store per-point certificates with clearance c_i and use a triangle-inequality argument to show that after a move of size at most ε, any certificate with c_i > 4ε stays valid. All possible failures are therefore confined to a frontier F_t. That step is direct and looks correct. They then compute normalized Shannon entropy over the failed certificates across index cells and show that this value lower-bounds the number of cells touched by repair and shifts the empirical crossover point where repair beats rebuild.

The experiments run ten motion families in 2D and 3D, N up to 16,000, across 2400 labelled transitions against an exact GPU oracle. The validity rule missed no invalid certificates in those runs, and low-pressure frontiers were usually cheaper to repair incrementally. The dataset release with frontier geometry and per-strategy times is a plus.

The soft spot is the time bound. The paper states O(|F_t| log N) repair time under bounded cell occupancy, but occupancy is treated as an external assumption on the index rather than derived from the motion model or certificate clearances. If any motion family produces cells with occupancy that grows with N or local density, the per-certificate cost inside a cell becomes linear and the claimed advantage over Θ(N) rebuild disappears. That assumption is load-bearing for the asymptotic claim.

The entropy concept itself does not appear in the cited prior work and is presented with concrete empirical support rather than a fitted parameter. This paper is for people already working on dynamic geometric data structures. It deserves a serious referee because the central validity argument and the new descriptor are testable and the experiments are concrete, even though the occupancy issue needs addressing in revision.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that for exact nearest-neighbor maintenance under small motions, certificates with clearance c_i = d_2^i - d_1^i admit a triangle-inequality validity radius of 4ε: after maximum displacement ε, all failures are confined to a repair frontier F_t. It introduces repair-frontier entropy H(F_t) (normalized Shannon entropy of failed certificates over index cells) as a workload descriptor that lower-bounds cells touched by event-driven repair and shifts the empirical repair-rebuild crossover. The maintenance rule repairs only F_t in O(|F_t| log N) time under bounded cell occupancy (versus Θ(N) rebuild), validated on ten motion families in d=2,3 (N≤16k) across 2400 transitions with an exact tiled GPU oracle and GPU grid rebuild as ground truth; the dataset is released.

Significance. If the results hold, the work supplies a concrete workload descriptor (repair-frontier entropy) for deciding among event-driven, batched, and full-rebuild strategies in dynamic geometric structures. The sharp validity-radius argument and the public dataset of frontier geometry, certificate audits, and per-strategy timings are concrete strengths that enable reproducibility and further analysis. The empirical finding that diffuse frontiers of equal size are costlier for event-driven but not batched repair is a useful practical observation.

major comments (2)

[Maintenance rule description] Maintenance rule (time-bound paragraph): The O(|F_t| log N) claim for event-driven repair is explicitly conditioned on 'bounded cell occupancy'. No derivation is supplied from the motion model, the certificate clearances, or the triangle-inequality validity radius; occupancy is treated as an external structural assumption on the index. If any motion family produces cells whose occupancy grows with N or local density, per-certificate repair cost becomes linear in occupancy and the claimed asymptotic advantage over Θ(N) rebuild disappears. This assumption is load-bearing for the central efficiency claim.
[Entropy definition and evaluation] Entropy definition and empirical mapping (abstract and evaluation section): H(F_t) is computed directly from observed frontier failures and then used both to lower-bound touched cells and to shift the repair-rebuild crossover. While the paper presents the relation as empirical validation rather than a parameter-free derivation, the absence of a reduction showing that the entropy-to-cost mapping holds independently of the specific motion families leaves the predictive utility dependent on the 2400-transition dataset rather than a general theorem.

minor comments (2)

[Abstract and § on entropy] Notation for H(F_t): the precise normalization (base of logarithm, exact partitioning into cells) should be stated explicitly with an equation number so that the entropy values reported in the dataset can be recomputed from the raw frontier geometry.
[Evaluation figures] Figure clarity: the plots comparing repair versus rebuild cost versus |F_t| and versus H(F_t) would benefit from explicit indication of which motion families produce the high-entropy outliers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the two major comments. We respond point by point below, acknowledging where the manuscript relies on assumptions or empirical observation.

read point-by-point responses

Referee: The O(|F_t| log N) claim for event-driven repair is explicitly conditioned on 'bounded cell occupancy'. No derivation is supplied from the motion model, the certificate clearances, or the triangle-inequality validity radius; occupancy is treated as an external structural assumption on the index. If any motion family produces cells whose occupancy grows with N or local density, per-certificate repair cost becomes linear in occupancy and the claimed asymptotic advantage over Θ(N) rebuild disappears. This assumption is load-bearing for the central efficiency claim.

Authors: We agree that bounded cell occupancy is an external assumption on the index rather than derived from the motion model or validity radius. The O(|F_t| log N) bound is stated under this standard assumption for fixed-capacity cell indices. We will revise the maintenance-rule paragraph to state the assumption explicitly, note that it may fail under motion families inducing unbounded local density, and qualify the asymptotic claim accordingly. revision: yes
Referee: H(F_t) is computed directly from observed frontier failures and then used both to lower-bound touched cells and to shift the repair-rebuild crossover. While the paper presents the relation as empirical validation rather than a parameter-free derivation, the absence of a reduction showing that the entropy-to-cost mapping holds independently of the specific motion families leaves the predictive utility dependent on the 2400-transition dataset rather than a general theorem.

Authors: The entropy-to-cost mapping is presented as an empirical correlation observed across the 2400 transitions and ten motion families; no parameter-free reduction or general theorem is claimed. We will revise the evaluation section to emphasize the empirical nature of the predictive utility and to clarify that it is demonstrated within the tested motion families rather than asserted independently of them. revision: partial

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

full rationale

The paper derives the validity radius (c_i > 4ε) directly from a triangle-inequality argument on certificate clearances and displacements, defines the repair frontier F_t as the set of certificates that may fail under this radius, and introduces H(F_t) as the normalized Shannon entropy of the observed failure distribution over index cells. The O(|F_t| log N) repair time is stated only under the explicit external assumption of bounded cell occupancy, which is not derived from the motion model or certificates. The claim that entropy lower-bounds the number of touched cells follows from standard properties of Shannon entropy (high entropy requires support over multiple cells) rather than any reduction to fitted parameters or self-referential definitions. No self-citations, ansatzes smuggled via prior work, or renaming of known results appear in the load-bearing steps; the maintenance rule and crossover analysis rest on empirical evaluation across 2400 transitions rather than any derivation that collapses to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claims rest on the triangle inequality for distances and the bounded-occupancy assumption for the index; the entropy measure itself is an invented descriptor whose independent predictive power is tested empirically.

axioms (2)

standard math Triangle inequality holds for Euclidean distances
Invoked to derive the 4ε validity radius for certificates.
domain assumption The spatial index has bounded cell occupancy
Required to obtain the O(|F_t| log N) repair time bound.

invented entities (1)

repair-frontier entropy H(F_t) no independent evidence
purpose: Normalized Shannon entropy of failed certificates across index cells, used as workload descriptor
Newly defined quantity whose value is computed from the frontier and used to choose maintenance strategy.

pith-pipeline@v0.9.1-grok · 5805 in / 1462 out tokens · 27334 ms · 2026-06-26T21:55:37.148199+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

14 extracted references

[1]

Agarwal, Lars Arge, and Jeff Erickson

Pankaj K. Agarwal, Lars Arge, and Jeff Erickson. Indexing moving points.Journal of Computer and System Sciences, 66(1):207–243, 2003

2003
[2]

Mount, Nathan S

Sunil Arya, David M. Mount, Nathan S. Netanyahu, Ruth Silverman, and Angela Y. Wu. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions.Journal of the ACM, 45(6):891–923, 1998

1998
[3]

Guibas, and John Hershberger

Julien Basch, Leonidas J. Guibas, and John Hershberger. Data structures for mobile data. Journal of Algorithms, 31(1):1–28, 1999

1999
[4]

Multidimensional binary search trees used for associative searching.Com- munications of the ACM, 18(9):509–517, 1975

Jon Louis Bentley. Multidimensional binary search trees used for associative searching.Com- munications of the ACM, 18(9):509–517, 1975

1975
[5]

Jon Louis Bentley and James B. Saxe. Decomposable searching problems I: Static-to-dynamic transformation.Journal of Algorithms, 1(4):301–358, 1980

1980
[6]

A framework for generating network-based moving objects.GeoInformatica, 6(2):153–180, 2002

Thomas Brinkhoff. A framework for generating network-based moving objects.GeoInformatica, 6(2):153–180, 2002

2002
[7]

MLX: Efficient and flexible machine learning on Apple silicon

Awni Hannun, Jagrit Digani, Angelos Katharopoulos, and Ronan Collobert. MLX: Efficient and flexible machine learning on Apple silicon. https://github.com/ml-explore/mlx, 2023

2023
[8]

Billion-scale similarity search with GPUs

Jeff Johnson, Matthijs Douze, and Herv´ e J´ egou. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data, 7(3):535–547, 2021

2021
[9]

Maximizing parallelism in the construction of BVHs, octrees, and k-d trees

Tero Karras. Maximizing parallelism in the construction of BVHs, octrees, and k-d trees. In Proceedings of the Fourth ACM SIGGRAPH/Eurographics Conference on High-Performance Graphics (HPG), pages 33–37, 2012

2012
[10]

Fast BVH construction on GPUs.Computer Graphics Forum, 28(2):375–384, 2009

Christian Lauterbach, Michael Garland, Shubhabrata Sengupta, David Luebke, and Dinesh Manocha. Fast BVH construction on GPUs.Computer Graphics Forum, 28(2):375–384, 2009

2009
[11]

Kinetic data structures for all nearest neighbors and closest pair in the plane

Zahed Rahmati, Valerie King, and Sue Whitesides. Kinetic data structures for all nearest neighbors and closest pair in the plane. InProceedings of the 29th Annual Symposium on Computational Geometry (SoCG), pages 137–144, 2013

2013
[12]

Morgan Kaufmann, 2006

Hanan Samet.Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, 2006

2006
[13]

Roofline: An insightful visual performance model for multicore architectures.Communications of the ACM, 52(4):65–76, 2009

Samuel Williams, Andrew Waterman, and David Patterson. Roofline: An insightful visual performance model for multicore architectures.Communications of the ACM, 52(4):65–76, 2009. 8

2009
[14]

Real-time KD-tree construction on graphics hardware.ACM Transactions on Graphics, 27(5):1–11, 2008

Kun Zhou, Qiming Hou, Rui Wang, and Baining Guo. Real-time KD-tree construction on graphics hardware.ACM Transactions on Graphics, 27(5):1–11, 2008. A Proofs and analysis Astepdisplaces every point by Euclidean distance at most ε; ci = di 2 −d i 1 ≥ 0 is the clearance of pointibefore the step and nn i its nearest neighbour. We writed ′ for post-step dista...

2008

[1] [1]

Agarwal, Lars Arge, and Jeff Erickson

Pankaj K. Agarwal, Lars Arge, and Jeff Erickson. Indexing moving points.Journal of Computer and System Sciences, 66(1):207–243, 2003

2003

[2] [2]

Mount, Nathan S

Sunil Arya, David M. Mount, Nathan S. Netanyahu, Ruth Silverman, and Angela Y. Wu. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions.Journal of the ACM, 45(6):891–923, 1998

1998

[3] [3]

Guibas, and John Hershberger

Julien Basch, Leonidas J. Guibas, and John Hershberger. Data structures for mobile data. Journal of Algorithms, 31(1):1–28, 1999

1999

[4] [4]

Multidimensional binary search trees used for associative searching.Com- munications of the ACM, 18(9):509–517, 1975

Jon Louis Bentley. Multidimensional binary search trees used for associative searching.Com- munications of the ACM, 18(9):509–517, 1975

1975

[5] [5]

Jon Louis Bentley and James B. Saxe. Decomposable searching problems I: Static-to-dynamic transformation.Journal of Algorithms, 1(4):301–358, 1980

1980

[6] [6]

A framework for generating network-based moving objects.GeoInformatica, 6(2):153–180, 2002

Thomas Brinkhoff. A framework for generating network-based moving objects.GeoInformatica, 6(2):153–180, 2002

2002

[7] [7]

MLX: Efficient and flexible machine learning on Apple silicon

Awni Hannun, Jagrit Digani, Angelos Katharopoulos, and Ronan Collobert. MLX: Efficient and flexible machine learning on Apple silicon. https://github.com/ml-explore/mlx, 2023

2023

[8] [8]

Billion-scale similarity search with GPUs

Jeff Johnson, Matthijs Douze, and Herv´ e J´ egou. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data, 7(3):535–547, 2021

2021

[9] [9]

Maximizing parallelism in the construction of BVHs, octrees, and k-d trees

Tero Karras. Maximizing parallelism in the construction of BVHs, octrees, and k-d trees. In Proceedings of the Fourth ACM SIGGRAPH/Eurographics Conference on High-Performance Graphics (HPG), pages 33–37, 2012

2012

[10] [10]

Fast BVH construction on GPUs.Computer Graphics Forum, 28(2):375–384, 2009

Christian Lauterbach, Michael Garland, Shubhabrata Sengupta, David Luebke, and Dinesh Manocha. Fast BVH construction on GPUs.Computer Graphics Forum, 28(2):375–384, 2009

2009

[11] [11]

Kinetic data structures for all nearest neighbors and closest pair in the plane

Zahed Rahmati, Valerie King, and Sue Whitesides. Kinetic data structures for all nearest neighbors and closest pair in the plane. InProceedings of the 29th Annual Symposium on Computational Geometry (SoCG), pages 137–144, 2013

2013

[12] [12]

Morgan Kaufmann, 2006

Hanan Samet.Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann, 2006

2006

[13] [13]

Roofline: An insightful visual performance model for multicore architectures.Communications of the ACM, 52(4):65–76, 2009

Samuel Williams, Andrew Waterman, and David Patterson. Roofline: An insightful visual performance model for multicore architectures.Communications of the ACM, 52(4):65–76, 2009. 8

2009

[14] [14]

Real-time KD-tree construction on graphics hardware.ACM Transactions on Graphics, 27(5):1–11, 2008

Kun Zhou, Qiming Hou, Rui Wang, and Baining Guo. Real-time KD-tree construction on graphics hardware.ACM Transactions on Graphics, 27(5):1–11, 2008. A Proofs and analysis Astepdisplaces every point by Euclidean distance at most ε; ci = di 2 −d i 1 ≥ 0 is the clearance of pointibefore the step and nn i its nearest neighbour. We writed ′ for post-step dista...

2008