LINSCAN -- A Linearity Based Clustering Algorithm
Pith reviewed 2026-05-24 00:11 UTC · model grok-4.3
The pith
LINSCAN detects lineated clusters that are spatially close but orthogonally oriented by embedding points as local normal distributions and using a KL-divergence distance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By embedding points as normal distributions approximating their local neighborhoods and leveraging a distance function derived from the Kullback-Leibler Divergence, LINSCAN can detect and distinguish lineated clusters that are spatially close but have orthogonal covariances.
What carries the argument
Embedding each point as a normal distribution fitted to its local neighborhood, paired with a distance derived from the Kullback-Leibler Divergence between those distributions.
If this is right
- The algorithm recovers intersecting faults and their orientations from seismic data.
- It extends density-based clustering while aiming to keep the stability properties of DBSCAN and OPTICS.
- Any successful generalization of DBSCAN or OPTICS must satisfy specific stability conditions stated in the paper.
Where Pith is reading between the lines
- The same local-normal embedding could be tested on other data with crossing linear features such as road networks or filamentary structures in images.
- Replacing the fixed neighborhood size with an adaptive scale might improve performance on lines of varying density.
- The KL distance could be swapped for other divergences between Gaussians to check whether separation of orthogonal directions remains robust.
Load-bearing premise
Local neighborhoods can be approximated by normal distributions whose covariances capture linear direction well enough for the KL distance to separate orthogonal clusters without extra tuning.
What would settle it
Run LINSCAN on a synthetic dataset of two thin, crossing line segments whose centers are close and whose directions are orthogonal; failure to recover them as separate clusters would falsify the separation claim.
Figures
read the original abstract
DBSCAN and OPTICS are powerful algorithms for identifying clusters of points in domains where few assumptions can be made about the structure of the data. In this paper, we leverage these strengths and introduce a new algorithm, LINSCAN, designed to seek lineated clusters that are difficult to find and isolate with existing methods. In particular, by embedding points as normal distributions approximating their local neighborhoods and leveraging a distance function derived from the Kullback Leibler Divergence, LINSCAN can detect and distinguish lineated clusters that are spatially close but have orthogonal covariances. We demonstrate how LINSCAN can be applied to seismic data to identify active faults, including intersecting faults, and determine their orientation. Finally, we discuss the properties a generalization of DBSCAN and OPTICS must have in order to retain the stability benefits of these algorithms.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces LINSCAN as an extension of DBSCAN and OPTICS for detecting lineated clusters. Points are embedded as normal distributions approximating local neighborhoods, and a distance derived from the Kullback-Leibler Divergence is used to distinguish clusters that are spatially close but have orthogonal covariances. The method is applied to seismic data to identify active faults including intersecting ones, and the paper discusses the properties needed for generalizations of DBSCAN/OPTICS to retain stability benefits.
Significance. If the local-normal embedding and KL-distance construction can be shown to separate orthogonal lineated structures without neighborhood contamination or hidden parameters, the approach would be significant for applications such as fault detection in seismology where standard density-based methods fail on intersecting linear features. The explicit focus on parameter-free operation and stability inheritance from DBSCAN/OPTICS is a constructive framing.
major comments (2)
- [Abstract] Abstract: the central claim that normal-distribution embeddings of local neighborhoods yield covariances whose principal axes reflect only one lineated cluster (enabling reliable KL-based separation of orthogonal structures) is load-bearing yet unsupported by any derivation, definition of the neighborhood, or pseudocode; without these the claim cannot be checked against the contamination mode in which boundary points mix points from spatially close orthogonal clusters.
- [Final section] Final section (stability discussion): the claimed retention of DBSCAN/OPTICS stability is not shown to survive the case of intersecting or spatially adjacent lineated clusters, where any fixed or adaptive neighborhood produces a mixture covariance; the closed-form KL divergence between such mixtures does not automatically encode orthogonality, leaving the motivating use-case unaddressed.
minor comments (1)
- The manuscript would be strengthened by the addition of explicit pseudocode for the embedding step and the KL-distance computation, together with at least one small synthetic example showing the covariance matrices and resulting distances for two orthogonal lines.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. We address each major point below and indicate the revisions that will be made to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim that normal-distribution embeddings of local neighborhoods yield covariances whose principal axes reflect only one lineated cluster (enabling reliable KL-based separation of orthogonal structures) is load-bearing yet unsupported by any derivation, definition of the neighborhood, or pseudocode; without these the claim cannot be checked against the contamination mode in which boundary points mix points from spatially close orthogonal clusters.
Authors: We agree that the abstract claim is central and that the current manuscript does not supply an explicit derivation, a precise definition of the neighborhood used for the normal embedding, or pseudocode. These omissions make it difficult to verify the claim against boundary contamination. We will revise the manuscript to add (i) a formal definition of the local neighborhood, (ii) a short derivation showing that the fitted covariance principal axis aligns with a single lineated structure under the stated assumptions, (iii) pseudocode for the embedding step, and (iv) a brief discussion of how boundary points are treated to limit mixing from orthogonal clusters. revision: yes
-
Referee: [Final section] Final section (stability discussion): the claimed retention of DBSCAN/OPTICS stability is not shown to survive the case of intersecting or spatially adjacent lineated clusters, where any fixed or adaptive neighborhood produces a mixture covariance; the closed-form KL divergence between such mixtures does not automatically encode orthogonality, leaving the motivating use-case unaddressed.
Authors: We acknowledge that the stability discussion in the final section is framed at a general level and does not contain a specific analysis of mixture covariances arising from intersecting or adjacent lineated clusters. While the seismic application demonstrates successful separation of intersecting faults, this is empirical rather than a theoretical guarantee that the KL-based distance preserves orthogonality under mixture covariances. We will expand the final section to (a) explicitly consider the mixture case, (b) clarify the scope of the inherited stability claim, and (c) either provide additional reasoning or note the limitation and rely on the empirical results for the motivating use-case. revision: partial
Circularity Check
No circularity; derivation is self-contained extension of DBSCAN/OPTICS
full rationale
No equations, fitted parameters, or self-citations are exhibited in the provided text that would reduce the claimed KL-based distance or local normal embeddings to a definitionally equivalent input. The algorithm is presented as a direct construction using standard KL divergence on covariance approximations, with stability properties argued from generalization of existing methods rather than internal redefinition. The reader's assessment of score 2.0 aligns with absence of load-bearing reductions.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Local neighborhoods of points can be approximated by normal distributions whose covariance matrices capture linear structure
- domain assumption A distance derived from Kullback-Leibler divergence between these distributions distinguishes clusters with orthogonal covariances
Reference graph
Works this paper leans on
-
[1]
M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, KDD ’96, AAAI Press, 1996, p. 226–231
work page 1996
-
[2]
Y . Fialko, Estimation of absolute stress in the hypocentral region of the 2019 Ridgecrest, California, earthquakes, J. Geophys. Res. 126 (2021) e2021JB022000
work page 2019
-
[3]
X. Zou, Y . Fialko, A. Dennehy, A. Cloninger, S. Semnani, High-angle active conjugate faults in the Anza-Borrego shear zone, Southern California, Geophys. Res. Lett. 50 (2023) e2023GL105783
work page 2023
-
[4]
D. R. Shelly, R. J. Skoumal, J. L. Hardebeck, Fracture-mesh faulting in the swarm-like 2020 Maacama sequence revealed by high-precision earthquake detection, location, and focal mechanisms, Geophys. Res. Lett. 50 (2023) e2022GL101233
work page 2020
-
[5]
Barden, Stresses and displacements in a cross-anisotropic soil, Geotechnique 13 (3) (1963) 198–210
L. Barden, Stresses and displacements in a cross-anisotropic soil, Geotechnique 13 (3) (1963) 198–210
work page 1963
-
[6]
E. H. Isaaks, M. Srivastava, Applied geostatistics (1989)
work page 1989
-
[7]
G. Mai, K. Janowicz, Y . Hu, S. Gao, Adcn: An anisotropic density-based clustering algorithm for discovering spatial point patterns with noise, Transactions in GIS 22 (1) (2018) 348–369
work page 2018
-
[8]
E. S. Cochran, R. J. Skoumal, D. McPhillips, Z. E. Ross, K. M. Keranen, Activation of optimally and unfavourably oriented faults in a uniform local stress field during the 2011 Prague, Oklahoma, sequence, Geophys. J. Int. 222 (2020) 153–168. 8
work page 2011
-
[9]
G. Mai, K. Janowicz, Y . Hu, S. Gao, Adcn: An anisotropic density-based clustering algorithm, in: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, SIGSPACIAL ’16, Association for Computing Machinery, New York, NY , USA, 2016.doi:10.1145/2996913.2996940
-
[10]
M. Ankerst, M. M. Breunig, H.-P. Kriegel, J. Sander, Optics: Ordering points to identify the clustering structure, SIGMOD Rec. 28 (2) (1999) 49–60
work page 1999
-
[11]
M. A. Khamsi, W. A. Kirk, An introduction to metric spaces and fixed point theory, 304 pp., John Wiley & Sons, 2011
work page 2011
- [12]
- [13]
-
[14]
L. Hubert, P. Arabie, Comparing partitions, Journal of Classification 2 (1) (1985) 193–218. Appendix A. Approximation of KL(P|Q) First, log |A| is the logarithm of the product of the eigenvalues ofA, which is the same as the sum of the logarithms of the eigenvalues. Therefore, log |A| = tr(log(A)) where log(A) is the matrix logarithm, which exists and is ...
work page 1985
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.