Subspace Determination through Local Intrinsic Dimensional Decomposition: Theory and Experimentation

Arthur Zimek; Imane Hafnaoui; Michael E. Houle; Pan Li; Ruben Becker

arxiv: 1907.06771 · v1 · pith:UPY6OVSZnew · submitted 2019-07-15 · 💻 cs.LG · stat.ML

Subspace Determination through Local Intrinsic Dimensional Decomposition: Theory and Experimentation

Ruben Becker , Imane Hafnaoui , Michael E. Houle , Pan Li , Arthur Zimek This is my paper

Pith reviewed 2026-05-24 21:10 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords subspace clusteringlocal intrinsic dimensionaxis-aligned subspacesfeature discriminabilityhigh-dimensional dataLID estimationlocal neighborhoodscluster formation

0 comments

The pith

Decomposing local intrinsic dimension along feature axes identifies subspaces that support cluster formation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method to estimate local intrinsic dimension along individual axis projections for each data point. This decomposition highlights the feature directions that contribute most to the local complexity and discriminability around points. By focusing on these directions, it aims to find axis-aligned subspaces without exhaustively searching all feature combinations. If successful, this could make subspace clustering more efficient in high-dimensional spaces where many features may be irrelevant locally.

Core claim

An estimator of LID along axis projections is developed, and preliminary evidence is provided that this LID decomposition can indicate axis-aligned data subspaces that support the formation of clusters, by identifying axes with the greatest local discriminability or fewest LID components capturing local complexity.

What carries the argument

The estimator of local intrinsic dimension along axis projections, which identifies directions of greatest local discriminability.

If this is right

For each data point the axes with highest LID components indicate the most discriminative features locally.
The decomposition reduces the search space for subspace clustering by avoiding evaluation of every possible feature combination.
Identified subspaces are those where clusters are likely to form due to high local discriminability.
The fewest LID components needed to capture local complexity mark the key contributing axes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The per-point axis selection could be integrated into density-based clustering pipelines to focus computation on relevant dimensions.
If the axis extension holds, the same decomposition might guide local feature weighting in outlier detection tasks.
Experiments on data with planted axis-aligned structures would provide a direct test of whether the recovered subspaces match ground-truth clusters.

Load-bearing premise

That the Local Intrinsic Dimension model extends to axis projections in a way that preserves identification of features with greatest local discriminability.

What would settle it

A controlled experiment on synthetic data with known axis-aligned cluster subspaces where the LID decomposition fails to recover those subspaces would falsify the central claim.

Figures

Figures reproduced from arXiv: 1907.06771 by Arthur Zimek, Imane Hafnaoui, Michael E. Houle, Pan Li, Ruben Becker.

**Figure 2.** Figure 2: Plot for the results for estimators hill distances, sum hill projections, and sum w hill projections in a neighborhood of size k = 100, for dimensions m between 2 and 1024. Note that the neighborhoods for hill distances and sum w hill projections are chosen with respect to the Euclidean norm, while the neighborhoods for sum hill projections are chosen with respect to the L∞ norm. The errorbars denote 95% c… view at source ↗

**Figure 3.** Figure 3: ID estimates as they were computed by sum hill projections in an L∞ neighborhood of k = 100 points, for different combinations of uniform distributions. We compare this value IDd∗ F with the sum Pm i=1 IDd∗ i , where we consider two ways of obtaining the estimates IDd∗ i . In the first case (sum hill projections), we pick k nearest neighbors with respect to the L∞ norm. In the second case (sum w hill proj… view at source ↗

**Figure 4.** Figure 4: DiSH 3D Dataset. (a) Description. d ||S|| Noisy Ai T1 30 {5, 5, 5, 5, 5} 5 T2 50 {3, 5, 7, 7, 11} 17 T3 100 {3, 5, 7, 7, 11} 67 (b) Results. NMI AMI ARI Recall T1 DiSH 0.535 0.362 0.264 0.582 CLIQUE 0.431 0.275 0.303 0.635 LID-DBSCAN 0.801 0.734 0.803 0.726 T2 DiSH 0.568 0.396 0.532 0.7 CLIQUE 0.644 0.473 0.568 0.78 LID-DBSCAN 0.779 0.695 0.716 0.765 T3 DiSH 0.570 0.397 0.412 0.702 CLIQUE 0.644 0.473 0.568… view at source ↗

**Figure 5.** Figure 5: Clustering results on the DiSH 3D dataset shown in Figure 4. [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗

**Figure 6.** Figure 6: With respect to both RNIA and ARR, LID decomposition significantly outperforms [PITH_FULL_IMAGE:figures/full_fig_p014_6.png] view at source ↗

read the original abstract

Axis-aligned subspace clustering generally entails searching through enormous numbers of subspaces (feature combinations) and evaluation of cluster quality within each subspace. In this paper, we tackle the problem of identifying subsets of features with the most significant contribution to the formation of the local neighborhood surrounding a given data point. For each point, the recently-proposed Local Intrinsic Dimension (LID) model is used in identifying the axis directions along which features have the greatest local discriminability, or equivalently, the fewest number of components of LID that capture the local complexity of the data. In this paper, we develop an estimator of LID along axis projections, and provide preliminary evidence that this LID decomposition can indicate axis-aligned data subspaces that support the formation of clusters.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The new axis-projection LID estimator is a targeted incremental step for local subspace ranking, but the abstract-level evidence leaves the practical payoff unclear.

read the letter

The paper develops an estimator for local intrinsic dimension along single-axis projections and uses it to rank features by their contribution to local neighborhood complexity. This is meant to flag axis-aligned subspaces that matter for clustering without exhaustive search over combinations. The core move is adapting the existing LID model to projections, then interpreting the resulting decomposition as a signal for which directions carry the most local discriminability. That is the actual new piece, and it sits cleanly on top of prior LID work without obvious circularity or invented parameters. If the derivation is clean and the experiments hold, it gives a direct way to prune features per point in high-dimensional settings. The approach is straightforward and addresses a real pain point in axis-aligned subspace clustering. The soft spot is the thin evidence base. The abstract calls the results preliminary and gives no derivation steps, error analysis, or bias checks, so it is hard to judge whether the projection extension actually preserves the needed local information or introduces artifacts. The central assumption—that LID on axes reliably identifies the most discriminative features—remains untested in the provided text. This paper is aimed at people already working on local dimensionality estimation or subspace clustering methods. A reader in that narrow area could extract the estimator and try it, but broader audiences will not find much to take away yet. The work shows honest engagement with the problem and the literature, so it deserves a serious referee to check the math and the experiments rather than a desk reject.

Referee Report

2 major / 1 minor

Summary. The paper claims to develop an estimator of Local Intrinsic Dimension (LID) along axis projections for each data point, using it to identify axis directions with greatest local discriminability (equivalently, the fewest LID components capturing local complexity). This is positioned as a way to determine axis-aligned subspaces supporting cluster formation without exhaustive search over feature combinations, with preliminary experimental evidence provided.

Significance. If the estimator and its extension to projections are sound, the approach could offer a scalable, non-combinatorial alternative for subspace clustering in high dimensions by directly linking LID decomposition to feature relevance for local neighborhoods.

major comments (2)

[Abstract] Abstract: the central claim rests on developing and validating an estimator of LID along axis projections, yet the provided text gives no derivation, no explicit extension of the prior LID model to projections, and no error analysis or bias discussion; this absence makes the soundness of the extension (that it preserves identification of features with greatest local discriminability) impossible to assess from the manuscript.
[Abstract] Abstract: preliminary evidence is asserted for the claim that LID decomposition indicates axis-aligned subspaces supporting clusters, but no details are supplied on the experimental setup, datasets, baselines, quantitative metrics, or controls for confounding factors such as dimensionality or noise; without these the evidence cannot be evaluated as supporting the claim.

minor comments (1)

[Abstract] Abstract: the parenthetical equivalence between 'greatest local discriminability' and 'fewest number of components of LID' is stated without supporting justification or reference to prior LID work; a brief clarification would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful comments on our manuscript. We address each of the major comments below and will revise the abstract to better convey the key elements of our contribution.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim rests on developing and validating an estimator of LID along axis projections, yet the provided text gives no derivation, no explicit extension of the prior LID model to projections, and no error analysis or bias discussion; this absence makes the soundness of the extension (that it preserves identification of features with greatest local discriminability) impossible to assess from the manuscript.

Authors: While the abstract is concise by nature, the full manuscript details the derivation of the axis-projection estimator for LID in the methods section, explicitly extends the prior LID model to handle axis projections, and includes discussion of error and bias in the theoretical analysis. To address the concern that the abstract alone does not allow assessment, we will revise the abstract to include a brief overview of the estimator's development and its properties. revision: yes
Referee: [Abstract] Abstract: preliminary evidence is asserted for the claim that LID decomposition indicates axis-aligned subspaces supporting clusters, but no details are supplied on the experimental setup, datasets, baselines, quantitative metrics, or controls for confounding factors such as dimensionality or noise; without these the evidence cannot be evaluated as supporting the claim.

Authors: The experimental details, including the setup with synthetic and real datasets, comparison to baselines, metrics used for evaluation, and controls for dimensionality and noise, are presented in the experiments section of the paper. The abstract characterizes the results as 'preliminary evidence' to reflect their initial nature. We will update the abstract to provide a short summary of the experimental validation to make this clearer. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation builds on independent prior LID model

full rationale

The paper develops an estimator of LID along axis projections by extending the recently-proposed LID model to identify axis directions of greatest local discriminability. No equations or derivations are shown that reduce the target result to a fitted parameter or self-defined quantity by construction. The central claim relies on the prior LID framework as an external input rather than re-deriving it from the subspace result itself. No self-citation chain, ansatz smuggling, or renaming of known results is exhibited in the provided text that would force the outcome. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review conducted from abstract only; no explicit free parameters, axioms, or invented entities are detailed. The LID model itself is treated as a background assumption from prior work.

axioms (1)

domain assumption The Local Intrinsic Dimension (LID) model accurately captures local data complexity and can be meaningfully decomposed along axes.
Invoked as the foundation for the new estimator and subspace indication claim.

pith-pipeline@v0.9.0 · 5660 in / 1047 out tokens · 17507 ms · 2026-05-24T21:10:45.037893+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages

[1]

Achtert, E., B¨ ohm, C., David, J., Kr¨ oger, P., Zimek, A.: Global correlation clustering based on the Hough transform. Stat. Anal. Data Min. 1(3), 111–127 (2008)

work page 2008
[2]

In: Proc

Achtert, E., B¨ ohm, C., Kriegel, H.P., Kr¨ oger, P., M¨ uller-Gorman, I., Zimek, A.: Finding hierarchies of subspace clusters. In: Proc. PKDD. pp. 446–453 (2006)

work page 2006
[3]

In: Proc

Achtert, E., B¨ ohm, C., Kriegel, H.P., Kr¨ oger, P., M¨ uller-Gorman, I., Zimek, A.: Detection and visualization of subspace cluster hierarchies. In: Proc. DASFAA. pp. 152–163 (2007)

work page 2007
[4]

In: Proc

Achtert, E., B¨ ohm, C., Kriegel, H.P., Kr¨ oger, P., Zimek, A.: Robust, complete, and eﬃcient correlation clustering. In: Proc. SDM. pp. 413–418 (2007)

work page 2007
[5]

In: Proc

Aggarwal, C.C., Procopiuc, C.M., Wolf, J.L., Yu, P.S., Park, J.S.: Fast algorithms for projected clustering. In: Proc. SIGMOD. pp. 61–72 (1999)

work page 1999
[6]

In: Proc

Agrawal, R., Gehrke, J., Gunopulos, D., Raghavan, P.: Automatic subspace clustering of high dimensional data for data mining applications. In: Proc. SIGMOD. pp. 94–105 (1998)

work page 1998
[7]

In: Proc

Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. VLDB. pp. 487–499 (1994)

work page 1994
[8]

In: Proc

Amsaleg, L., Chelly, O., Furon, T., Girard, S., Houle, M.E., Kawarabayashi, K., Nett, M.: Estimating local intrinsic dimensionality. In: Proc. KDD. pp. 29–38 (2015)

work page 2015
[9]

In: Proc

Assent, I., Krieger, R., M¨ uller, E., Seidl, T.: DUSC: dimensionality unbiased subspace clustering. In: Proc. ICDM. pp. 409–414 (2007)

work page 2007
[10]

In: Proc

Assent, I., Krieger, R., M¨ uller, E., Seidl, T.: EDSC: eﬃcient density-based subspace clus- tering. In: Proc. CIKM. pp. 1093–1102 (2008)

work page 2008
[11]

In: Proc

Becker, R., Hafnaoui, I., Houle, M.E., Li, P., Zimek, A.: Subspace determination through local intrinsic dimensional decomposition. In: Proc. SISAP (2019)

work page 2019
[12]

In: Proc

Bernecker, T., Emrich, T., Graf, F., Kriegel, H.P., Kr¨ oger, P., Renz, M., Schubert, E., Zimek, A.: Subspace similarity search: Eﬃcient k-NN queries in arbitrary subspaces. In: Proc. SSDBM. pp. 555–564 (2010)

work page 2010
[13]

nearest neighbor

Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” mean- ingful? In: Proc. ICDT. pp. 217–235 (1999)

work page 1999
[14]

In: Proc

B¨ ohm, C., Kailing, K., Kriegel, H.P., Kr¨ oger, P.: Density connected clustering with local subspace preferences. In: Proc. ICDM. pp. 27–34 (2004)

work page 2004
[15]

In: Proc

B¨ ohm, C., Kailing, K., Kr¨ oger, P., Zimek, A.: Computing clusters of correlation connected objects. In: Proc. SIGMOD. pp. 455–466 (2004) 15

work page 2004
[16]

PVLDB 10(7), 769–780 (2017)

Casanova, G., Englmeier, E., Houle, M., Kr¨ oger, P., Nett, M., Schubert, E., Zimek, A.: Dimensional testing for reverse k-nearest neighbor search. PVLDB 10(7), 769–780 (2017)

work page 2017
[17]

In: Proc

Cheng, C.H., Fu, A.W.C., Zhang, Y.: Entropy-based subspace clustering for mining nu- merical data. In: Proc. KDD. pp. 84–93 (1999)

work page 1999
[18]

In: Proc

Domeniconi, C., Papadopoulos, D., Gunopulos, D., Ma, S.: Subspace clustering of high dimensional data. In: Proc. SDM (2004)

work page 2004
[19]

In: Proc

Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. KDD. pp. 226–231 (1996)

work page 1996
[20]

IEEE TKDE 19(7), 873–886 (2007)

Fran¸ cois, D., Wertz, V., Verleysen, M.: The concentration of fractional distances. IEEE TKDE 19(7), 873–886 (2007)

work page 2007
[21]

Friedman, J.H., Meulman, J.J.: Clustering objects on subsets of attributes. J. R. Statist. Soc. B 66(4), 825–849 (2004)

work page 2004
[22]

In: Proc

Houle, M.E.: Dimensionality, discriminability, density and distance distributions. In: Proc. ICDM Workshops. pp. 468–473 (2013)

work page 2013
[23]

In: Proc

Houle, M.E.: Local intrinsic dimensionality I: an extreme-value-theoretic foundation for similarity applications. In: Proc. SISAP. pp. 64–79 (2017)

work page 2017
[24]

In: Proc

Houle, M.E.: Local intrinsic dimensionality II: multivariate analysis and distributional support. In: Proc. SISAP. pp. 80–95 (2017)

work page 2017
[25]

Houle, M.E., Kriegel, H.P., Kr¨ oger, P., Schubert, E., Zimek, A.: Can shared-neighbor distances defeat the curse of dimensionality? In: Proc. SSDBM. pp. 482–500 (2010)

work page 2010
[26]

In: Proc

Houle, M.E., Ma, X., Oria, V., Sun, J.: Eﬃcient algorithms for similarity search in axis- aligned subspaces. In: Proc. SISAP. pp. 1–12 (2014)

work page 2014
[27]

Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2(1), 193–218 (1985)

work page 1985
[28]

The Teaching of Mathematics 8(1), 15–29 (2005)

Kadelburg, Z., Marjanovi´ c, M.: Interchanging two limits. The Teaching of Mathematics 8(1), 15–29 (2005)

work page 2005
[29]

In: Proc

Kailing, K., Kriegel, H.P., Kr¨ oger, P.: Density-connected subspace clustering for high- dimensional data. In: Proc. SDM. pp. 246–257 (2004)

work page 2004
[30]

In: Proc

Kriegel, H.P., Kr¨ oger, P., Renz, M., Wurst, S.: A generic framework for eﬃcient subspace clustering of high-dimensional data. In: Proc. ICDM. pp. 250–257 (2005)

work page 2005
[31]

ACM TKDD3(1), 1–58 (2009)

Kriegel, H.P., Kr¨ oger, P., Zimek, A.: Clustering high dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM TKDD3(1), 1–58 (2009)

work page 2009
[32]

WIREs DMKD 2(4), 351–364 (2012)

Kriegel, H.P., Kr¨ oger, P., Zimek, A.: Subspace clustering. WIREs DMKD 2(4), 351–364 (2012)

work page 2012
[33]

In: Proc

Ma, X., Li, B., Wang, Y., Erfani, S.M., Wijewickrema, S.N.R., Schoenebeck, G., Song, D., Houle, M.E., Bailey, J.: Characterizing adversarial subspaces using local intrinsic dimen- sionality. In: Proc. ICLR. pp. 1–15 (2018)

work page 2018
[34]

In: Proc

Ma, X., Wang, Y., Houle, M.E., Zhou, S., Erfani, S.M., Xia, S., Wijewickrema, S.N.R., Bailey, J.: Dimensionality-driven learning with noisy labels. In: Proc. ICML. pp. 3361–3370 (2018) 16

work page 2018
[35]

KAIS 14(3), 273–298 (2008)

Moise, G., Sander, J., Ester, M.: Robust projected clustering. KAIS 14(3), 273–298 (2008)

work page 2008
[36]

Muller, M.E.: A note on a method for generating points uniformly on n-dimensional spheres. Commun. ACM 2(4), 19–20 (Apr 1959)

work page 1959
[37]

In: Proc

Nagesh, H.S., Goil, S., Choudhary, A.: Adaptive grids for clustering massive data sets. In: Proc. SDM. pp. 1–17 (2001)

work page 2001
[38]

IEEE Transactions on Knowl- edge and Data Engineering 18(7), 902–916 (2006)

Patrikainen, A., Meila, M.: Comparing subspace clusterings. IEEE Transactions on Knowl- edge and Data Engineering 18(7), 902–916 (2006)

work page 2006
[39]

In: ICPR16

Romano, S., Chelly, O., Nguyen, V., Bailey, J., Houle, M.E.: Measuring dependency via intrinsic dimensionality. In: ICPR16. pp. 1207–1212 (Dec 2016)

work page 2016
[40]

Science 290, 2323–2326 (2000)

Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326 (2000)

work page 2000
[41]

Machine learning 89(1-2), 37–65 (2012)

Rozza, A., Lombardi, G., Ceruti, C., Casiraghi, E., Campadelli, P.: Novel high intrinsic dimensionality estimators. Machine learning 89(1-2), 37–65 (2012)

work page 2012
[42]

Data Min

Sim, K., Gopalkrishnan, V., Zimek, A., Cong, G.: A survey on enhanced subspace cluster- ing. Data Min. Knowl. Disc. 26(2), 332–397 (2013)

work page 2013
[43]

Strehl, A., Ghosh, J.: Cluster ensembles – a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2002)

work page 2002
[44]

Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clustering comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, 2837–2854 (2010)

work page 2010
[45]

Woo, K.G., Lee, J.H., Kim, M.H., Lee, Y.J.: FINDIT: a fast and intelligent subspace clustering algorithm using dimension voting. Inform. Software Technol. 46(4), 255–271 (2004)

work page 2004
[46]

In: Proc

Yip, K.Y., Cheung, D.W., Ng, M.K.: On discovery of extremely low-dimensional clusters using semi-supervised projected clustering. In: Proc. ICDE. pp. 329–340 (2005)

work page 2005
[47]

IEEE TKDE 17(2), 176–189 (2005)

Yiu, M.L., Mamoulis, N.: Iterative projected clustering by subspace mining. IEEE TKDE 17(2), 176–189 (2005)

work page 2005
[48]

In: Aggarwal, C.C., Han, J

Zimek, A., Assent, I., Vreeken, J.: Frequent pattern mining algorithms for data clustering. In: Aggarwal, C.C., Han, J. (eds.) Frequent Pattern Mining, chap. 16, pp. 403–423. Springer (2014)

work page 2014
[49]

Zimek, A., Schubert, E., Kriegel, H.P.: A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Min. 5(5), 363–387 (2012) 17

work page 2012

[1] [1]

Achtert, E., B¨ ohm, C., David, J., Kr¨ oger, P., Zimek, A.: Global correlation clustering based on the Hough transform. Stat. Anal. Data Min. 1(3), 111–127 (2008)

work page 2008

[2] [2]

In: Proc

Achtert, E., B¨ ohm, C., Kriegel, H.P., Kr¨ oger, P., M¨ uller-Gorman, I., Zimek, A.: Finding hierarchies of subspace clusters. In: Proc. PKDD. pp. 446–453 (2006)

work page 2006

[3] [3]

In: Proc

Achtert, E., B¨ ohm, C., Kriegel, H.P., Kr¨ oger, P., M¨ uller-Gorman, I., Zimek, A.: Detection and visualization of subspace cluster hierarchies. In: Proc. DASFAA. pp. 152–163 (2007)

work page 2007

[4] [4]

In: Proc

Achtert, E., B¨ ohm, C., Kriegel, H.P., Kr¨ oger, P., Zimek, A.: Robust, complete, and eﬃcient correlation clustering. In: Proc. SDM. pp. 413–418 (2007)

work page 2007

[5] [5]

In: Proc

Aggarwal, C.C., Procopiuc, C.M., Wolf, J.L., Yu, P.S., Park, J.S.: Fast algorithms for projected clustering. In: Proc. SIGMOD. pp. 61–72 (1999)

work page 1999

[6] [6]

In: Proc

Agrawal, R., Gehrke, J., Gunopulos, D., Raghavan, P.: Automatic subspace clustering of high dimensional data for data mining applications. In: Proc. SIGMOD. pp. 94–105 (1998)

work page 1998

[7] [7]

In: Proc

Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. VLDB. pp. 487–499 (1994)

work page 1994

[8] [8]

In: Proc

Amsaleg, L., Chelly, O., Furon, T., Girard, S., Houle, M.E., Kawarabayashi, K., Nett, M.: Estimating local intrinsic dimensionality. In: Proc. KDD. pp. 29–38 (2015)

work page 2015

[9] [9]

In: Proc

Assent, I., Krieger, R., M¨ uller, E., Seidl, T.: DUSC: dimensionality unbiased subspace clustering. In: Proc. ICDM. pp. 409–414 (2007)

work page 2007

[10] [10]

In: Proc

Assent, I., Krieger, R., M¨ uller, E., Seidl, T.: EDSC: eﬃcient density-based subspace clus- tering. In: Proc. CIKM. pp. 1093–1102 (2008)

work page 2008

[11] [11]

In: Proc

Becker, R., Hafnaoui, I., Houle, M.E., Li, P., Zimek, A.: Subspace determination through local intrinsic dimensional decomposition. In: Proc. SISAP (2019)

work page 2019

[12] [12]

In: Proc

Bernecker, T., Emrich, T., Graf, F., Kriegel, H.P., Kr¨ oger, P., Renz, M., Schubert, E., Zimek, A.: Subspace similarity search: Eﬃcient k-NN queries in arbitrary subspaces. In: Proc. SSDBM. pp. 555–564 (2010)

work page 2010

[13] [13]

nearest neighbor

Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” mean- ingful? In: Proc. ICDT. pp. 217–235 (1999)

work page 1999

[14] [14]

In: Proc

B¨ ohm, C., Kailing, K., Kriegel, H.P., Kr¨ oger, P.: Density connected clustering with local subspace preferences. In: Proc. ICDM. pp. 27–34 (2004)

work page 2004

[15] [15]

In: Proc

B¨ ohm, C., Kailing, K., Kr¨ oger, P., Zimek, A.: Computing clusters of correlation connected objects. In: Proc. SIGMOD. pp. 455–466 (2004) 15

work page 2004

[16] [16]

PVLDB 10(7), 769–780 (2017)

Casanova, G., Englmeier, E., Houle, M., Kr¨ oger, P., Nett, M., Schubert, E., Zimek, A.: Dimensional testing for reverse k-nearest neighbor search. PVLDB 10(7), 769–780 (2017)

work page 2017

[17] [17]

In: Proc

Cheng, C.H., Fu, A.W.C., Zhang, Y.: Entropy-based subspace clustering for mining nu- merical data. In: Proc. KDD. pp. 84–93 (1999)

work page 1999

[18] [18]

In: Proc

Domeniconi, C., Papadopoulos, D., Gunopulos, D., Ma, S.: Subspace clustering of high dimensional data. In: Proc. SDM (2004)

work page 2004

[19] [19]

In: Proc

Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. KDD. pp. 226–231 (1996)

work page 1996

[20] [20]

IEEE TKDE 19(7), 873–886 (2007)

Fran¸ cois, D., Wertz, V., Verleysen, M.: The concentration of fractional distances. IEEE TKDE 19(7), 873–886 (2007)

work page 2007

[21] [21]

Friedman, J.H., Meulman, J.J.: Clustering objects on subsets of attributes. J. R. Statist. Soc. B 66(4), 825–849 (2004)

work page 2004

[22] [22]

In: Proc

Houle, M.E.: Dimensionality, discriminability, density and distance distributions. In: Proc. ICDM Workshops. pp. 468–473 (2013)

work page 2013

[23] [23]

In: Proc

Houle, M.E.: Local intrinsic dimensionality I: an extreme-value-theoretic foundation for similarity applications. In: Proc. SISAP. pp. 64–79 (2017)

work page 2017

[24] [24]

In: Proc

Houle, M.E.: Local intrinsic dimensionality II: multivariate analysis and distributional support. In: Proc. SISAP. pp. 80–95 (2017)

work page 2017

[25] [25]

Houle, M.E., Kriegel, H.P., Kr¨ oger, P., Schubert, E., Zimek, A.: Can shared-neighbor distances defeat the curse of dimensionality? In: Proc. SSDBM. pp. 482–500 (2010)

work page 2010

[26] [26]

In: Proc

Houle, M.E., Ma, X., Oria, V., Sun, J.: Eﬃcient algorithms for similarity search in axis- aligned subspaces. In: Proc. SISAP. pp. 1–12 (2014)

work page 2014

[27] [27]

Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2(1), 193–218 (1985)

work page 1985

[28] [28]

The Teaching of Mathematics 8(1), 15–29 (2005)

Kadelburg, Z., Marjanovi´ c, M.: Interchanging two limits. The Teaching of Mathematics 8(1), 15–29 (2005)

work page 2005

[29] [29]

In: Proc

Kailing, K., Kriegel, H.P., Kr¨ oger, P.: Density-connected subspace clustering for high- dimensional data. In: Proc. SDM. pp. 246–257 (2004)

work page 2004

[30] [30]

In: Proc

Kriegel, H.P., Kr¨ oger, P., Renz, M., Wurst, S.: A generic framework for eﬃcient subspace clustering of high-dimensional data. In: Proc. ICDM. pp. 250–257 (2005)

work page 2005

[31] [31]

ACM TKDD3(1), 1–58 (2009)

Kriegel, H.P., Kr¨ oger, P., Zimek, A.: Clustering high dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM TKDD3(1), 1–58 (2009)

work page 2009

[32] [32]

WIREs DMKD 2(4), 351–364 (2012)

Kriegel, H.P., Kr¨ oger, P., Zimek, A.: Subspace clustering. WIREs DMKD 2(4), 351–364 (2012)

work page 2012

[33] [33]

In: Proc

Ma, X., Li, B., Wang, Y., Erfani, S.M., Wijewickrema, S.N.R., Schoenebeck, G., Song, D., Houle, M.E., Bailey, J.: Characterizing adversarial subspaces using local intrinsic dimen- sionality. In: Proc. ICLR. pp. 1–15 (2018)

work page 2018

[34] [34]

In: Proc

Ma, X., Wang, Y., Houle, M.E., Zhou, S., Erfani, S.M., Xia, S., Wijewickrema, S.N.R., Bailey, J.: Dimensionality-driven learning with noisy labels. In: Proc. ICML. pp. 3361–3370 (2018) 16

work page 2018

[35] [35]

KAIS 14(3), 273–298 (2008)

Moise, G., Sander, J., Ester, M.: Robust projected clustering. KAIS 14(3), 273–298 (2008)

work page 2008

[36] [36]

Muller, M.E.: A note on a method for generating points uniformly on n-dimensional spheres. Commun. ACM 2(4), 19–20 (Apr 1959)

work page 1959

[37] [37]

In: Proc

Nagesh, H.S., Goil, S., Choudhary, A.: Adaptive grids for clustering massive data sets. In: Proc. SDM. pp. 1–17 (2001)

work page 2001

[38] [38]

IEEE Transactions on Knowl- edge and Data Engineering 18(7), 902–916 (2006)

Patrikainen, A., Meila, M.: Comparing subspace clusterings. IEEE Transactions on Knowl- edge and Data Engineering 18(7), 902–916 (2006)

work page 2006

[39] [39]

In: ICPR16

Romano, S., Chelly, O., Nguyen, V., Bailey, J., Houle, M.E.: Measuring dependency via intrinsic dimensionality. In: ICPR16. pp. 1207–1212 (Dec 2016)

work page 2016

[40] [40]

Science 290, 2323–2326 (2000)

Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326 (2000)

work page 2000

[41] [41]

Machine learning 89(1-2), 37–65 (2012)

Rozza, A., Lombardi, G., Ceruti, C., Casiraghi, E., Campadelli, P.: Novel high intrinsic dimensionality estimators. Machine learning 89(1-2), 37–65 (2012)

work page 2012

[42] [42]

Data Min

Sim, K., Gopalkrishnan, V., Zimek, A., Cong, G.: A survey on enhanced subspace cluster- ing. Data Min. Knowl. Disc. 26(2), 332–397 (2013)

work page 2013

[43] [43]

Strehl, A., Ghosh, J.: Cluster ensembles – a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2002)

work page 2002

[44] [44]

Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clustering comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, 2837–2854 (2010)

work page 2010

[45] [45]

Woo, K.G., Lee, J.H., Kim, M.H., Lee, Y.J.: FINDIT: a fast and intelligent subspace clustering algorithm using dimension voting. Inform. Software Technol. 46(4), 255–271 (2004)

work page 2004

[46] [46]

In: Proc

Yip, K.Y., Cheung, D.W., Ng, M.K.: On discovery of extremely low-dimensional clusters using semi-supervised projected clustering. In: Proc. ICDE. pp. 329–340 (2005)

work page 2005

[47] [47]

IEEE TKDE 17(2), 176–189 (2005)

Yiu, M.L., Mamoulis, N.: Iterative projected clustering by subspace mining. IEEE TKDE 17(2), 176–189 (2005)

work page 2005

[48] [48]

In: Aggarwal, C.C., Han, J

Zimek, A., Assent, I., Vreeken, J.: Frequent pattern mining algorithms for data clustering. In: Aggarwal, C.C., Han, J. (eds.) Frequent Pattern Mining, chap. 16, pp. 403–423. Springer (2014)

work page 2014

[49] [49]

Zimek, A., Schubert, E., Kriegel, H.P.: A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Min. 5(5), 363–387 (2012) 17

work page 2012