pith. machine review for the scientific record.

arxiv: 2605.00637 · v1 · submitted 2026-05-01 · 💻 cs.LG


Class Angular Distortion Index for Dimensionality Reduction


Pith reviewed 2026-05-09 20:25 UTC · model grok-4.3

classification 💻 cs.LG
keywords dimensionality reduction · cluster quality · angular distortion · CADI · projection evaluation · differentiable metric · t-SNE · UMAP

The pith

The Class Angular Distortion Index measures how faithfully dimensionality reduction projections preserve cluster organization by computing internal angles from point triples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Dimensionality reduction techniques such as t-SNE and UMAP frequently produce projections in which clusters appear distinct yet sit in arbitrary or misleading relative positions. The paper introduces the Class Angular Distortion Index to evaluate the faithfulness of those arrangements through the internal angles formed by triples of points. This approach avoids assumptions about cluster shape or density and supplies usable signals in both real and synthetic cases where standard cluster metrics return uninformative results. Because the index rests on angle calculations, it is differentiable and therefore usable as an objective for training new projection methods, which the authors demonstrate with one such technique.
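The abstract stops short of a formula. As a reading aid, here is a minimal angle-distortion score in the spirit of CADI; the random triple sampling, the absolute-difference averaging, and the dimension-dropping "projection" are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def triple_angle(a, b, c):
    """Internal angle at vertex b of the triple (a, b, c), in radians."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def angular_distortion(X_high, X_low, triples):
    """Mean absolute angle change between the two spaces over the given
    point triples (a CADI-like score; 0 means angles are fully preserved)."""
    return float(np.mean([abs(triple_angle(*X_high[list(t)])
                              - triple_angle(*X_low[list(t)]))
                          for t in triples]))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))      # original 5-D data
P = X[:, :2]                      # a crude "projection": keep two coordinates
triples = [tuple(rng.choice(30, size=3, replace=False)) for _ in range(200)]
print(angular_distortion(X, P, triples))   # substantially above zero
```

Angle-preserving maps (rotations, uniform scalings) score exactly zero under this sketch, which is the property that makes an angular index insensitive to the arbitrary global orientation of t-SNE or UMAP outputs.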

Core claim

The authors claim that the relative organization of clusters in a projection can be assessed by tracking how internal angles among point triples change from the original space. They formalize this idea as the Class Angular Distortion Index and show that it returns interpretable values on data sets where existing separability or shape-based metrics do not. The same angle foundation makes the index differentiable, allowing it to guide optimization of dimensionality reduction directly toward better cluster arrangements.

What carries the argument

The Class Angular Distortion Index (CADI), which quantifies distortion of internal angles formed by point triples to judge cluster arrangement faithfulness in a projection.

If this is right

  • Projections can be optimized directly against CADI to reduce misleading cluster placements.
  • The index supplies concrete, interpretable feedback on which inter-cluster angles are most distorted.
  • Evaluation of existing methods such as t-SNE and UMAP becomes possible even when clusters are non-spherical or unevenly dense.
  • New DR algorithms that minimize CADI become feasible; the paper already presents one as a working example.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • CADI could be added to routine DR evaluation toolkits alongside neighborhood-preservation scores.
  • Its differentiability opens the possibility of hybrid objectives that balance local structure with global cluster geometry.
  • Different strategies for choosing which point triples to measure may affect sensitivity to particular types of distortion.
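The last point is easy to make concrete: under an anisotropic projection, long-range random triples and nearest-neighbour triples probe different vector directions and so can report different mean angle changes. A toy sketch (the elongated data, the axis-squashing "projection", and both sampling rules are illustrative assumptions, not anything from the paper):

```python
import numpy as np

def angle(a, b, c):
    """Internal angle at vertex b, in radians."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def mean_angle_change(X, P, triples):
    return float(np.mean([abs(angle(*X[list(t)]) - angle(*P[list(t)]))
                          for t in triples]))

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2)) * np.array([5.0, 1.0])   # elongated cluster
P = X * np.array([1.0, 0.2])                           # squash the short axis further

# strategy 1: uniformly random triples
random_triples = [rng.choice(100, size=3, replace=False) for _ in range(500)]

# strategy 2: "local" triples, each point flanked by its two nearest neighbours
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
np.fill_diagonal(D, np.inf)
local_triples = [(int(np.argsort(D[i])[0]), i, int(np.argsort(D[i])[1]))
                 for i in range(100)]

print(mean_angle_change(X, P, random_triples))
print(mean_angle_change(X, P, local_triples))
```

The two numbers generally differ, so any reported index value implicitly depends on the sampling rule unless, as the rebuttal below proposes, the triple set is fixed by definition.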

Load-bearing premise

Angles computed from point triples in the projected space reliably reflect the true relative organization of clusters in the original data.

What would settle it

A controlled synthetic data set whose ground-truth clusters are rotated or reordered in the projection; CADI should report high distortion while visual inspection or label comparison confirms the change.
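A toy version of that experiment, with cluster centroids standing in for the clusters (the geometry, the swap, and the score below are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def angle(a, b, c):
    """Internal angle at vertex b, in radians."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

centers = np.array([[0.0, 0.0], [4.0, 0.0], [1.0, 3.0]])  # ground truth (scalene)
rng = np.random.default_rng(1)
faithful = centers + rng.normal(scale=0.05, size=centers.shape)  # arrangement kept
swapped = centers[[1, 0, 2]]                                     # two clusters trade places

def distortion(proj):
    """Mean |angle change| at each centroid versus the ground truth."""
    idx = [(1, 0, 2), (0, 1, 2), (0, 2, 1)]   # angle at vertex 0, 1, 2
    return float(np.mean([abs(angle(centers[a], centers[b], centers[c])
                              - angle(proj[a], proj[b], proj[c]))
                          for a, b, c in idx]))

print(distortion(faithful))   # near zero: same arrangement
print(distortion(swapped))    # clearly larger: arrangement changed
```

The faithful layout scores near zero while the swapped one does not, although a pure separability metric would rate both identically, since the point set is unchanged.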

Figures

Figures reproduced from arXiv: 2605.00637 by Jacob Miller, Kaviru Gunaratne, Stephen Kobourov.

Figure 1. An example of a synthetic dataset where the globularity assumption for clusters fails to hold. (a) shows the input 3D dataset, which …
Figure 2. For different numbers of triplets sampled, we calculate CADI for the t-SNE projection of the MNIST dataset on 10,000 different …
Figure 3. Projections of the rings (top) and concentric3 (bottom) datasets created by different DR algorithms.
Figure 4. Projections of the liver (top), usps (middle), and trec (bottom) datasets created by different DR algorithms.
Figure 5. CADI scores for UMAP projections of the concentric3 …
Figure 6. Times taken to compute each metric on the t-SNE projection …
Figure 7. For different numbers of triplets sampled, we calculate …
Figure 8. The epsilon graphs of the liver dataset are plotted using …
Figure 9. The per-pixel standard deviation across images within …
Figure 11. AngleEmbedding projection of the TREC dataset.
Figure 12. AngleEmbedding projection of the Emotion dataset. (a) …
Figure 13. CADI scores for UMAP projections of the concentric3 dataset with different values for the hyperparameters …
Original abstract

Dimensionality reduction (DR) techniques are often characterized by whether they preserve global, high-level structures in the data or local, neighborhood structures. This distinction matters in visualization: global methods can obscure clusters while local methods can over-emphasize them. Yet, even when clusters appear distinct, their relative arrangement in the projection may be arbitrary or misleading, a common issue in techniques such as t-SNE and UMAP. Existing cluster quality metrics either only measure cluster separability or assume spherical, globular clusters in the original space. We introduce the Class Angular Distortion Index (CADI), a metric that uses internal angles among point triples to determine the faithfulness of cluster organization in a projection. We show cases on both real and synthetic data where existing cluster metrics fail, but CADI provides an interpretable result. Since it relies on computing angles, CADI is also differentiable, enabling optimization. We demonstrate this with a CADI-based DR technique.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it: the pith above is the substance; this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Class Angular Distortion Index (CADI), a metric that computes internal angles among point triples in a low-dimensional projection to assess the faithfulness of cluster organization. The authors claim CADI succeeds on real and synthetic data where existing cluster metrics fail (due to assumptions of spherical clusters or focus only on separability), requires no assumptions on cluster shape, density, or triple selection, is differentiable (enabling optimization), and is demonstrated via a CADI-optimized DR technique.

Significance. If the central claims hold, CADI would provide a useful, interpretable addition to DR evaluation tools by addressing limitations of metrics like silhouette score or Davies-Bouldin index for non-globular clusters. The differentiability property is a concrete strength that could support new optimization-based DR methods, and the empirical cases where standard metrics fail illustrate potential practical value.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (definition of CADI): the claim that CADI 'works without needing assumptions about cluster shape, density, or the specific choice of point triples' is load-bearing but unsupported. The metric is defined from angles on point triples, yet no explicit, parameter-free rule for selecting or averaging over the triple set is given; random, nearest-neighbor, or density-biased sampling would systematically affect results for elongated, non-convex, or varying-density clusters—the exact regimes highlighted as failures of prior metrics.
  2. [§4] §4 (differentiability and optimization): the assertion that CADI is differentiable and enables a new DR technique is central to the contribution, but no explicit derivative, gradient expression, or verification that the angle-based formulation remains well-defined under projection is supplied, leaving the optimization demonstration unverifiable.
minor comments (2)
  1. [§3] Notation for the angle computation and the final CADI formula is introduced without a clear preliminary definition or diagram, making the geometric construction harder to follow on first reading.
  2. [Abstract] The abstract states that CADI is differentiable but supplies no equations; stating the defining formula concisely early in the paper would improve accessibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The two major comments identify areas where the manuscript would benefit from greater explicitness. We address each point below and will revise the manuscript to incorporate the requested clarifications and mathematical details.

Point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (definition of CADI): the claim that CADI 'works without needing assumptions about cluster shape, density, or the specific choice of point triples' is load-bearing but unsupported. The metric is defined from angles on point triples, yet no explicit, parameter-free rule for selecting or averaging over the triple set is given; random, nearest-neighbor, or density-biased sampling would systematically affect results for elongated, non-convex, or varying-density clusters—the exact regimes highlighted as failures of prior metrics.

    Authors: We agree that the current text does not supply an explicit rule for triple selection and averaging. We will revise the abstract and §3 to state that CADI is defined by averaging the absolute difference between the original-space and projection-space angles over all possible triples of points that belong to the same class. This choice is parameter-free and does not rely on cluster shape, density, or any sampling heuristic. We will also add a short robustness subsection demonstrating that the resulting index remains stable under moderate subsampling for the non-globular and varying-density cases already shown in the experiments. revision: yes

  2. Referee: [§4] §4 (differentiability and optimization): the assertion that CADI is differentiable and enables a new DR technique is central to the contribution, but no explicit derivative, gradient expression, or verification that the angle-based formulation remains well-defined under projection is supplied, leaving the optimization demonstration unverifiable.

    Authors: We acknowledge that §4 currently asserts differentiability without supplying the gradient. In the revision we will insert the explicit gradient of CADI with respect to the low-dimensional coordinates, obtained by differentiating the arccos expression for each triple angle via the chain rule on vector dot products and norms. We will also note the measure-zero set of configurations where points are collinear (where the derivative is undefined) and describe how the optimization procedure handles these cases. A brief verification that the gradient is well-defined for generic projections will be included. revision: yes
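For reference, the gradient the rebuttal promises is mechanical to derive for one triple: with u = a − b, v = c − b and θ = arccos(u·v / (‖u‖‖v‖)), the chain rule gives ∂θ/∂a, ∂θ/∂c, and ∂θ/∂b = −(∂θ/∂a + ∂θ/∂c). A sketch with a finite-difference check (generic angle calculus, not the authors' exact derivation):

```python
import numpy as np

def angle_and_grad(a, b, c):
    """Angle at b and its gradients w.r.t. a, b, c via the chain rule."""
    u, v = a - b, c - b
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    g = np.dot(u, v) / (nu * nv)
    theta = np.arccos(np.clip(g, -1.0, 1.0))
    dtheta_dg = -1.0 / np.sqrt(max(1.0 - g * g, 1e-12))  # undefined at collinearity
    dg_du = v / (nu * nv) - g * u / nu**2
    dg_dv = u / (nu * nv) - g * v / nv**2
    da, dc = dtheta_dg * dg_du, dtheta_dg * dg_dv
    db = -(da + dc)               # b enters both u and v with a minus sign
    return theta, da, db, dc

# finite-difference check on a random, non-collinear triple
rng = np.random.default_rng(2)
a, b, c = rng.normal(size=(3, 2))
theta, da, db, dc = angle_and_grad(a, b, c)
eps = 1e-6
for point, grad in ((a, da), (b, db), (c, dc)):
    for i in range(2):
        bumped = point.copy()
        bumped[i] += eps
        args = [bumped if q is point else q for q in (a, b, c)]
        numeric = (angle_and_grad(*args)[0] - theta) / eps
        assert abs(numeric - grad[i]) < 1e-3
print("analytic gradient matches finite differences")
```

As the rebuttal notes, the derivative blows up as a triple approaches collinearity (g → ±1), so any optimizer built on this objective needs a guard for that measure-zero set.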

Circularity Check

0 steps flagged

CADI is introduced as a direct geometric definition with no reduction to fitted inputs or self-citations.

Full rationale

The paper defines the Class Angular Distortion Index explicitly from internal angles computed on point triples in the projection. This is a first-principles construction using standard Euclidean geometry rather than any fitted parameter, renamed empirical pattern, or load-bearing self-citation. No equations or claims in the abstract reduce the metric to its own inputs by construction; the differentiability claim follows immediately from the angle formula. The paper contrasts CADI with existing metrics on real and synthetic data without invoking prior author work as a uniqueness theorem or ansatz. The derivation chain is therefore self-contained as a new index definition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the geometric interpretation of angles as faithful indicators of cluster organization and on the assumption that this measure is meaningfully different from existing metrics.

axioms (1)
  • domain assumption Euclidean angles computed from point coordinates capture the intended notion of cluster angular organization.
    Invoked when the metric is defined from internal angles among point triples.
invented entities (1)
  • Class Angular Distortion Index (CADI) no independent evidence
    purpose: To quantify faithfulness of cluster angular arrangement in projections.
    Newly introduced quantity whose independent validation is not shown in the abstract.

pith-pipeline@v0.9.0 · 5460 in / 1162 out tokens · 24911 ms · 2026-05-09T20:25:32.150714+00:00 · methodology

