pith. machine review for the scientific record.

arxiv: 2605.00637 · v1 · submitted 2026-05-01 · 💻 cs.LG


Class Angular Distortion Index for Dimensionality Reduction


Pith reviewed 2026-05-09 20:25 UTC · model grok-4.3

classification 💻 cs.LG
keywords dimensionality reduction · cluster quality · angular distortion · CADI · projection evaluation · differentiable metric · t-SNE · UMAP

The pith

The Class Angular Distortion Index measures how faithfully dimensionality reduction projections preserve cluster organization by computing internal angles from point triples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Dimensionality reduction techniques such as t-SNE and UMAP frequently produce projections in which clusters appear distinct yet sit in arbitrary or misleading relative positions. The paper introduces the Class Angular Distortion Index to evaluate the faithfulness of those arrangements through the internal angles formed by triples of points. This approach avoids assumptions about cluster shape or density and supplies usable signals in both real and synthetic cases where standard cluster metrics return uninformative results. Because the index rests on angle calculations, it is differentiable and therefore usable as an objective for training new projection methods, which the authors demonstrate with one such technique.
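The abstract stops short of a formula. As a reading aid, here is a minimal angle-distortion score in the spirit of CADI; the random triple sampling, the absolute-difference averaging, and the dimension-dropping "projection" are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def triple_angle(a, b, c):
    """Internal angle at vertex b of the triple (a, b, c), in radians."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def angular_distortion(X_high, X_low, triples):
    """Mean absolute angle change between the two spaces over the given
    point triples (a CADI-like score; 0 means angles are fully preserved)."""
    return float(np.mean([abs(triple_angle(*X_high[list(t)])
                              - triple_angle(*X_low[list(t)]))
                          for t in triples]))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))      # original 5-D data
P = X[:, :2]                      # a crude "projection": keep two coordinates
triples = [tuple(rng.choice(30, size=3, replace=False)) for _ in range(200)]
print(angular_distortion(X, P, triples))   # substantially above zero
```

Angle-preserving maps (rotations, uniform scalings) score exactly zero under this sketch, which is the property that makes an angular index insensitive to the arbitrary global orientation of t-SNE or UMAP outputs.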

Core claim

The authors claim that the relative organization of clusters in a projection can be assessed by tracking how internal angles among point triples change from the original space. They formalize this idea as the Class Angular Distortion Index and show that it returns interpretable values on data sets where existing separability or shape-based metrics do not. The same angle foundation makes the index differentiable, allowing it to guide optimization of dimensionality reduction directly toward better cluster arrangements.

What carries the argument

The Class Angular Distortion Index (CADI), which quantifies distortion of internal angles formed by point triples to judge cluster arrangement faithfulness in a projection.

If this is right

  • Projections can be optimized directly against CADI to reduce misleading cluster placements.
  • The index supplies concrete, interpretable feedback on which inter-cluster angles are most distorted.
  • Evaluation of existing methods such as t-SNE and UMAP becomes possible even when clusters are non-spherical or unevenly dense.
  • New DR algorithms that minimize CADI become feasible; the paper already presents one as a working example.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • CADI could be added to routine DR evaluation toolkits alongside neighborhood-preservation scores.
  • Its differentiability opens the possibility of hybrid objectives that balance local structure with global cluster geometry.
  • Different strategies for choosing which point triples to measure may affect sensitivity to particular types of distortion.
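The last point is easy to make concrete: under an anisotropic projection, long-range random triples and nearest-neighbour triples probe different vector directions and so can report different mean angle changes. A toy sketch (the elongated data, the axis-squashing "projection", and both sampling rules are illustrative assumptions, not anything from the paper):

```python
import numpy as np

def angle(a, b, c):
    """Internal angle at vertex b, in radians."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def mean_angle_change(X, P, triples):
    return float(np.mean([abs(angle(*X[list(t)]) - angle(*P[list(t)]))
                          for t in triples]))

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2)) * np.array([5.0, 1.0])   # elongated cluster
P = X * np.array([1.0, 0.2])                           # squash the short axis further

# strategy 1: uniformly random triples
random_triples = [rng.choice(100, size=3, replace=False) for _ in range(500)]

# strategy 2: "local" triples, each point flanked by its two nearest neighbours
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
np.fill_diagonal(D, np.inf)
local_triples = [(int(np.argsort(D[i])[0]), i, int(np.argsort(D[i])[1]))
                 for i in range(100)]

print(mean_angle_change(X, P, random_triples))
print(mean_angle_change(X, P, local_triples))
```

The two numbers generally differ, so any reported index value implicitly depends on the sampling rule unless, as the rebuttal below proposes, the triple set is fixed by definition.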

Load-bearing premise

Angles computed from point triples in the projected space reliably reflect the true relative organization of clusters in the original data.

What would settle it

A controlled synthetic data set whose ground-truth clusters are rotated or reordered in the projection; CADI should report high distortion while visual inspection or label comparison confirms the change.
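A toy version of that experiment, with cluster centroids standing in for the clusters (the geometry, the swap, and the score below are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def angle(a, b, c):
    """Internal angle at vertex b, in radians."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))

centers = np.array([[0.0, 0.0], [4.0, 0.0], [1.0, 3.0]])  # ground truth (scalene)
rng = np.random.default_rng(1)
faithful = centers + rng.normal(scale=0.05, size=centers.shape)  # arrangement kept
swapped = centers[[1, 0, 2]]                                     # two clusters trade places

def distortion(proj):
    """Mean |angle change| at each centroid versus the ground truth."""
    idx = [(1, 0, 2), (0, 1, 2), (0, 2, 1)]   # angle at vertex 0, 1, 2
    return float(np.mean([abs(angle(centers[a], centers[b], centers[c])
                              - angle(proj[a], proj[b], proj[c]))
                          for a, b, c in idx]))

print(distortion(faithful))   # near zero: same arrangement
print(distortion(swapped))    # clearly larger: arrangement changed
```

The faithful layout scores near zero while the swapped one does not, although a pure separability metric would rate both identically, since the point set is unchanged.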

Figures

Figures reproduced from arXiv: 2605.00637 by Jacob Miller, Kaviru Gunaratne, Stephen Kobourov.

Figure 1. An example of a synthetic dataset where the globularity assumption for clusters fails to hold. (a) shows the input 3D dataset, which …
Figure 2. For different numbers of triplets sampled, we calculate CADI for the t-SNE projection of the MNIST dataset on 10,000 different …
Figure 3. Projections of the rings (top) and concentric3 (bottom) datasets created by different DR algorithms.
Figure 4. Projections of the liver (top), usps (middle), and trec (bottom) datasets created by different DR algorithms.
Figure 5. CADI scores for UMAP projections of the concentric3 …
Figure 6. Times taken to compute each metric on the t-SNE projection …
Figure 7. For different numbers of triplets sampled, we calculate …
Figure 8. The epsilon graphs of the liver dataset are plotted using …
Figure 9. The per-pixel standard deviation across images within …
Figure 11. AngleEmbedding projection of the TREC dataset.
Figure 12. AngleEmbedding projection of the Emotion dataset. (a) …
Figure 13. CADI scores for UMAP projections of the concentric3 dataset with different values for the hyperparameters …
Original abstract

Dimensionality reduction (DR) techniques are often characterized by whether they preserve global, high-level structures in the data or local, neighborhood structures. This distinction matters in visualization: global methods can obscure clusters while local methods can over-emphasize them. Yet, even when clusters appear distinct, their relative arrangement in the projection may be arbitrary or misleading, a common issue in techniques such as t-SNE and UMAP. Existing cluster quality metrics either only measure cluster separability or assume spherical, globular clusters in the original space. We introduce the Class Angular Distortion Index (CADI), a metric that uses internal angles among point triples to determine the faithfulness of cluster organization in a projection. We show cases on both real and synthetic data where existing cluster metrics fail, but CADI provides an interpretable result. Since it relies on computing angles, CADI is also differentiable, enabling optimization. We demonstrate this with a CADI-based DR technique.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it: the pith above is the substance; this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Class Angular Distortion Index (CADI), a metric that computes internal angles among point triples in a low-dimensional projection to assess the faithfulness of cluster organization. The authors claim CADI succeeds on real and synthetic data where existing cluster metrics fail (due to assumptions of spherical clusters or focus only on separability), requires no assumptions on cluster shape, density, or triple selection, is differentiable (enabling optimization), and is demonstrated via a CADI-optimized DR technique.

Significance. If the central claims hold, CADI would provide a useful, interpretable addition to DR evaluation tools by addressing limitations of metrics like silhouette score or Davies-Bouldin index for non-globular clusters. The differentiability property is a concrete strength that could support new optimization-based DR methods, and the empirical cases where standard metrics fail illustrate potential practical value.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (definition of CADI): the claim that CADI 'works without needing assumptions about cluster shape, density, or the specific choice of point triples' is load-bearing but unsupported. The metric is defined from angles on point triples, yet no explicit, parameter-free rule for selecting or averaging over the triple set is given; random, nearest-neighbor, or density-biased sampling would systematically affect results for elongated, non-convex, or varying-density clusters—the exact regimes highlighted as failures of prior metrics.
  2. [§4] §4 (differentiability and optimization): the assertion that CADI is differentiable and enables a new DR technique is central to the contribution, but no explicit derivative, gradient expression, or verification that the angle-based formulation remains well-defined under projection is supplied, leaving the optimization demonstration unverifiable.
minor comments (2)
  1. [§3] Notation for the angle computation and the final CADI formula is introduced without a clear preliminary definition or diagram, making the geometric construction harder to follow on first reading.
  2. [Abstract] The abstract states that CADI is differentiable but supplies no equations; stating the defining formula concisely early in the paper would improve accessibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The two major comments identify areas where the manuscript would benefit from greater explicitness. We address each point below and will revise the manuscript to incorporate the requested clarifications and mathematical details.

Point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (definition of CADI): the claim that CADI 'works without needing assumptions about cluster shape, density, or the specific choice of point triples' is load-bearing but unsupported. The metric is defined from angles on point triples, yet no explicit, parameter-free rule for selecting or averaging over the triple set is given; random, nearest-neighbor, or density-biased sampling would systematically affect results for elongated, non-convex, or varying-density clusters—the exact regimes highlighted as failures of prior metrics.

    Authors: We agree that the current text does not supply an explicit rule for triple selection and averaging. We will revise the abstract and §3 to state that CADI is defined by averaging the absolute difference between the original-space and projection-space angles over all possible triples of points that belong to the same class. This choice is parameter-free and does not rely on cluster shape, density, or any sampling heuristic. We will also add a short robustness subsection demonstrating that the resulting index remains stable under moderate subsampling for the non-globular and varying-density cases already shown in the experiments. revision: yes

  2. Referee: [§4] §4 (differentiability and optimization): the assertion that CADI is differentiable and enables a new DR technique is central to the contribution, but no explicit derivative, gradient expression, or verification that the angle-based formulation remains well-defined under projection is supplied, leaving the optimization demonstration unverifiable.

    Authors: We acknowledge that §4 currently asserts differentiability without supplying the gradient. In the revision we will insert the explicit gradient of CADI with respect to the low-dimensional coordinates, obtained by differentiating the arccos expression for each triple angle via the chain rule on vector dot products and norms. We will also note the measure-zero set of configurations where points are collinear (where the derivative is undefined) and describe how the optimization procedure handles these cases. A brief verification that the gradient is well-defined for generic projections will be included. revision: yes
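For reference, the gradient the rebuttal promises is mechanical to derive for one triple: with u = a − b, v = c − b and θ = arccos(u·v / (‖u‖‖v‖)), the chain rule gives ∂θ/∂a, ∂θ/∂c, and ∂θ/∂b = −(∂θ/∂a + ∂θ/∂c). A sketch with a finite-difference check (generic angle calculus, not the authors' exact derivation):

```python
import numpy as np

def angle_and_grad(a, b, c):
    """Angle at b and its gradients w.r.t. a, b, c via the chain rule."""
    u, v = a - b, c - b
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    g = np.dot(u, v) / (nu * nv)
    theta = np.arccos(np.clip(g, -1.0, 1.0))
    dtheta_dg = -1.0 / np.sqrt(max(1.0 - g * g, 1e-12))  # undefined at collinearity
    dg_du = v / (nu * nv) - g * u / nu**2
    dg_dv = u / (nu * nv) - g * v / nv**2
    da, dc = dtheta_dg * dg_du, dtheta_dg * dg_dv
    db = -(da + dc)               # b enters both u and v with a minus sign
    return theta, da, db, dc

# finite-difference check on a random, non-collinear triple
rng = np.random.default_rng(2)
a, b, c = rng.normal(size=(3, 2))
theta, da, db, dc = angle_and_grad(a, b, c)
eps = 1e-6
for point, grad in ((a, da), (b, db), (c, dc)):
    for i in range(2):
        bumped = point.copy()
        bumped[i] += eps
        args = [bumped if q is point else q for q in (a, b, c)]
        numeric = (angle_and_grad(*args)[0] - theta) / eps
        assert abs(numeric - grad[i]) < 1e-3
print("analytic gradient matches finite differences")
```

As the rebuttal notes, the derivative blows up as a triple approaches collinearity (g → ±1), so any optimizer built on this objective needs a guard for that measure-zero set.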

Circularity Check

0 steps flagged

CADI is introduced as a direct geometric definition with no reduction to fitted inputs or self-citations.

Full rationale

The paper defines the Class Angular Distortion Index explicitly from internal angles computed on point triples in the projection. This is a first-principles construction using standard Euclidean geometry rather than any fitted parameter, renamed empirical pattern, or load-bearing self-citation. No equations or claims in the abstract reduce the metric to its own inputs by construction; the differentiability claim follows immediately from the angle formula. The paper contrasts CADI with existing metrics on real and synthetic data without invoking prior author work as a uniqueness theorem or ansatz. The derivation chain is therefore self-contained as a new index definition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the geometric interpretation of angles as faithful indicators of cluster organization and on the assumption that this measure is meaningfully different from existing metrics.

axioms (1)
  • domain assumption Euclidean angles computed from point coordinates capture the intended notion of cluster angular organization.
    Invoked when the metric is defined from internal angles among point triples.
invented entities (1)
  • Class Angular Distortion Index (CADI) no independent evidence
    purpose: To quantify faithfulness of cluster angular arrangement in projections.
    Newly introduced quantity whose independent validation is not shown in the abstract.

pith-pipeline@v0.9.0 · 5460 in / 1162 out tokens · 24911 ms · 2026-05-09T20:25:32.150714+00:00 · methodology

