VERA: Generating Visual Explanations of Two-Dimensional Embeddings via Region Annotation

Blaž Zupan; Pavlin G. Poličar

arxiv: 2406.04808 · v2 · submitted 2024-06-07 · 💻 cs.LG · cs.HC

Pith reviewed 2026-05-23 23:57 UTC · model grok-4.3

keywords: visual explanations, region annotation, two-dimensional embeddings, dimensionality reduction, exploratory data analysis, static visualizations, user study evaluation

The pith

VERA automatically generates static visual explanations for 2D embeddings by annotating informative regions with human-interpretable features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces VERA as a method to explain two-dimensional embeddings from dimensionality reduction techniques without requiring interactive exploration. It does this by identifying regions in the embedding space and linking them to user-provided features through automated filtering, merging, and ranking. This produces concise annotations that summarize the embedding structure at a glance. A user study demonstrates that these static explanations convey essential insights while requiring less time and effort than interactive toolkits.

Core claim

VERA identifies informative regions in the embedding space and associates them with user-provided human-interpretable features, producing concise visual annotations that summarize the structure of the embedding landscape at a glance. Rather than merely showing where feature values occur, VERA automatically filters, merges, and ranks candidate explanations, enabling users to focus on the most informative embedding structures without manual exploration.
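To make the region step concrete, here is a minimal sketch of one way a region for a single binary feature could be derived from an embedding: estimate a Gaussian kernel density of the points that carry the feature on a grid, then keep the grid cells above a fraction of the peak density. The function name `feature_region_mask`, the parameters `grid_size`, `bandwidth`, and `level`, and the thresholding rule itself are assumptions made for illustration. This summary does not specify VERA's actual region-construction procedure, so the code should not be read as the authors' method.

```python
import numpy as np


def feature_region_mask(embedding, indicator, grid_size=50, bandwidth=0.5, level=0.25):
    """Return (mask, xs, ys) for one binary feature.

    mask[i, j] is True where the kernel density of feature-carrying points at
    (xs[i], ys[j]) is at least `level` times the peak density.
    Hypothetical helper, not part of any VERA implementation.
    """
    embedding = np.asarray(embedding, dtype=float)
    indicator = np.asarray(indicator, dtype=bool)
    # Grid spanning the whole embedding, not only the feature-carrying points.
    xs = np.linspace(embedding[:, 0].min(), embedding[:, 0].max(), grid_size)
    ys = np.linspace(embedding[:, 1].min(), embedding[:, 1].max(), grid_size)
    points = embedding[indicator]
    if points.shape[0] == 0:
        # No point carries the feature, so there is no region to report.
        return np.zeros((grid_size, grid_size), dtype=bool), xs, ys
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
    # Unnormalised Gaussian kernel density evaluated at every grid cell.
    sq_dist = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    density = np.exp(-sq_dist / (2.0 * bandwidth**2)).sum(axis=1)
    density = density.reshape(grid_size, grid_size)
    return density >= level * density.max(), xs, ys


# Synthetic embedding: two clusters, the feature is present only in the first.
rng = np.random.default_rng(0)
cluster_a = rng.normal([0.0, 0.0], 0.3, size=(100, 2))
cluster_b = rng.normal([5.0, 5.0], 0.3, size=(100, 2))
embedding = np.vstack([cluster_a, cluster_b])
indicator = np.r_[np.ones(100, dtype=bool), np.zeros(100, dtype=bool)]
mask, xs, ys = feature_region_mask(embedding, indicator)
```

With this synthetic input the mask covers the cluster that carries the feature and leaves the other cluster uncovered. Choosing `bandwidth` and `level` is exactly the kind of threshold decision the referee report asks the authors to document.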

What carries the argument

Automatic identification, filtering, merging, and ranking of region annotations based on user-provided features to explain embedding patterns.
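The filtering, merging, and ranking steps named above can be sketched as follows, under stated assumptions. Each candidate explanation is represented as a feature label plus the set of point indices its region covers. Candidates whose point sets overlap strongly (Jaccard index) are merged, tiny candidates are filtered out, and the rest are ranked by coverage. The Jaccard criterion, the `min_overlap` and `min_size` thresholds, the size-based score, and the feature labels are all hypothetical choices for illustration; the summary does not state which criteria VERA uses.

```python
def jaccard(a, b):
    """Jaccard index of two collections of point indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_candidates(candidates, min_overlap=0.5):
    """Greedily merge (label, points) candidates that cover similar points."""
    merged = []
    for label, points in candidates:
        points = set(points)
        for entry in merged:
            if jaccard(entry["points"], points) >= min_overlap:
                entry["labels"].append(label)
                entry["points"] |= points
                break
        else:
            # No sufficiently similar candidate so far: start a new group.
            merged.append({"labels": [label], "points": points})
    return merged


def rank_candidates(merged, min_size=5):
    """Filter out tiny groups, then sort by number of covered points."""
    kept = [m for m in merged if len(m["points"]) >= min_size]
    return sorted(kept, key=lambda m: len(m["points"]), reverse=True)


# Hypothetical candidates: two near-duplicates, one distinct, one tiny.
candidates = [
    ("age > 50", range(0, 40)),
    ("retired", range(5, 45)),
    ("student", range(100, 130)),
    ("noise", range(200, 202)),
]
ranked = rank_candidates(merge_candidates(candidates))
```

In this toy input the two overlapping candidates collapse into one annotation carrying both labels, the two-point candidate is dropped, and the merged group ranks first because it covers the most points.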

If this is right

Users gain understanding of clusters and patterns in embeddings with reduced manual effort.
Static explanations can replace or supplement interactive data mining for exploratory tasks.
Analysis of high-dimensional data visualizations becomes more efficient in terms of time spent.
The approach supports typical EDA tasks by highlighting structures that matter most.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

VERA might be combined with machine learning to suggest relevant features when user input is limited.
Similar region-based annotation could apply to other visualization types like graphs or 3D embeddings.
Testing on larger datasets could reveal scalability limits of the automatic ranking process.

Load-bearing premise

The method assumes that user-provided human-interpretable features are sufficient to label the most informative regions and that the automatic filtering, merging, and ranking steps will surface the structures that matter most without missing key patterns.

What would settle it

An experiment showing that VERA misses important embedding structures identified by domain experts or that users complete EDA tasks slower with VERA than with interactive tools.
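One way to quantify the first half of such an experiment is a recall-style score: the fraction of expert-identified structures that at least one annotated region covers. The sketch below assumes both experts and the method return sets of point indices, and that a structure counts as found when some region covers at least `min_overlap` of its points. The function `structure_recall`, the threshold, and the example data are hypothetical and do not come from the paper.

```python
def structure_recall(expert_structures, annotated_regions, min_overlap=0.5):
    """Fraction of expert structures covered by at least one annotated region.

    A structure counts as covered when some region contains at least
    `min_overlap` of its points. Hypothetical metric for illustration.
    """
    if not expert_structures:
        raise ValueError("at least one expert structure is required")
    regions = [set(region) for region in annotated_regions]
    found = 0
    for structure in expert_structures:
        structure = set(structure)
        if not structure:
            raise ValueError("expert structures must be non-empty")
        best = max(
            (len(structure & region) / len(structure) for region in regions),
            default=0.0,
        )
        if best >= min_overlap:
            found += 1
    return found / len(expert_structures)


# Made-up example: three expert structures, two annotated regions.
expert = [range(0, 20), range(50, 60), range(90, 100)]
annotated = [range(0, 25), range(55, 70)]
recall = structure_recall(expert, annotated)
```

Here the third expert structure is not touched by any region, so two of three structures are recovered. A consistently low score across datasets would be evidence for the failure mode described above.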

Figures

Figures reproduced from arXiv: 2406.04808 by Blaž Zupan, Pavlin G. Poličar.

**Figure 1.** We visualize each feature of our fictional Bookworm data set in the typical scatter plot fashion. The point positions are specified by a two-dimensional …

**Figure 2.** We illustrate the region construction process for a single variable using a synthetic example. (a) Two-dimensional embeddings are often inspected …

**Figure 3.** The contrastive merge. If two base variables contain explanatory …

**Figure 4.** Contrastive layout generation. (a) Generating candidate panels simply …

**Figure 5.** The descriptive merge. In this example, we consider a subset of …

**Figure 6.** The descriptive layout generation follows an iterative approach. Given a set of available explanatory variables in (a), we identify non-overlapping …

**Figure 7.** VERA explanations of the IBM Employee Attrition data set. Panels (a-d) show four contrastive explanations corresponding to the features that are …

Original abstract

Two-dimensional embeddings obtained from dimensionality reduction techniques such as MDS, t-SNE, or UMAP, are widely used to visualize high-dimensional data and support researchers in visually identifying clusters, outliers, and other interesting patterns in the data. However, the main challenge is not only to detect such patterns, but to explain what they represent in terms of the original, human-interpretable features of the data. Existing approaches often rely on interactive exploration or direct feature encodings, requiring substantial manual inspection that can be time-consuming and repetitive. As an alternative, we propose VERA (Visual Explanations via Region Annotation), a general-purpose method for explaining two-dimensional embeddings through automatically generated, static, region-based visual explanations. VERA identifies informative regions in the embedding space and associates them with user-provided human-interpretable features, producing concise visual annotations that summarize the structure of the embedding landscape at a glance. Rather than merely showing where feature values occur, VERA automatically filters, merges, and ranks candidate explanations, enabling users to focus on the most informative embedding structures without manual exploration. We demonstrate VERA's utility on several real-world datasets and evaluate its effectiveness in a user study comparing it with the utility of a comprehensive interactive data mining toolkit. Our results show that VERA's generated static explanations can convey the essential insights of complex embeddings and support users in typical exploratory data analysis tasks, while requiring significantly less time and user effort.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

VERA gives a concrete pipeline for static region annotations on embeddings, but the whole thing stands or falls on whether the user-supplied features already cover the structures worth explaining.

VERA automates the creation of static visual explanations for two-dimensional embeddings by finding regions and tying them to user-supplied features, then filtering and ranking the results. The new part is the end-to-end procedure that includes automatic filtering, merging, and ranking of candidate explanations so the output stays concise and focused on the most informative parts. This differs from prior work that either requires ongoing interaction or just overlays raw feature values without the selection steps.

The paper does a decent job showing the idea on several real datasets and running a user study against an interactive toolkit. The claim that it conveys essential insights with less time and effort is the core result, and if the study holds up, it addresses a practical bottleneck.

The soft spot is the reliance on the user providing the right features upfront. The method will only explain structures that can be labeled with those features, and there is no clear test of what happens when the feature set is incomplete or misses key aspects of the data. That could lead to explanations that look good but overlook important clusters or outliers. The details on how regions are detected and how the ranking works will also matter for anyone trying to implement or extend this.

This is aimed at people who work with embedding visualizations in exploratory analysis and want a way to generate quick, shareable static figures instead of live exploration sessions. It is solid enough on the method and evaluation side to warrant sending out for peer review, though reviewers will likely press on the feature assumption and the study design.

Referee Report

3 major / 1 minor

Summary. The paper introduces VERA, a general-purpose method for producing static visual explanations of 2D embeddings (from MDS, t-SNE, UMAP, etc.) by automatically detecting informative regions, associating them with user-supplied human-interpretable features, and applying filtering/merging/ranking steps to generate concise annotations. It demonstrates the approach on real-world datasets and reports a user study claiming that the resulting static explanations convey essential insights into clusters/outliers while requiring significantly less time and effort than a comprehensive interactive data-mining toolkit.

Significance. If the algorithmic details and user-study evidence hold, VERA would offer a practical alternative to interactive exploration for embedding analysis, potentially lowering the barrier for non-expert users. The paper receives credit for including both real-dataset demonstrations and a comparative user study, which directly addresses a common pain point in exploratory data analysis.

major comments (3)

[Abstract and §3 (Method)] No description is given of the region-detection procedure, the concrete criteria or thresholds used for automatic filtering/merging/ranking of candidate explanations, or the precise mechanism that associates regions with user-provided features. Without these, the central claim that the generated annotations reliably surface the most informative structures cannot be evaluated or reproduced.

[§5 (User Study)] The headline result that VERA reduces time and user effort is stated without reporting participant count, task design, statistical tests, p-values, effect sizes, or any analysis of variance. This absence makes it impossible to assess whether the evidence supports the claim that static explanations are sufficient for typical EDA tasks.

[§4–5 (Evaluation)] The method’s correctness rests on the untested assumption that the supplied human-interpretable features plus the automatic steps will capture all structures an analyst would discover interactively. No ablation removing features, no failure-case examples, and no comparison against ground-truth structures are provided, leaving open the possibility that key patterns are systematically omitted.
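As an illustration of the statistical reporting requested for the user study, the sketch below computes an effect size (Cohen's d with a pooled standard deviation) and a two-sided permutation p-value for the difference in mean task completion time between two groups. It assumes a between-subjects design with independent groups, which the summary does not confirm. The function names and the timing values are invented for the example and are not results from the paper.

```python
import numpy as np


def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)


def permutation_p_value(x, y, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    observed = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    hits = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[: len(x)].mean() - shuffled[len(x):].mean())
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Made-up completion times in minutes; NOT data from the paper.
static_times = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7]
interactive_times = [7.9, 8.4, 6.8, 9.1, 7.5, 8.0]
d = cohens_d(static_times, interactive_times)
p = permutation_p_value(static_times, interactive_times)
```

A negative `d` here means the first group finished faster. A within-subjects design, where each participant uses both tools, would instead call for a paired test on per-participant differences.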

minor comments (1)

[Abstract] The phrase “several real-world datasets” is used without naming the datasets or citing their sources.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thorough review and constructive feedback. We address each of the major comments below, indicating the revisions we will make to improve the manuscript.

Referee: [Abstract and §3 (Method)] No description is given of the region-detection procedure, the concrete criteria or thresholds used for automatic filtering/merging/ranking of candidate explanations, or the precise mechanism that associates regions with user-provided features. Without these, the central claim that the generated annotations reliably surface the most informative structures cannot be evaluated or reproduced.

Authors: We agree that additional details are necessary for reproducibility. Although the method section describes the high-level procedure, specific implementation details such as the region detection algorithm, thresholds for filtering and merging, and the exact association mechanism were not sufficiently elaborated. In the revised manuscript, we will expand §3 with a precise description of these components, including the criteria used, any thresholds, and pseudocode for the key steps. This will allow readers to evaluate and reproduce the central claims.

revision: yes

Referee: [§5 (User Study)] The headline result that VERA reduces time and user effort is stated without reporting participant count, task design, statistical tests, p-values, effect sizes, or any analysis of variance. This absence makes it impossible to assess whether the evidence supports the claim that static explanations are sufficient for typical EDA tasks.

Authors: We acknowledge the lack of detailed statistical reporting in the user study section. The study was designed to compare VERA with an interactive toolkit, but the manuscript does not include the requested details. We will revise §5 to report the participant count, provide a full description of the task design, and include appropriate statistical analyses such as t-tests or ANOVA with p-values and effect sizes to substantiate the claims about reduced time and effort.

revision: yes

Referee: [§4–5 (Evaluation)] The method’s correctness rests on the untested assumption that the supplied human-interpretable features plus the automatic steps will capture all structures an analyst would discover interactively. No ablation removing features, no failure-case examples, and no comparison against ground-truth structures are provided, leaving open the possibility that key patterns are systematically omitted.

Authors: The user study provides some validation that the generated explanations convey essential insights, as evidenced by user performance. However, we agree that more rigorous evaluation is warranted. We will add ablation studies on the role of user-provided features, include examples of potential failure cases, and where possible, compare against known ground-truth structures in the datasets used. This will address the concern about systematic omissions.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; VERA is a self-contained procedural method

The paper describes VERA as a general-purpose algorithmic procedure: identify informative regions in 2D embeddings, associate them with user-provided human-interpretable features, then apply automatic filtering/merging/ranking to produce static annotations. No equations, fitted parameters, or derivations are presented that reduce to their own inputs by construction. No self-citations appear as load-bearing premises for uniqueness, ansatzes, or theorems. The central claims rest on demonstrations across datasets and a user study comparing static output to an interactive toolkit; these evaluations are independent of any prior author results. This matches the default case of a non-circular methods paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; region identification and ranking logic are not detailed enough to audit.

