SmartIterator: Visual Analytics Workflows for Supervising Unsupervised Data Grouping

Gennady Andrienko; Natalia Andrienko

arXiv: 2605.28219 · v1 · pith:46OKCFB3new · submitted 2026-05-27 · cs.HC · cs.AI · cs.LG

Pith reviewed 2026-06-29 10:16 UTC · model grok-4.3

classification cs.HC · cs.AI · cs.LG

keywords visual analytics · unsupervised learning · clustering · topic modeling · parameter exploration · workflows · data grouping · iterative analysis

The pith

SmartIterator supplies six-phase workflows that convert unsupervised grouping parameter sweeps into cumulative understanding of data structure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SmartIterator as a visual analytics approach that treats sequences of grouping results from parameter sweeps as the primary object of study rather than any isolated output. It supplies method-specific six-phase workflows for density-based clustering, partition-based clustering, and topic modeling that move from quality overviews through stability checks, membership assessment, content review, archetype identification, and final decisions. These are realized in the IteraScope interface with linked charts, Sankey transition views, violin plots, and domain-linked displays. The central argument is that this structured process produces knowledge of how groupings form and persist across settings that no single configuration can deliver. A reader would care because unsupervised methods always require human oversight, and the workflows aim to make that oversight systematic and evidence-based.

Core claim

SmartIterator defines structured six-phase workflows for each of three unsupervised method families and operationalizes them in IteraScope, a coordinated visual system that combines quality-metric charts, 1D group embeddings with Sankey flows and confidence violins, 2D embeddings with HDBSCAN-detected recurrent archetypes, and domain-specific views. Demonstrations on simulated social-media messages, EU regional statistics, and thirty years of VIS papers show the workflows guiding analysts from metric overviews to informed choices while revealing how data structure evolves. The paper states that the workflows themselves constitute the main contribution because they yield knowledge about the d

What carries the argument

SmartIterator's six-phase workflows, realized through IteraScope's coordinated displays of quality metrics, transition flows, membership confidence, and recurrent archetypes.
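
One way to picture the "recurrent archetypes" idea is the check below, a minimal sketch rather than the authors' implementation. It assumes that every group in every iteration has already been assigned an archetype label (the paper reports using HDBSCAN on a 2D group embedding for that step; the labelling itself is not reproduced here), and it returns the iterations whose groups cover every archetype. The function name `iterations_covering_all` and the sample sweep are hypothetical.

```python
def iterations_covering_all(archetypes_by_iteration):
    """Return the iterations whose groups include every observed archetype.

    archetypes_by_iteration maps an iteration id to the archetype label of
    each of its groups; None marks a group that belongs to no archetype.
    """
    # Collect every archetype label seen anywhere in the sweep.
    all_archetypes = set()
    for labels in archetypes_by_iteration.values():
        all_archetypes.update(label for label in labels if label is not None)
    # Keep the iterations that contain all of them.
    return [
        iteration
        for iteration, labels in archetypes_by_iteration.items()
        if all_archetypes <= set(labels)
    ]


# Hypothetical sweep: iteration id -> archetype label per group.
sweep = {
    5: ["A", "B", None],
    6: ["A", "B", "C"],
    7: ["A", "C", "C", None],
}
covering = iterations_covering_all(sweep)
print(covering)
```

In this toy sweep only iteration 6 contains all three archetypes, so it is the only candidate that captures every persistent pattern.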

If this is right

Analysts obtain systematic guidance for navigating parameter spaces specific to each clustering or topic-modeling family.
The process reveals how data groupings evolve, stabilize, or recur across configurations.
Membership confidence and persistent archetypes become explicit objects of inspection rather than implicit judgments.
Domain context is incorporated at each phase to ground interpretations.
The outcome is knowledge of data organization that exceeds what any isolated grouping supplies.
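
The evolution of groupings across configurations can be made concrete with a small sketch. It counts how many items move from each group in one run to each group in the next (the quantities a Sankey-style transition view would draw as bands) and derives a simple per-group retention share. The retention measure and all names here are illustrative assumptions, not definitions taken from the paper.

```python
from collections import Counter


def transition_flows(labels_a, labels_b):
    """Count items moving from each group in run A to each group in run B."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both runs must label the same items in the same order")
    return Counter(zip(labels_a, labels_b))


def retention(labels_a, labels_b):
    """Share of each run-A group that lands in its largest run-B destination."""
    flows = transition_flows(labels_a, labels_b)
    sizes = Counter(labels_a)
    largest = {}
    for (src, _dst), count in flows.items():
        largest[src] = max(largest.get(src, 0), count)
    return {src: largest[src] / sizes[src] for src in sizes}


# Hypothetical labels for six items under two consecutive parameter settings.
run_1 = ["a", "a", "a", "b", "b", "b"]
run_2 = ["x", "x", "y", "y", "y", "y"]

flows = transition_flows(run_1, run_2)
shares = retention(run_1, run_2)
print(dict(flows))
print(shares)
```

A retention share near 1 means a group passes almost intact into the next configuration; a low share means it splits across several destinations.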

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same phased structure could be adapted to other parameter-sensitive unsupervised tasks such as dimensionality reduction or anomaly detection.
Visual analytics tools for machine learning might shift emphasis from single-result optimization toward explicit support for sequence exploration.
The approach implies that interfaces should surface transition patterns and archetype recurrence as primary navigation aids.
Deployment on streaming or very large collections would test whether the current visual encodings scale without loss of the claimed cumulative insight.

Load-bearing premise

That the six-phase workflow structure produces more reliable decisions about groupings than ad-hoc exploration of the same parameter sweeps.

What would settle it

A direct comparison in which one group of analysts follows the six-phase workflows on a shared dataset while another explores the same results without the structure, then measuring differences in the groupings they select or the justifications they provide.

Figures

Figures reproduced from arXiv: 2605.28219 by Gennady Andrienko, Natalia Andrienko.

**Figure 2.** Round 1: ε = 0.05 … 1.0, step 0.05, min_samples = 150. The metrics chart (top of b) shows silhouette peaking near ε = 0.1; the Sankey view (bottom of b) shows rapid cluster merging beyond ε = 0.15. The map and space-time cube (a) reveal temporal layering of the contamination event. See Appendix A, Section A.2 for additional views.

**Figure 3.** Round 3: sweeping min_samples from 100 to 200 (step 10, ε = 0.07). The analyst hides intermediate iterations, compares 130 and 150, and highlights messages transitioning to noise; they originate from cluster borders in the contamination-affected areas. See Appendix A, Section A.4.

**Figure 4.** Partition-based clustering of EU NUTS-3 population data.

**Figure 5.** Left: map for K = 20, where the violet cluster (20.10) forms a spatially contiguous block. Right: parallel coordinates, where the violet cluster (highlighted) shows low shares of female and young age groups.

**Figure 7.** NMF topic modeling (ntopics = 5 … 25) after data cleaning. The bar chart shows topic counts per iteration with color-coded archetype membership; the 2D embedding (right) confirms tight archetype clusters. Only ∼11% of topic instances are HDBSCAN noise. See Appendix C, Section C.2.

**Figure 8.** Sankey transitions between ntopics = 15 and 16 with violin plots in split mode. The new "method" topic (cyan) draws thin bands from multiple parents, a wastebasket pattern. Violins confirm higher membership confidence at 15 topics. See Appendix C, Section C.2.

**Figure 6.** Transition and confidence analysis from cluster 20.10 (…

**Figure 9.** Word clouds (frequency-weighted on the left, and term-weighted…

**Figure 10.** Term-weight word clouds for topic 12 across the 15-to-16 transi…

**Figure 11.** Temporal prevalence of the 15 topics (proportional stacked…

Unsupervised learning methods (topic modeling, partition-based and density-based clustering) produce data groupings without human guidance, yet choosing and evaluating those groupings should not itself be unsupervised. We present SmartIterator (SI), a visual analytics approach that treats the full sequence of grouping results across a parameter sweep as a first-class analytical object. For each method family, SI provides a structured six-phase workflow that guides the analyst through systematic exploration of grouping results, from quality-metric overview through transition-stability assessment, membership-confidence evaluation, content and context inspection, and recurrent-archetype verification to an informed decision, building cumulative understanding of data structure along the way. The workflows are operationalized through IteraScope (IS), a coordinated visual display combining quality-metric charts with semantic color encoding, a 1D group embedding with Sankey-style transition flows and violin plots of membership confidence, a 2D group embedding with HDBSCAN-detected recurrent archetypes that highlights iterations capturing all persistent patterns, and domain-specific linked views for contextualized interpretation. We demonstrate the three workflows on: (1) simulated social-media messages from the VAST Challenge 2011 (density-based clustering, validated against ground truth), (2) EU population statistics across ∼1,500 NUTS-3 regions (partition-based clustering), and (3) 30 years of IEEE VIS papers (NMF topic modeling). The workflows constitute the main contribution: they provide actionable, method-specific guidance for navigating parameter spaces, studying how data structure evolves across configurations, and grounding analytical understanding in domain context, yielding knowledge about the data that no single "best" result can provide.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

SmartIterator lays out six-phase workflows and IteraScope views for exploring clustering parameter sweeps, but the demonstrations supply no evidence that the structure beats ad-hoc inspection.

The paper's core offering is a set of method-specific six-phase workflows for walking through unsupervised grouping results across parameter ranges, implemented via the IteraScope coordinated display. The phases run from quality-metric overview to transition stability, membership confidence, content inspection, archetype verification, and final decision. The visuals include Sankey-style flows, violin plots, 2D embeddings with HDBSCAN for recurrent patterns, and linked domain views.

What is actually new is the explicit workflow structure itself and the way it treats the full sweep as the object of analysis rather than isolated results. The three cases (VAST 2011 density clustering, EU NUTS-3 partition clustering, and 30 years of VIS topic modeling) show the interface handling real data and highlighting persistent structures.

The paper does a reasonable job making the phases concrete and tying them to existing visual analytics components. An analyst following the sequence would at least have a repeatable path instead of guessing where to look next in a large parameter space.

The main limitation is the absence of any comparative evaluation. The cases illustrate the tool but include no user study, no timed task, and no metric showing that the structured path produces measurably better decisions or more reliable insights than simply scanning the same results without the phases. The claim that the workflow yields knowledge no single best result can provide is stated but not tested.

This work is aimed at visual analytics practitioners who already use clustering or topic models and want a more systematic way to inspect parameter effects. Readers looking for new algorithms or large-scale empirical results will find little here.

The paper deserves a serious referee. The idea is practical, the implementation is described clearly, and the gap in validation is fixable rather than fatal. I would send it to review.

Referee Report

2 major / 2 minor

Summary. The paper claims to introduce SmartIterator (SI), a visual analytics approach that treats the full sequence of unsupervised grouping results (from topic modeling, partition-based and density-based clustering) across parameter sweeps as a first-class object. It defines structured six-phase workflows (quality overview, transition stability, membership confidence, content inspection, archetype verification, decision) operationalized in the IteraScope interface with coordinated views including quality charts, Sankey flows, violin plots, 2D embeddings with HDBSCAN archetypes, and domain views. These are demonstrated on three cases (VAST 2011 density clustering with ground truth, EU NUTS-3 partition clustering, 30-year IEEE VIS NMF topics) to show how the workflows yield cumulative, method-specific understanding of data structure beyond any single 'best' result.

Significance. If validated, the contribution of method-specific structured workflows for parameter-space navigation in unsupervised grouping could meaningfully advance visual analytics practice in HC by shifting focus from result selection to evolutionary understanding of groupings. The coordinated multi-view design and emphasis on recurrent archetypes and transition stability offer a concrete operationalization that addresses a common pain point in exploratory analysis of clustering and topic models.

major comments (2)

[Abstract / Case studies] Abstract and demonstration sections: The central claim that the six-phase workflows 'provide actionable, method-specific guidance ... yielding knowledge about the data that no single "best" result can provide' and outperform ad-hoc exploration is load-bearing for the contribution, yet the three case studies supply only qualitative demonstrations with no user study, no controlled comparison against ad-hoc sweeps, and no quantitative metrics (e.g., insight quality, decision reliability, or time-to-insight) to support superiority.
[Abstract] Abstract: The soundness assessment is limited because no error analysis, validation metrics, or quantitative results are reported even for the VAST 2011 case that has ground truth; without these, it is impossible to verify whether the workflows actually improve grouping supervision as asserted.

minor comments (2)

The manuscript would benefit from explicit section numbering or subsection labels when referencing the six-phase workflow components and the three demonstration cases to aid navigation.
Notation for the IteraScope views (e.g., how semantic color encoding is computed or how HDBSCAN archetypes are thresholded) could be clarified with a small table or pseudocode for reproducibility.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed and constructive feedback. We respond to each major comment below, focusing on the scope of our contribution as a design-oriented visual analytics paper.

Referee: [Abstract / Case studies] Abstract and demonstration sections: The central claim that the six-phase workflows 'provide actionable, method-specific guidance ... yielding knowledge about the data that no single "best" result can provide' and outperform ad-hoc exploration is load-bearing for the contribution, yet the three case studies supply only qualitative demonstrations with no user study, no controlled comparison against ad-hoc sweeps, and no quantitative metrics (e.g., insight quality, decision reliability, or time-to-insight) to support superiority.

Authors: The primary contribution is the definition of the six-phase workflows themselves, which structure the supervision process for each method family and treat the full parameter sweep as the analytical object. The case studies illustrate the application of these workflows and the types of cumulative insights they enable (e.g., identifying recurrent archetypes and transition patterns). The manuscript does not assert quantitative superiority or make 'outperform' claims backed by metrics; the language emphasizes guidance and knowledge beyond any single result. We agree that a controlled comparison or user study would provide additional evidence, but it lies outside the current scope. We will revise the abstract and conclusion to remove any implication of outperformance and explicitly note the qualitative nature of the demonstrations. revision: partial
Referee: [Abstract] Abstract: The soundness assessment is limited because no error analysis, validation metrics, or quantitative results are reported even for the VAST 2011 case that has ground truth; without these, it is impossible to verify whether the workflows actually improve grouping supervision as asserted.

Authors: The VAST 2011 demonstration uses the ground truth to show how the workflow phases (particularly membership confidence, content inspection, and archetype verification) surface the parameter settings that recover the known structure. However, we acknowledge that no explicit quantitative validation (e.g., agreement metrics with ground truth or error rates across iterations) is reported. This omission weakens the soundness assessment. We will add a dedicated quantitative validation subsection for the VAST case in the revised manuscript, computing relevant metrics from the existing results. revision: yes
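
As an illustration of what "agreement metrics with ground truth … across iterations" could look like, the sketch below scores each iteration of a sweep with the Rand index, the fraction of item pairs on which the predicted grouping and the ground truth agree. The Rand index is a choice made here for illustration; the rebuttal does not say which metric would be used. The labels and ε values are hypothetical, and noise points are treated as an ordinary group for simplicity.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs grouped consistently in both labelings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must cover the same items")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    # A pair agrees when both labelings put it together or both split it.
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical ground truth and three iterations of a parameter sweep.
truth = [0, 0, 0, 1, 1, 1]
sweep = {
    0.05: [0, 0, 1, 2, 2, 3],  # over-fragmented
    0.10: [0, 0, 0, 1, 1, 1],  # matches the ground truth
    0.15: [0, 0, 0, 0, 0, 0],  # everything merged
}

scores = {eps: rand_index(truth, labels) for eps, labels in sweep.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The pairwise loop is quadratic in the number of items, which is acceptable for a sketch but would need a contingency-table formulation for large collections.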

standing simulated objections not resolved

Absence of a formal user study or controlled comparison against ad-hoc exploration, as this would require new empirical data collection and analysis beyond the scope of the current work.

Circularity Check

0 steps flagged

No circularity: workflows are design contributions with no derivations or fitted quantities

The paper describes a visual analytics system and six-phase workflows for exploring parameter sweeps in unsupervised grouping methods. No equations, parameter fitting, predictions, or mathematical derivations appear in the abstract or described content. The central claim is that the structured workflows (quality overview through archetype verification) yield cumulative understanding superior to ad-hoc exploration, but this is presented as a methodological proposal demonstrated via case studies rather than derived from prior results. No self-citation chains, ansatzes, or renamings of known results are invoked as load-bearing steps. The contribution is self-contained as a design artifact; absence of comparative user studies is a validation gap, not a circularity issue.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a visual analytics methods paper; the abstract introduces no free parameters, mathematical axioms, or new postulated entities. The workflows and interface are the added artifacts.

Reference graph

Works this paper leans on

60 extracted references · 44 canonical work pages · 2 internal anchors

[1]

Alexander and M

E. Alexander and M. Gleicher. Task-driven comparison of topic models. IEEE Transactions on Visualization and Computer Graphics, 22(1):320– 329, 2016. doi: 10.1109/TVCG.2015.2467618 3, 11

work page doi:10.1109/tvcg.2015.2467618 2016
[2]

Alsallakh, W

B. Alsallakh, W. Aigner, S. Miksch, and H. Hauser. Reinventing the contingency wheel: Scalable visual analytics of large categorical data. IEEE Transactions on Visualization and Computer Graphics, 18(12):2849– 2858, 2012. doi: 10.1109/TVCG.2012.254 2

work page doi:10.1109/tvcg.2012.254 2012
[3]

Andrienko, N

G. Andrienko, N. Andrienko, P. Bak, D. Keim, and S. Wrobel.Visual Analytics of Movement. Springer, 2013. doi: 10.1007/978-3-642-37583-5 4

work page doi:10.1007/978-3-642-37583-5 2013
[4]

Andrienko and G

N. Andrienko and G. Andrienko.Exploratory Analysis of Spatial and Temporal Data: A Systematic Approach. Springer, 2006. doi: 10.1007/3 -540-31190-4 3

work page doi:10.1007/3 2006
[5]

Andrienko, G

N. Andrienko, G. Andrienko, L. Adilova, and S. Wrobel. Visual analytics for human-centered machine learning.IEEE Computer Graphics and Applications, 42(1):123–133, 2022. doi: 10.1109/MCG.2021.3130314 3

work page doi:10.1109/mcg.2021.3130314 2022
[6]

Ankerst, M

M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander. OPTICS: Ordering points to identify the clustering structure.ACM SIGMOD Record, 28(2):49–60, 1999. doi: 10.1145/304182.304187 4

work page doi:10.1145/304182.304187 1999
[7]

Ben-David, U

S. Ben-David, U. von Luxburg, and D. Pál. A sober look at clustering stability. InProceedings of the 19th Annual Conference on Learning Theory (COLT), pp. 5–19. Springer, 2006. doi: 10.1007/11776420_4 1, 3

work page doi:10.1007/11776420_4 2006
[8]

D. M. Blei, A. Y . Ng, and M. I. Jordan. Latent Dirichlet allocation.Journal of Machine Learning Research, 3:993–1022, 2003. 12

2003
[9]

R. J. G. B. Campello, D. Moulavi, and J. Sander. Density-based clustering based on hierarchical density estimates. InAdvances in Knowledge Discov- ery and Data Mining (PAKDD), vol. 7819 ofLecture Notes in Computer Science, pp. 160–172. Springer, 2013. doi: 10.1007/978-3-642-37456 -2_14 3

work page doi:10.1007/978-3-642-37456 2013
[10]

Cashman, S

D. Cashman, S. R. Humayoun, F. Heber, K. Park, S. Das, and R. Chang. A user-based visual analytics workflow for exploratory model analysis. Computer Graphics Forum, 38(3):185–199, 2019. doi: 10.1111/cgf.13681 2

work page doi:10.1111/cgf.13681 2019
[11]

Cavallo and Ç

M. Cavallo and Ç. Demiralp. Clustrophile 2: Guided visual clustering analysis.IEEE Transactions on Visualization and Computer Graphics, 25(1):267–276, 2019. doi: 10.1109/TVCG.2018.2864477 2, 11

work page doi:10.1109/tvcg.2018.2864477 2019
[12]

J. Choo, C. Lee, C. K. Reddy, and H. Park. UTOPIAN: User-driven topic modeling based on interactive nonnegative matrix factorization.IEEE Transactions on Visualization and Computer Graphics, 19(12):1992–2001,

1992
[13]

doi: 10.1109/TVCG.2013.212 2, 11

work page doi:10.1109/tvcg.2013.212 2013
[14]

Chuang, C

J. Chuang, C. D. Manning, and J. Heer. Termite: Visualization techniques for assessing textual topic models. InProceedings of the International Working Conference on Advanced Visual Interfaces (AVI), pp. 74–77, 2012. doi: 10.1145/2254556.2254572 2

work page doi:10.1145/2254556.2254572 2012
[15]

Chuang, D

J. Chuang, D. Ramage, C. Manning, and J. Heer. Interpretation and trust: Designing model-driven visualizations for text analysis. InProceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 443–452. ACM, 2012. doi: 10.1145/2207676.2207738 2

work page doi:10.1145/2207676.2207738 2012
[16]

K. Cook, G. Grinstein, M. Whiting, M. Cooper, P. Havig, K. Liber et al. V AST challenge 2011. InProceedings of the IEEE Conference on Visual Analytics Science and Technology (VAST), pp. 299–300, 2011. doi: 10. 1109/V AST.2011.6102478 5

work page arXiv 2011
[17]

W. Cui, S. Liu, L. Tan, C. Shi, Y . Song, Z. Gao et al. TextFlow: Towards better understanding of evolving topics in text.IEEE Transactions on Visualization and Computer Graphics, 17(12):2412–2421, 2011. doi: 10. 1109/TVCG.2011.239 3

2011
[18]

El-Assady, R

M. El-Assady, R. Sevastjanova, F. Sperrle, D. Keim, and C. Collins. Progressive learning of topic modeling parameters: A visual analytics framework.IEEE Transactions on Visualization and Computer Graphics, 24(1):382–391, 2018. doi: 10.1109/TVCG.2017.2745080 2

work page doi:10.1109/tvcg.2017.2745080 2018
[19]

Espadoto, R

M. Espadoto, R. M. Martins, A. Kerren, N. S. T. Hirata, and A. C. Telea. Toward a quantitative survey of dimensionality reduction techniques.IEEE Transactions on Visualization and Computer Graphics, 27(3):2153–2173,
[20]

doi: 10.1109/TVCG.2019.2944182 3

work page doi:10.1109/tvcg.2019.2944182 2019
[21]

Ester, H.-P

M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise.Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD), pp. 226–231, 1996. 4

1996
[22]

Ferreira, J

N. Ferreira, J. Poco, H. T. V o, J. Freire, and C. T. Silva. Visual exploration of big spatio-temporal urban data: A study of New York City taxi trips. IEEE Transactions on Visualization and Computer Graphics, 19(12):2149– 2158, 2013. doi: 10.1109/TVCG.2013.226 3

work page doi:10.1109/tvcg.2013.226 2013
[23]

A. L. N. Fred and A. K. Jain. Combining multiple clusterings using evi- dence accumulation.IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6):835–850, 2005. doi: 10.1109/TPAMI.2005.113 2, 3

work page doi:10.1109/tpami.2005.113 2005
[24]

Gleicher, D

M. Gleicher, D. Albers, R. Walker, I. Jusufi, C. D. Hansen, and J. C. Roberts. Visual comparison for information visualization.Information Visualization, 10(4):289–309, 2011. doi: 10.1177/1473871611416549 2

work page doi:10.1177/1473871611416549 2011
[25]

BERTopic: Neural topic modeling with a class-based TF-IDF procedure

M. Grootendorst. BERTopic: Neural topic modeling with a class-based TF-IDF procedure.arXiv preprint arXiv:2203.05794, 2022. doi: 10. 48550/arXiv.2203.05794 3, 12

work page internal anchor Pith review Pith/arXiv arXiv 2022
[26]

D. Guo, J. Chen, A. MacEachren, and K. Liao. A visualization system for space-time and multivariate patterns (vis-stamp).IEEE Transactions on Visualization and Computer Graphics, 12(6):1461–1474, 2006. doi: 10. 1109/TVCG.2006.84 3

2006
[27]

C. Hennig. Cluster-wise assessment of cluster stability.Computational Statistics & Data Analysis, 52(1):258–271, 2007. doi: 10.1016/j.csda. 2006.11.025 2, 3

work page doi:10.1016/j.csda 2007
[28]

Huang, Y

H. Huang, Y . Wang, and C. Rudin. Navigating the effect of parametrization for dimensionality reduction. InProceedings of the 38th International Conference on Neural Information Processing Systems, NIPS ’24, art. no. 413, 43 pages. Curran Associates Inc., Red Hook, NY , USA, 2024. 5

2024
[29]

Isenberg, F

P. Isenberg, F. Heimerl, S. Koch, T. Isenberg, P. Xu, C. D. Stolper et al. vispubdata.org: A metadata collection about IEEE visualization (VIS) publications.IEEE Transactions on Visualization and Computer Graphics, 23(9):2199–2206, 2017. doi: 10.1109/TVCG.2016.2615308 7

work page doi:10.1109/tvcg.2016.2615308 2017
[30]

Isenberg, P

T. Isenberg, P. Isenberg, J. Chen, M. Sedlmair, and T. Möller. A systematic review on the practice of evaluating visualization.IEEE Transactions on Visualization and Computer Graphics, 19(12):2818–2827, Dec 2013. doi: 10.1109/TVCG.2013.126 12

work page doi:10.1109/tvcg.2013.126 2013
[31]

D. A. Keim, G. Andrienko, J.-D. Fekete, C. Görg, J. Kohlhammer, and G. Melançon. Visual analytics: Definition, process, and challenges. In Information Visualization: Human-Centered Issues and Perspectives, pp. 154–175. Springer, 2008. doi: 10.1007/978-3-540-70956-5_7 3

work page doi:10.1007/978-3-540-70956-5_7 2008
[32]

M. Kim, K. Kang, D. Park, J. Choo, and N. Elmqvist. TopicLens: Efficient multi-level visual topic exploration of large-scale document collections. IEEE Transactions on Visualization and Computer Graphics, 23(1):151– 160, 2017. doi: 10.1109/TVCG.2016.2598445 2

work page doi:10.1109/tvcg.2016.2598445 2017
[33]

Kosara, F

R. Kosara, F. Bendix, and H. Hauser. Parallel sets: Interactive exploration and visual analysis of categorical data.IEEE Transactions on Visualization and Computer Graphics, 12(4):558–568, 2006. doi: 10.1109/TVCG.2006. 76 3

work page doi:10.1109/tvcg.2006 2006
[34]

B. C. Kwon, B. Eysenbach, J. Verma, K. Ng, C. De Filippi, W. F. Stewart et al. ClusterVision: Visual supervision of unsupervised clustering.IEEE Transactions on Visualization and Computer Graphics, 24(1):142–151,
[35]

doi: 10.1109/TVCG.2017.2745085 2, 11

work page doi:10.1109/tvcg.2017.2745085 2017
[36]

H. Lam, E. Bertini, P. Isenberg, C. Plaisant, and S. Carpendale. Empirical studies in information visualization: Seven scenarios.IEEE Transactions on Visualization and Computer Graphics, 18(9):1520–1536, Sep. 2012. doi: 10.1109/TVCG.2011.279 12

work page doi:10.1109/tvcg.2011.279 2012
[37]

D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization.Nature, 401:788–791, 1999. doi: 10.1038/44565 4

work page doi:10.1038/44565 1999
[38]

McInnes, J

L. McInnes, J. Healy, and S. Astels. hdbscan: Hierarchical density based clustering.Journal of Open Source Software, 2(11):205, 2017. doi: 10. 21105/joss.00205 3, 4, 5

2017
[39]

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

L. McInnes, J. Healy, and J. Melville. UMAP: Uniform manifold ap- proximation and projection for dimension reduction.arXiv preprint arXiv:1802.03426, 2018. 5

work page internal anchor Pith review Pith/arXiv arXiv 2018
[40]

Monti, P

S. Monti, P. Tamayo, J. Mesirov, and T. Golub. Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data.Machine Learning, 52:91–118, 2003. doi: 10. 1023/A:1023949509487 2, 3

2003
[41]

L. G. Nonato and M. Aupetit. Multidimensional projection for visual ana- lytics: Linking techniques with distortions, tasks, and layout enrichment. IEEE Transactions on Visualization and Computer Graphics, 25(8):2650– 2673, 2019. doi: 10.1109/TVCG.2018.2846735 3

work page doi:10.1109/tvcg.2018.2846735 2019
[42]

Pister, P

A. Pister, P. Buono, J.-D. Fekete, C. Plaisant, and P. Valdivia. Integrating prior knowledge in mixed-initiative social network clustering.IEEE Transactions on Visualization and Computer Graphics, 27(2):1775–1785,
[43]

doi: 10.1109/TVCG.2020.3030347 2

work page doi:10.1109/tvcg.2020.3030347 2020
[44]

Riehmann, M

P. Riehmann, M. Hanfler, and B. Froehlich. Interactive Sankey diagrams. pp. 233–240, 2005. doi: 10.1109/INFVIS.2005.1532152 3

work page doi:10.1109/infvis.2005.1532152 2005
[45]

Röder, A

M. Röder, A. Both, and A. Hinneburg. Exploring the space of topic coherence measures. InProceedings of the Eighth ACM International Conference on Web Search and Data Mining (WSDM), pp. 399–408, 2015. doi: 10.1145/2684822.2685324 4

work page doi:10.1145/2684822.2685324 2015
[46]

Rosvall and C

M. Rosvall and C. T. Bergstrom. Mapping change in large networks.PLoS ONE, 5(1):e8694, 2010. doi: 10.1371/journal.pone.0008694 3

work page doi:10.1371/journal.pone.0008694 2010
[47]

P. J. Rousseeuw. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis.Journal of Computational and Applied Mathematics, 20:53–65, 1987. doi: 10.1016/0377-0427(87)90125-7 4

work page doi:10.1016/0377-0427(87)90125-7 1987
[48]

Sacha, A

D. Sacha, A. Stoffel, F. Stoffel, B. C. Kwon, G. Ellis, and D. A. Keim. Knowledge generation model for visual analytics.IEEE Transactions on Visualization and Computer Graphics, 20(12):1604–1613, 2014. doi: 10. 1109/TVCG.2014.2346481 3

work page arXiv 2014
[49]

Sedlmair, C

M. Sedlmair, C. Heinzl, S. Bruckner, H. Piringer, and T. Möller. Visual parameter space analysis: A conceptual framework.IEEE Transactions on Visualization and Computer Graphics, 20(12):2161–2170, 2014. doi: 10.1109/TVCG.2014.2346321 2

work page doi:10.1109/tvcg.2014.2346321 2014
[50]

Sedlmair, A

M. Sedlmair, A. Tatu, T. Munzner, and M. Tory. A taxonomy of visual cluster separation factors.Computer Graphics Forum, 31(3pt4):1335– 1344, 2012. doi: 10.1111/j.1467-8659.2012.03125.x 3

work page doi:10.1111/j.1467-8659.2012.03125.x 2012
[51]

Seo and B

J. Seo and B. Shneiderman. A rank-by-feature framework for interactive exploration of multidimensional data.Information Visualization, 4(2):96– 113, 2005. doi: 10.1057/palgrave.ivs.9500091 12

work page doi:10.1057/palgrave.ivs.9500091 2005
[52]

Sievert and K

C. Sievert and K. Shirley. LDAvis: A method for visualizing and inter- preting topics. InProceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, pp. 63–70, 2014. doi: 10.3115/ v1/W14-3110 2, 11

2014
[53]

C. D. Stolper, A. Perer, and D. Gotz. Progressive visual analytics: User- driven visual exploration of in-progress analytics.IEEE Transactions on Visualization and Computer Graphics, 20(12):1653–1662, 2014. doi: 10. 1109/TVCG.2014.2346574 2

work page arXiv 2014
[54]

Strehl and J

A. Strehl and J. Ghosh. Cluster ensembles — a knowledge reuse frame- work for combining multiple partitions.Journal of Machine Learning Research, 3:583–617, 2002. 2, 3

[55]

L. van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579–2605, 2008.

[56]

U. von Luxburg. Clustering stability: An overview. Foundations and Trends in Machine Learning, 2(3):235–274, 2010. doi: 10.1561/2200000008

[57]

Y. Wang, H. Huang, C. Rudin, and Y. Shaposhnik. Understanding how dimension reduction tools work: An empirical approach to deciphering t-SNE, UMAP, TriMap, and PaCMAP for data visualization. Journal of Machine Learning Research, 22(201):1–73, 2021.

[58]

Y. Wang, Y. Sun, H. Huang, and C. Rudin. Dimension reduction with locally adjusted graphs. In Proceedings of AAAI’25/IAAI’25/EAAI’25, art. no. 2382, 9 pages. AAAI Press, 2025. doi: 10.1609/aaai.v39i20.35436

[59]

J. Wenskovitch, I. Crandell, N. Ramakrishnan, L. House, S. Leman, and C. North. Towards a systematic combination of dimension reduction and clustering in visual analytics. IEEE Transactions on Visualization and Computer Graphics, 24(1):131–141, 2018. doi: 10.1109/TVCG.2017.2745258

[60]

Z. Yu, X. Li, P. Liu, and J. Tao. Parallel clusters: Visual comparison of embeddings based on multi-scale neighborhood analysis. IEEE Transactions on Visualization and Computer Graphics, 32(3):2758–2772, 2026. doi: 10.1109/TVCG.2026.3654590

