Recognition: 2 theorem links
FastUMAP: Scalable Dimensionality Reduction via Bipartite Landmark Sampling
Pith reviewed 2026-05-13 02:20 UTC · model grok-4.3
The pith
FastUMAP embeds large datasets in seconds by sampling landmarks and restricting the graph to point-landmark connections.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FastUMAP constructs a sparse bipartite fuzzy simplicial set between all points and a sampled landmark subset, computes a Nystrom extension of the landmark affinities to produce a spectral warm start, and optimizes all coordinates by minimizing a UMAP cross-entropy loss defined solely on the point-landmark edges. The landmark ratio r = m/n provides a single parameter that trades runtime against fidelity. On nine benchmarks spanning 178 to 70,000 points it records the lowest runtime on seven datasets in a default-implementation comparison.
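The construction can be sketched in a few lines of Python. This is an illustrative reading, not the paper's implementation: the Gaussian-style weighting stands in for UMAP's smooth-kNN calibration, and the dataset, k, and r values are arbitrary.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))    # toy data standing in for a real dataset
n = X.shape[0]

# Landmark ratio r = m/n: the single speed/fidelity knob described above.
r = 0.05
m = int(r * n)

# Uniform random landmark sampling (the paper's stated default).
landmark_idx = rng.choice(n, size=m, replace=False)
landmarks = X[landmark_idx]

# Sparse bipartite structure: each point keeps only its k nearest landmarks,
# so the graph has n * k edges instead of O(n^2) pairwise similarities.
k = 5
nn = NearestNeighbors(n_neighbors=k).fit(landmarks)
dists, cols = nn.kneighbors(X)     # both arrays have shape (n, k)

# Fuzzy edge weights from a simple Gaussian-style kernel on the distances --
# an illustrative stand-in for UMAP's actual membership calibration.
sigma = dists.mean(axis=1, keepdims=True) + 1e-12
weights = np.exp(-dists / sigma)
```

The point is the asymptotics: every similarity computation touches a landmark, so cost tracks m = r·n rather than n².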
What carries the argument
The bipartite point-landmark fuzzy graph, which restricts all similarity information to connections between points and landmarks and thereby enables both the Nystrom initialization and the subsequent efficient optimization.
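A minimal numpy sketch of the Nystrom warm start, assuming Gaussian affinities (an illustrative kernel choice, not necessarily the paper's): eigendecompose the small m-by-m landmark matrix, then extend the leading eigenvectors to all n points through the point-landmark affinities.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, d = 1000, 50, 2              # points, landmarks, embedding dims

# Toy data: Z are landmark coordinates sampled from the points X.
X = rng.normal(size=(n, 10))
Z = X[rng.choice(n, size=m, replace=False)]

def gaussian_affinity(P, Q, sigma=2.0):
    sq = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

W = gaussian_affinity(X, Z)        # (n, m) point-landmark affinities
A = gaussian_affinity(Z, Z)        # (m, m) landmark affinity matrix

# Eigendecompose only the small landmark matrix: cost depends on m, not n.
lam, U = np.linalg.eigh(A)
top = np.argsort(lam)[::-1][:d]    # leading eigenpairs

# Nystrom extension: approximate the leading eigenvectors of the full
# affinity at every point, used as the spectral warm start Y0.
Y0 = (W @ U[:, top]) / lam[top]
```

This is why the warm start is cheap: the O(m³) eigendecomposition and an (n × m) matrix product replace any operation on an n × n matrix.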
If this is right
- Analysts can rerun embeddings after changing preprocessing, subsets, or hyperparameters without incurring long delays.
- Runtime scales primarily with the number of landmarks rather than the full set of pairwise similarities.
- The single landmark-ratio parameter gives a continuous control over the speed-accuracy trade-off across datasets from hundreds to tens of thousands of points.
- The method achieves its reported runtimes on standard image and tabular benchmarks while remaining within a few percentage points of stronger but slower nonlinear baselines.
Where Pith is reading between the lines
- The bipartite approximation could be substituted into other graph-based manifold learners to improve their scalability for repeated runs.
- When data density varies sharply across regions, landmark selection may need to be density-aware to avoid under-representing sparse clusters.
- The reported speed gains suggest that interactive visualization tools could update embeddings in near real time by reusing the same landmark set across parameter sweeps.
Load-bearing premise
That sampling a modest number of landmarks and restricting the graph to point-landmark connections preserves enough local manifold structure for the subsequent UMAP-style optimization to produce faithful low-dimensional embeddings.
What would settle it
On a dataset with known nonlinear manifold structure, if mean kNN accuracy drops more than five percentage points below the accuracy of full UMAP while runtime remains much lower, the landmark approximation has lost critical neighborhood information.
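This criterion can be operationalized directly. In the sketch below, two PCA embeddings stand in for full UMAP and FastUMAP outputs, and `mean_knn_accuracy` is a hypothetical helper: the paper's exact evaluation protocol (k, folds, dataset) is not specified here.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

def mean_knn_accuracy(embedding, labels, k=10):
    """Cross-validated kNN classification accuracy in the embedded space."""
    clf = KNeighborsClassifier(n_neighbors=k)
    return cross_val_score(clf, embedding, labels, cv=5).mean()

# Stand-ins for the two embeddings under comparison; a real check would use
# full UMAP as the reference and FastUMAP as the candidate.
emb_reference = PCA(n_components=10).fit_transform(X)
emb_candidate = PCA(n_components=2).fit_transform(X)

gap = mean_knn_accuracy(emb_reference, y) - mean_knn_accuracy(emb_candidate, y)
criterion_failed = gap > 0.05   # the five-percentage-point threshold above
```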
Original abstract
Exploratory analysis of high-dimensional data rarely stops at a single embedding. In practice, analysts rerun dimensionality reduction after changing preprocessing, subsets, or hyperparameters, and standard nonlinear methods can quickly become the bottleneck. We introduce FastUMAP (Bipartite Manifold Approximation and Projection), a landmark-based method designed for this repeated-use setting. FastUMAP builds a sparse point-landmark fuzzy graph, computes a Nystrom spectral warm start from the induced landmark affinity, and then refines all sample coordinates with a UMAP-style objective on the bipartite graph. The landmark ratio r = m/n provides a direct way to trade runtime against fidelity. On 9 benchmark datasets spanning 178 to 70,000 samples, FastUMAP has the lowest runtime on 7 datasets in our reported default-implementation comparison on one workstation. On MNIST and Fashion-MNIST (n=70000), it runs in about 4.6 seconds, compared with about 73--75 seconds for Barnes--Hut t-SNE, while reaching 91.4% mean kNN accuracy versus 94.6% for the strongest accuracy baseline. FastUMAP is therefore best viewed as a fast option for repeated exploratory embedding, rather than as a replacement for accuracy-first methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces FastUMAP, a landmark-based approximation to UMAP for scalable dimensionality reduction in repeated exploratory settings. It constructs a sparse bipartite point-landmark fuzzy graph, applies a Nystrom spectral warm-start from the landmark affinity matrix, and refines all point coordinates via UMAP-style optimization on the bipartite graph. The single tunable parameter is the landmark ratio r = m/n. On nine benchmarks (n from 178 to 70,000), FastUMAP reports the lowest runtime on seven datasets; on MNIST and Fashion-MNIST it achieves ~4.6 s wall-clock time versus ~73-75 s for Barnes-Hut t-SNE while attaining 91.4% mean kNN accuracy versus 94.6% for the strongest baseline.
Significance. If the reported speed-accuracy trade-off holds under the stated experimental conditions, FastUMAP supplies a practical, tunable tool for iterative embedding workflows where analysts rerun dimensionality reduction after preprocessing or subset changes. The method re-uses standard techniques (bipartite sampling, Nystrom initialization) whose fidelity is demonstrated empirically rather than asserted universally. Concrete wall-clock numbers, direct baseline comparisons, and an explicit single free parameter (r) are strengths that support usability claims.
major comments (2)
- [§4.2, Table 3] kNN accuracy is reported only as a mean (91.4% vs. 94.6%); without per-run standard deviations, the number of random seeds, or a statistical test, it is impossible to judge whether the 3.2-point gap is distinguishable from noise and therefore whether the fidelity claim is robust.
- [§3.1] The landmark sampling procedure is stated to be random, yet no ablation or sensitivity analysis shows how alternative landmark selection (e.g., k-means or leverage-score sampling) alters downstream kNN accuracy or runtime on the same datasets; this directly affects the claim that r alone controls the runtime-fidelity trade-off.
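The first comment amounts to asking for a few extra lines of evaluation code. A sketch of mean-and-deviation reporting over seeds; here a deterministic PCA embedding with a seed-dependent train/test split stands in for repeated stochastic embedding runs.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
emb = PCA(n_components=2).fit_transform(X)   # toy stand-in embedding

def knn_accuracy(seed):
    """One evaluation run; the seed controls the train/test split."""
    Xtr, Xte, ytr, yte = train_test_split(
        emb, y, test_size=0.3, random_state=seed)
    return KNeighborsClassifier(n_neighbors=10).fit(Xtr, ytr).score(Xte, yte)

# Report accuracy as mean +/- sample standard deviation over seeds,
# alongside the number of runs, as the referee requests.
accs = np.array([knn_accuracy(s) for s in range(5)])
mean, std = accs.mean(), accs.std(ddof=1)
```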
minor comments (3)
- [Abstract, §4.1] Hardware, exact library versions, and compiler flags for all competing implementations (UMAP, t-SNE, etc.) should be stated explicitly so that the reported 15-fold speedups can be reproduced.
- [Figure 2] Axis labels and legend entries are too small for print; the color scale for the bipartite graph visualization is not defined in the caption.
- [§2] The notation for the fuzzy simplicial set construction on the bipartite graph reuses symbols from the original UMAP paper without redefinition; a short self-contained recap would improve readability.
Simulated Author's Rebuttal
We thank the referee for the thoughtful review and the recommendation of minor revision. We address each of the major comments point by point below.
Point-by-point responses
-
Referee: [§4.2, Table 3] kNN accuracy is reported only as a mean (91.4% vs. 94.6%); without per-run standard deviations, the number of random seeds, or a statistical test, it is impossible to judge whether the 3.2-point gap is distinguishable from noise and therefore whether the fidelity claim is robust.
Authors: We agree that the absence of variability measures makes it difficult to assess the robustness of the reported accuracy differences. To address this, we will update Table 3 and the text in §4.2 to include standard deviations for the kNN accuracy, based on multiple runs with different random seeds. We will also specify the number of runs performed. This revision will allow for a better evaluation of whether the observed gaps are statistically meaningful. revision: yes
-
Referee: [§3.1] The landmark sampling procedure is stated to be random, yet no ablation or sensitivity analysis shows how alternative landmark selection (e.g., k-means or leverage-score sampling) alters downstream kNN accuracy or runtime on the same datasets; this directly affects the claim that r alone controls the runtime-fidelity trade-off.
Authors: The manuscript presents FastUMAP with random landmark sampling as a deliberate design choice to provide a simple, single-parameter method suitable for repeated exploratory use. Alternative landmark selection strategies would require additional computational steps and hyperparameters, potentially undermining the speed advantage. We will revise the text in §3.1 to clarify that the trade-off controlled by r is for the random sampling approach described. This makes the scope of the claim explicit without requiring new experiments on other sampling methods, which fall outside the current method definition. revision: yes
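The sensitivity question the referee raises can be probed cheaply, even outside the method definition. A sketch comparing random landmarks against k-means centroids on a coverage proxy (mean distance from each point to its nearest landmark); the proxy, dataset, and m value are illustrative choices, not the paper's protocol.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)
n, m = X.shape[0], 90
rng = np.random.default_rng(0)

# Strategy 1: uniform random landmarks (the paper's default).
random_landmarks = X[rng.choice(n, size=m, replace=False)]

# Strategy 2: k-means centroids, one alternative the referee names.
kmeans = KMeans(n_clusters=m, n_init=4, random_state=0).fit(X)
kmeans_landmarks = kmeans.cluster_centers_

def coverage(landmarks):
    """Mean distance to the nearest landmark (lower = better coverage)."""
    return cdist(X, landmarks).min(axis=1).mean()

cov_random = coverage(random_landmarks)
cov_kmeans = coverage(kmeans_landmarks)
```

Density-aware strategies cost extra passes over the data, which is the speed trade-off the rebuttal points to.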
Circularity Check
No significant circularity detected
Full rationale
The paper defines FastUMAP explicitly as a sequence of construction steps: bipartite point-landmark fuzzy graph, Nystrom spectral warm-start from landmark affinities, followed by UMAP-style optimization on the bipartite graph, with landmark ratio r as a free user parameter. No central quantity (embedding, affinity, or objective) is defined in terms of itself or a fitted parameter that would render reported runtimes or kNN accuracies tautological. Claims rest on concrete empirical timings and accuracy proxies across nine datasets rather than on any self-referential derivation or load-bearing self-citation chain. The method is therefore self-contained as a practical approximation whose performance is measured externally.
Axiom & Free-Parameter Ledger
free parameters (1)
- landmark ratio r = m/n
axioms (1)
- domain assumption: UMAP fuzzy graph construction and the optimization objective remain valid when restricted to a bipartite point-landmark graph
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "FastUMAP builds a sparse point-landmark fuzzy graph, computes a Nystrom spectral warm start from the induced landmark affinity, and then refines all sample coordinates with a UMAP-style objective on the bipartite graph."
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "The landmark ratio r = m/n provides a direct way to trade runtime against fidelity."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.