{"paper":{"title":"Dimensionality Reduction of Massive Sparse Datasets Using Coresets","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":[],"primary_cat":"cs.DS","authors_text":"Dan Feldman, Daniela Rus, Mikhail Volkov","submitted_at":"2015-03-05T15:39:49Z","abstract_excerpt":"In this paper we present a practical solution with performance guarantees to the problem of dimensionality reduction for very large scale sparse matrices. We show applications of our approach to computing the low rank approximation (reduced SVD) of such matrices. Our solution uses a coreset, which is a subset of $O(k/\\eps^2)$ scaled rows from the $n\\times d$ input matrix that approximates the sum of squared distances from its rows to every $k$-dimensional subspace in $\\REAL^d$, up to a factor of $1\\pm\\eps$. An open theoretical problem has been whether we can compute such a coreset that is inde"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"1503.01663","kind":"arxiv","version":1},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}