pith. sign in

arxiv: 1610.00239 · v4 · pith:Y5J4EHYLnew · submitted 2016-10-02 · 🧮 math.MG · math.CO

Optimal compression of approximate inner products and dimension reduction

classification 🧮 math.MG math.CO
keywords varepsilonpointsadditivedimensiondistanceerrorfracproducts
0
0 comments X
read the original abstract

Let $X$ be a set of $n$ points of norm at most $1$ in the Euclidean space $R^k$, and suppose $\varepsilon>0$. An $\varepsilon$-distance sketch for $X$ is a data structure that, given any two points of $X$ enables one to recover the square of the (Euclidean) distance between them up to an {\em additive} error of $\varepsilon$. Let $f(n,k,\varepsilon)$ denote the minimum possible number of bits of such a sketch. Here we determine $f(n,k,\varepsilon)$ up to a constant factor for all $n \geq k \geq 1$ and all $\varepsilon \geq \frac{1}{n^{0.49}}$. Our proof is algorithmic, and provides an efficient algorithm for computing a sketch of size $O(f(n,k,\varepsilon)/n)$ for each point, so that the square of the distance between any two points can be computed from their sketches up to an additive error of $\varepsilon$ in time linear in the length of the sketches. We also discuss the case of smaller $\varepsilon>2/\sqrt n$ and obtain some new results about dimension reduction in this range. In particular, we show that for any such $\varepsilon$ and any $k \leq t=\frac{\log (2+\varepsilon^2 n)}{\varepsilon^2}$ there are configurations of $n$ points in $R^k$ that cannot be embedded in $R^{\ell}$ for $\ell < ck$ with $c$ a small absolute positive constant, without distorting some inner products (and distances) by more than $\varepsilon$. On the positive side, we provide a randomized polynomial time algorithm for a bipartite variant of the Johnson-Lindenstrauss lemma in which scalar products are approximated up to an additive error of at most $\varepsilon$. This variant allows a reduction of the dimension down to $O(\frac{\log (2+\varepsilon^2 n)}{\varepsilon^2})$, where $n$ is the number of points.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Semantic Concurrency Limits in Large Language Models

    quant-ph 2025-04 unverdicted novelty 5.0

    High-dimensional geometry imposes concurrency limits on semantic directions in LLM embeddings via residual interference, with N < exp(c d_eff ε²) for coexistence and σ_int ~ √(k/d_eff) for readout noise.