Is Dimensionality a Barrier for Retrieval Models?

Guy Bresler; Jonathan Kogan; Kiril Bangachev; Yury Polyanskiy

arxiv: 2605.23556 · v1 · pith:GQOTS5VHnew · submitted 2026-05-22 · 💻 cs.LG · cs.IR· math.CO

Is Dimensionality a Barrier for Retrieval Models?

Kiril Bangachev , Guy Bresler , Jonathan Kogan , Yury Polyanskiy This is my paper

Pith reviewed 2026-05-25 04:53 UTC · model grok-4.3

classification 💻 cs.LG cs.IRmath.CO

keywords retrieval embeddingsmaximal margindimensionalityinner-product separationrelevance matrixcommunication complexityInfoNCE loss

0 comments

The pith

The infinite-dimension maximal margin for any relevance matrix A is nearly achieved already in dimension O(m^{-2} log n).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether low embedding dimension limits the quality of retrieval models that scale to billions or trillions of items. It proves that for any 0-1 relevance matrix A, the largest margin m achievable with unlimited dimension can be matched up to small factors using dimension only O(m^{-2} log n). This improves an earlier bound and is shown to be tight for the case of all k-sparse query rows, where d = O(k log(n/k)) is both necessary and sufficient to reach the optimal margin of order 1/sqrt(k). The work also supplies constructions for margins when dimension falls below that threshold and compares sigmoid versus InfoNCE losses on synthetic data.

Core claim

For any relevance matrix A the quantity m^rd(+∞, A) can be nearly recovered by unit-norm embeddings whose dimension is only O(m^rd(+∞, A)^{-2} log n). In the special case where A contains every possible k-sparse row exactly once, dimension O(k log(n/k)) is necessary and sufficient to attain the optimal margin Θ(k^{-1/2}).

What carries the argument

The margin m^rd(d, A) is the largest value m such that there exist unit-norm query vectors U_j and document vectors V_i satisfying signed inner-product separation exactly according to the entries of A.

If this is right

For all-k-sparse queries the dimension O(k log(n/k)) is both necessary and sufficient to reach margin Θ(k^{-1/2}).
Modern embedding sizes around 1000 already suffice for near-optimal margins on data sets with trillions of documents under the inner-product model.
Explicit constructions exist that produce large margins even when dimension is o(k log(n/k)).
Sigmoid loss yields larger empirical margins than InfoNCE on the tested synthetic instances.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If margin governs robustness and compositional generalization, then training objectives that directly target signed inner-product separation should remain effective at moderate dimensions.
The dimension lower bound for the k-sparse case may be used to derive concrete sample-complexity requirements for learning retrieval representations.
The communication-complexity origin of the model suggests analogous dimension bounds could apply to other separation tasks such as nearest-neighbor search under Hamming or Euclidean distance.

Load-bearing premise

The retrieval model requires exact signed inner-product separation with a uniform margin between unit-norm vectors.

What would settle it

An explicit matrix A for which every embedding achieving margin within a constant factor of m^rd(+∞, A) requires dimension larger than C m^{-2} log n for any fixed C would falsify the main upper bound.

Figures

Figures reproduced from arXiv: 2605.23556 by Guy Bresler, Jonathan Kogan, Kiril Bangachev, Yury Polyanskiy.

**Figure 1.** Figure 1: Minimal dimension needed to achieve a non-zero margin after 100000 training steps for the InfoNCE and sigmoid losses. The sigmoid succeeds in much smaller dimensions. 20 40 60 80 100 120 140 160 180 200 220 240 n 0.00 0.05 0.10 0.15 0.20 0.25 0.30 m (margin) Largest positive margin achieved during training, k=2 dimension d=6 d=18 d=30 loss InfoNCE Sigmoid [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗

**Figure 3.** Figure 3: Minimal dimension needed to achieve a non-zero margin after 100000 training steps for the InfoNCE and sigmoid losses. The sigmoid succeeds in much smaller dimensions. 20 40 60 80 100 120 n 0.00 0.05 0.10 0.15 0.20 0.25 m (margin) Largest positive margin achieved during training, k=3 dimension d=10 d=20 d=30 loss InfoNCE Sigmoid [PITH_FULL_IMAGE:figures/full_fig_p027_3.png] view at source ↗

read the original abstract

Why does the low dimensionality of representations, typically $d\approx 1000$, not prevent modern embedding-based retrieval models from scaling to billions, or even trillions, of data points? To answer this question, we study maximal-margin embeddings in the following retrieval model, classically studied in communication complexity [PS86] and more recently in embedding-based retrieval [WBNL26]. Let $A\in \{0,1\}^{N\times n}$ be a matrix indicating whether each of $N$ queries is relevant to each of $n$ documents. We are interested in the largest margin $m>0,$ denoted by $\mathsf{m}^{\mathsf{rd}}(d, A),$ for which there exist unit norm embeddings of the queries and documents $\{U_j\}_{j = 1}^N, \{V_i\}_{i = 1}^n$ with the following property. $\langle U_j, V_i\rangle \ge m$ whenever $A_{ji} = 1$ and $\langle U_j, V_i\rangle \le -m$ otherwise. A large margin is a key proxy for representation quality: it controls both robustness to perturbations and compositional generalization across queries. Our main theorem establishes that the best possible margin without a restriction on the dimension, $\mathsf{m}^{\mathsf{rd}}(+\infty, A),$ can be nearly achieved in dimension $d = O(\mathsf{m}^{\mathsf{rd}}(+\infty, A)^{-2}\log n)$ which improves a theorem of [BDES02]. Together with a matching lower bound in Theorem 1.5, we conclude that when $A\in \{0,1\}^{\binom{n}{k}\times n}$ is the matrix containing all possible $k$-sparse rows once, dimension $d = O(k\log (n/k))$ is necessary and sufficient for the maximal possible margin $\mathsf{m}^{\mathsf{rd}}(+\infty, A) = \Theta(k^{-1/2})$ in this setting. This fully resolves the setup of [WBNL26]. We also give several constructions for large margins when $d = o(k\log (n/k)).$ Finally, we empirically test the InfoNCE and sigmoid losses for producing large margin embeddings and demonstrate a clear advantage of the sigmoid loss.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper tightens the dimension needed for near-optimal margins in the classical inner-product retrieval model and pins down the exact requirement for the all-k-sparse case.

read the letter

The core contribution is an improved dimension reduction result: any margin m achievable in infinite dimensions can be recovered up to small loss in dimension d = O(m^{-2} log n). This beats the earlier bound in BDES02. For the specific matrix of all k-sparse rows they also prove a matching lower bound, showing that d = O(k log(n/k)) is necessary and sufficient to reach the infinite-dimensional margin of Theta(1/sqrt(k)). That part fully closes the setup left open in WBNL26. They add some explicit constructions that work below that dimension threshold and a short empirical comparison of InfoNCE versus sigmoid loss on margin size. The theoretical claims line up with standard random-projection style arguments and the lower bound is constructed directly for the sparse-row matrix, so the tightness result is on solid ground. The empirical section is mentioned only in passing, so its weight is limited without the actual numbers or controls. The modeling assumptions are the usual signed inner-product separation from the communication-complexity literature, which is consistent with the papers they cite and does not introduce circularity. Overall the work is a clean resolution of one concrete scaling question rather than a broad new framework. It is aimed at researchers who care about the theoretical limits of embedding dimension in retrieval and recommendation systems. A reader already following the WBNL26 line of work will find the tight bounds useful; others may skim the constructions. The paper is coherent on its own terms and deserves a serious referee.

Referee Report

0 major / 3 minor

Summary. The paper studies maximal-margin retrieval embeddings for a 0-1 relevance matrix A, where unit-norm query and document vectors must satisfy signed inner-product separation of margin m. It proves that the infinite-dimensional optimum m^rd(+∞, A) can be recovered up to (1-o(1)) factors in dimension d = O(m^{-2} log n), improving the dimension bound of [BDES02]. For the all-k-sparse query matrix it supplies a matching lower bound showing that d = O(k log(n/k)) is necessary and sufficient to achieve the optimal margin Θ(k^{-1/2}). Additional constructions are given for d = o(k log(n/k)) and an empirical comparison of InfoNCE versus sigmoid loss is reported.

Significance. If the stated theorems hold, the work supplies a tight, parameter-free characterization of the dimension needed for optimal-margin retrieval embeddings and resolves the open question posed in [WBNL26]. The dimension-reduction result is obtained via an explicit construction together with a matching lower bound for the sparse-query case; the empirical section provides a concrete, falsifiable comparison between two standard losses. These elements together give a self-contained theoretical and practical account of why modest embedding dimensions suffice at scale.

minor comments (3)

[Abstract] Abstract: the statement that the sigmoid loss shows 'a clear advantage' is not accompanied by any numerical values or statistical test; the claim should be supported by the specific margins or loss curves reported in the experimental section.
[Abstract] The notation m^rd(d, A) is introduced without an explicit reference to the section containing its formal definition; a forward pointer would improve readability.
[§1] The improvement over [BDES02] is stated only in the abstract; the introduction or related-work section should contain a one-sentence comparison of the new dimension exponent versus the prior bound.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our manuscript, the accurate summary of our results on dimension bounds for maximal-margin retrieval embeddings, and the recommendation for minor revision. The report correctly identifies the resolution of the open question from [WBNL26] via matching upper and lower bounds. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; mathematical theorems are self-contained

full rationale

The paper's core claims consist of mathematical theorems establishing dimension bounds for achieving near-optimal margins m^rd(∞, A) via constructions that improve on the cited result of [BDES02], together with a matching lower bound for the k-sparse case. These rest on the classical inner-product retrieval model from [PS86] and [WBNL26] but do not reduce any claimed margin or dimension to a fitted quantity, self-definition, or load-bearing self-citation chain. The derivation is independent of the present paper's own inputs and is externally verifiable as a margin-preserving embedding result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper relies on the standard communication-complexity retrieval model and the existence of optimal infinite-dimension embeddings; no free parameters or new entities are introduced.

axioms (1)

domain assumption Unit-norm query and document embeddings exist that realize the optimal margin m^rd(+∞, A) for any fixed relevance matrix A
This defines the target margin that the finite-dimension construction must approximate.

pith-pipeline@v0.9.0 · 5967 in / 1272 out tokens · 32536 ms · 2026-05-25T04:53:43.667922+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Our main theorem establishes that the best possible margin without a restriction on the dimension, m^rd(+∞, A), can be nearly achieved in dimension d = O(m^rd(+∞, A)^{-2} log n) which improves a theorem of [BDES02].
IndisputableMonolith/Foundation/DimensionForcing.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Our proofs are based on connections between this problem and the literature on compressed sensing and the restricted isometry property.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

182 extracted references · 182 canonical work pages · 13 internal anchors

[1]

Combinatorics, Probability and Computing , volume=

Learning complexity vs communication complexity , author=. Combinatorics, Probability and Computing , volume=. 2009 , publisher=

work page 2009
[2]

International conference on machine learning , pages=

A simple framework for contrastive learning of visual representations , author=. International conference on machine learning , pages=. 2020 , organization=

work page 2020
[3]

Proceedings of the IEEE/CVF international conference on computer vision , pages=

With a little help from my friends: Nearest-neighbor contrastive learning of visual representations , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=

work page
[4]

Machine Learning , volume=

An algorithmic theory of learning: Robust concepts and random projection , author=. Machine Learning , volume=. 2006 , publisher=

work page 2006
[5]

Proceedings of the Symposium on the Mathematical Theory of Automata , volume=

On convergence proofs on perceptrons , author=. Proceedings of the Symposium on the Mathematical Theory of Automata , volume=. 1962 , organization=

work page 1962
[6]

Machine Learning , volume=

Kernels as features: On kernels, margins, and low-dimensional mappings , author=. Machine Learning , volume=. 2006 , publisher=

work page 2006
[7]

IEEE Transactions on Information Theory , volume=

Compressed sensing , author=. IEEE Transactions on Information Theory , volume=. 2006 , publisher=

work page 2006
[8]

IEEE Transactions on Big Data , volume=

Billion-scale similarity search with GPUs , author=. IEEE Transactions on Big Data , volume=. 2019 , publisher=

work page 2019
[9]

IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=

Product Quantization for Nearest Neighbor Search , author=. IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=. 2011 , publisher=

work page 2011
[10]

IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=

Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs , author=. IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=. 2020 , publisher=

work page 2020
[11]

DINOv3

Dinov3 , author=. arXiv preprint arXiv:2508.10104 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[12]

Laurent and P

B. Laurent and P. Massart , title =. The Annals of Statistics , number =. 2000 , doi =

work page 2000
[13]

Lower bounds in communication complexity and learning theory via analytic methods , author=

work page
[14]

Geometrical realization of set systems and probabilistic communication complexity , booktitle =

Noga Alon and Peter Frankl and Vojt. Geometrical realization of set systems and probabilistic communication complexity , booktitle =. 1985 , publisher =

work page 1985
[15]

Estimating the Optimal Margins of Embeddings in Euclidean Half Spaces , journal =

J. Estimating the Optimal Margins of Embeddings in Euclidean Half Spaces , journal =. 2003 , doi =

work page 2003
[16]

Machine Learning , volume =

Corinna Cortes and Vladimir Vapnik , title =. Machine Learning , volume =. 1995 , doi =

work page 1995
[17]

Bartlett , title =

Peter L. Bartlett , title =. IEEE Transactions on Information Theory , volume =

work page
[18]

2018 , issue_date =

Barman, Siddharth , title =. 2018 , issue_date =. doi:10.1137/15M1050574 , journal =

work page doi:10.1137/15m1050574 2018
[19]

Conference on Learning Theory , pages=

Approximate is good enough: Probabilistic variants of dimensional and margin complexity , author=. Conference on Learning Theory , pages=. 2020 , organization=

work page 2020
[20]

Schapire and Yoav Freund and Peter Bartlett and Wee Sun Lee , title =

Robert E. Schapire and Yoav Freund and Peter Bartlett and Wee Sun Lee , title =. The Annals of Statistics , volume =. 1998 , doi =

work page 1998
[21]

, title =

Freund, Yoav and Schapire, Robert E. , title =. Machine Learning , volume =. 1999 , doi =

work page 1999
[22]

Annals of the Institute of Statistical Mathematics , volume=

Some geometric applications of the beta distribution , author=. Annals of the Institute of Statistical Mathematics , volume=. 1990 , publisher=

work page 1990
[23]

Bartlett and Dylan J

Peter L. Bartlett and Dylan J. Foster and Matus J. Telgarsky , title =. Advances in Neural Information Processing Systems 30 , editor =

work page
[24]

Contemporary mathematics , volume=

Extensions of Lipschitz mappings into a Hilbert space , author=. Contemporary mathematics , volume=

work page
[25]

Journal of Machine Learning Research , volume=

Limitations of learning via embeddings in Euclidean half spaces , author=. Journal of Machine Learning Research , volume=

work page
[26]

Conference on Learning Theory , pages=

Sign rank versus VC dimension , author=. Conference on Learning Theory , pages=. 2016 , organization=

work page 2016
[27]

1996 , isbn =

Kushilevitz, Eyal and Nisan, Noam , title =. 1996 , isbn =

work page 1996
[28]

Journal of Machine Learning Research , volume =

Daniel Soudry and Elad Hoffer and Mor Shpigel Nacson and Suriya Gunasekar and Nathan Srebro , title =. Journal of Machine Learning Research , volume =. 2018 , url =

work page 2018
[29]

Robust Large Margin Deep Neural Networks , journal =

Jure Sokoli. Robust Large Margin Deep Neural Networks , journal =. 2017 , doi =

work page 2017
[30]

Proceedings of the 55th Annual ACM Symposium on Theory of Computing , pages=

Local and global expansion in random geometric graphs , author=. Proceedings of the 55th Annual ACM Symposium on Theory of Computing , pages=

work page
[31]

An exponential improvement for Ramsey lower bounds

An exponential improvement for Ramsey lower bounds , author=. arXiv preprint arXiv:2507.12926 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[32]

Communications on Pure and Applied Mathematics , volume=

Explicit Construction of RIP Matrices Is Ramsey-Hard , author=. Communications on Pure and Applied Mathematics , volume=. 2020 , publisher=

work page 2020
[33]

Elsayed and Dilip Krishnan and Hossein Mobahi and Kevin Regan and Samy Bengio , title =

Gamaleldin F. Elsayed and Dilip Krishnan and Hossein Mobahi and Kevin Regan and Samy Bengio , title =. Advances in Neural Information Processing Systems 31 , editor =

work page
[34]

Machine Learning , volume =

Maria-Florina Balcan and Avrim Blum and Nathan Srebro , title =. Machine Learning , volume =. 2008 , doi =

work page 2008
[35]

Weinberger and Lawrence K

Kilian Q. Weinberger and Lawrence K. Saul , title =. Journal of Machine Learning Research , volume =. 2009 , url =

work page 2009
[36]

2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) , volume =

Raia Hadsell and Sumit Chopra and Yann LeCun , title =. 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) , volume =. 2006 , doi =

work page 2006
[37]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pages =

Florian Schroff and Dmitry Kalenichenko and James Philbin , title =. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pages =. 2015 , doi =

work page 2015
[38]

Proceedings of the 33rd International Conference on Machine Learning , series =

Weiyang Liu and Yandong Wen and Zhiding Yu and Meng Yang , title =. Proceedings of the 33rd International Conference on Machine Learning , series =

work page
[39]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

Jiankang Deng and Jia Guo and Niannan Xue and Stefanos Zafeiriou , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

work page
[40]

arXiv preprint arXiv:2408.00995 , year =

Kiril Bangachev and Guy Bresler , title =. arXiv preprint arXiv:2408.00995 , year =

work page arXiv
[41]

J. H. Conway and N. J. A. Sloane , title =. 1999 , edition =

work page 1999
[42]

Emmanuel J. Cand. Decoding by Linear Programming , journal =

work page
[43]

2013 , publisher =

Simon Foucart and Holger Rauhut , title =. 2013 , publisher =

work page 2013
[44]

Fiduccia and Edward R

Charles M. Fiduccia and Edward R. Scheinerman and Ann N. Trenk and Jennifer S. Zito , title =. Discrete Mathematics , volume =

work page
[45]

A Linear Lower Bound on the Unbounded Error Probabilistic Communication Complexity , journal =

J. A Linear Lower Bound on the Unbounded Error Probabilistic Communication Complexity , journal =

work page
[46]

Dense Passage Retrieval for Open-Domain Question Answering , booktitle =

Vladimir Karpukhin and Barlas O. Dense Passage Retrieval for Open-Domain Question Answering , booktitle =

work page
[47]

Journal of Functional Analysis , volume =

Bo'az Klartag and Shahar Mendelson , title =. Journal of Functional Analysis , volume =

work page
[48]

What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank

Stefan Neumann and Rainer Gemulla and Pauli Miettinen , title =. arXiv preprint arXiv:1609.05034 , year =

work page internal anchor Pith review Pith/arXiv arXiv
[50]

Transactions on Machine Learning Research , year =

Maxime Oquab and others , title =. Transactions on Machine Learning Research , year =

work page
[51]

Proceedings of the 41st International Conference on Machine Learning (ICML) , pages =

Kiho Park and Yo Joong Choe and Victor Veitch , title =. Proceedings of the 41st International Conference on Machine Learning (ICML) , pages =

work page
[52]

Journal of Computer and System Sciences , volume =

Ramamohan Paturi and Janos Simon , title =. Journal of Computer and System Sciences , volume =

work page
[53]

Proceedings of the 38th International Conference on Machine Learning (ICML) , pages =

Alec Radford and Jong Wook Kim and Chris Hallacy and Aditya Ramesh and Gabriel Goh and Sandhini Agarwal and Girish Sastry and Amanda Askell and Pamela Mishkin and Jack Clark and Gretchen Krueger and Ilya Sutskever , title =. Proceedings of the 38th International Conference on Machine Learning (ICML) , pages =

work page
[54]

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP-IJCNLP) , pages =

Nils Reimers and Iryna Gurevych , title =. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP-IJCNLP) , pages =

work page 2019
[55]

Sherstov , title =

Alexander A. Sherstov , title =. Computational Complexity , volume =

work page
[56]

Constructive approximation , volume=

A simple proof of the restricted isometry property for random matrices , author=. Constructive approximation , volume=. 2008 , publisher=

work page 2008
[57]

Representation Learning with Contrastive Predictive Coding

Representation learning with contrastive predictive coding , author=. arXiv preprint arXiv:1807.03748 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[58]

Roman Vershynin , title =

work page
[59]

arXiv preprint arXiv:2601.20844 , year =

Zihao Wang and Hang Yin and Lihui Liu and Hanghang Tong and Yangqiu Song and Ginny Wong and Simon See , title =. arXiv preprint arXiv:2601.20844 , year =

work page arXiv
[60]

Young and Edward R

Stephen J. Young and Edward R. Scheinerman , title =. Lecture Notes in Computer Science , volume =

work page
[61]

Sigmoid Loss for Language Image Pre-Training

Xiaohua Zhai and Basil Mustafa and Alexander Kolesnikov and Lucas Beyer , title =. arXiv preprint arXiv:2303.15343 , year =

work page internal anchor Pith review Pith/arXiv arXiv
[62]

Proceedings of the 57th annual meeting of the association for computational linguistics , pages=

Latent retrieval for weakly supervised open domain question answering , author=. Proceedings of the 57th annual meeting of the association for computational linguistics , pages=

work page
[63]

Proceedings of the 37th International Conference on Machine Learning , articleno =

Wang, Tongzhou and Isola, Phillip , title =. Proceedings of the 37th International Conference on Machine Learning , articleno =. 2020 , publisher =

work page 2020
[64]

Journal of Approximation Theory , volume=

Optimal recovery and n-widths for convex classes of functions , author=. Journal of Approximation Theory , volume=. 1995 , publisher=

work page 1995
[65]

Xi Chen and Xiao Wang and Soravit Changpinyo and AJ Piergiovanni and Piotr Padlewski and Daniel Salz and Sebastian Goodman and Adam Grycner and Basil Mustafa and Lucas Beyer and Alexander Kolesnikov and Joan Puigcerver and Nan Ding and Keran Rong and Hassan Akbari and Gaurav Mishra and Linting Xue and Ashish V Thapliyal and James Bradbury and Weicheng Kuo...

work page 2023
[66]

Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles

Noroozi, Mehdi and Favaro, Paolo. Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles. Computer Vision -- ECCV 2016. 2016

work page 2016
[67]

Wyner, A. D. , title =. Bell System Technical Journal , volume =. doi:https://doi.org/10.1002/j.1538-7305.1968.tb00062.x , year =

work page doi:10.1002/j.1538-7305.1968.tb00062.x 1968
[68]

Proceedings of the Glasgow Mathematical Association , author=

The Closest Packing of Spherical Caps in n Dimensions , volume=. Proceedings of the Glasgow Mathematical Association , author=. 1955 , pages=. doi:10.1017/S2040618500033219 , number=

work page doi:10.1017/s2040618500033219 1955
[69]

, title =

Shannon, Claude E. , title =. Bell System Technical Journal , volume =. doi:https://doi.org/10.1002/j.1538-7305.1959.tb03905.x , year =

work page doi:10.1002/j.1538-7305.1959.tb03905.x 1959
[70]

On the origin of number and arrangement of the places of exit on the surface of pollen-grains

Tammes, Pieter Merkus Lambertus. On the origin of number and arrangement of the places of exit on the surface of pollen-grains. 1930

work page 1930
[71]

Sphere packings, lattices and groups , volume =

Conway, J.H and Sloane, N.J.A and Bannai, E and Borcherds, R.E and Leech, J and Norton, S.P and Odlyzko, A.M and Parker, R.A and Queen, L and Venkov, B.B , edition =. Sphere packings, lattices and groups , volume =

work page
[72]

Zhai, Xiaohua and Mustafa, Basil and Kolesnikov, Alexander and Beyer, Lucas , year =

work page
[73]

A Theoretical Analysis of Contrastive Unsupervised Representation Learning , booktitle =

Nikunj Saunshi and Orestis Plevrakis and Sanjeev Arora and Mikhail Khodak and Hrishikesh Khandeparkar , editor =. A Theoretical Analysis of Contrastive Unsupervised Representation Learning , booktitle =. 2019 , url =

work page 2019
[74]

CoRR , volume =

Anna Van Elst and Debarghya Ghoshdastidar , title =. CoRR , volume =. 2024 , url =. doi:10.48550/ARXIV.2412.03486 , eprinttype =. 2412.03486 , timestamp =

work page doi:10.48550/arxiv.2412.03486 2024
[75]

Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval , pages =

Srinivasan, Krishna and Raman, Karthik and Chen, Jiecao and Bendersky, Michael and Najork, Marc , title =. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval , pages =. 2021 , isbn =. doi:10.1145/3404835.3463257 , abstract =

work page doi:10.1145/3404835.3463257 2021
[76]

International Conference on Learning Representations , year=

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval , author=. International Conference on Learning Representations , year=

work page
[77]

2019 , eprint=

Representation Learning with Contrastive Predictive Coding , author=. 2019 , eprint=

work page 2019
[78]

Yiwei Lu and Guojun Zhang and Sun Sun and Hongyu Guo and Yaoliang Yu , journal=. \ f\ -. 2023 , url=

work page 2023
[79]

2020 , isbn =

Khattab, Omar and Zaharia, Matei , title =. 2020 , isbn =. doi:10.1145/3397271.3401075 , booktitle =

work page doi:10.1145/3397271.3401075 2020
[80]

Journal of Computer and System Sciences , volume=

Probabilistic communication complexity , author=. Journal of Computer and System Sciences , volume=. 1986 , publisher=

work page 1986
[81]

Random Structures & Algorithms , volume=

Testing for high-dimensional geometry in random graphs , author=. Random Structures & Algorithms , volume=. 2016 , publisher=

work page 2016

Showing first 80 references.

[1] [1]

Combinatorics, Probability and Computing , volume=

Learning complexity vs communication complexity , author=. Combinatorics, Probability and Computing , volume=. 2009 , publisher=

work page 2009

[2] [2]

International conference on machine learning , pages=

A simple framework for contrastive learning of visual representations , author=. International conference on machine learning , pages=. 2020 , organization=

work page 2020

[3] [3]

Proceedings of the IEEE/CVF international conference on computer vision , pages=

With a little help from my friends: Nearest-neighbor contrastive learning of visual representations , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=

work page

[4] [4]

Machine Learning , volume=

An algorithmic theory of learning: Robust concepts and random projection , author=. Machine Learning , volume=. 2006 , publisher=

work page 2006

[5] [5]

Proceedings of the Symposium on the Mathematical Theory of Automata , volume=

On convergence proofs on perceptrons , author=. Proceedings of the Symposium on the Mathematical Theory of Automata , volume=. 1962 , organization=

work page 1962

[6] [6]

Machine Learning , volume=

Kernels as features: On kernels, margins, and low-dimensional mappings , author=. Machine Learning , volume=. 2006 , publisher=

work page 2006

[7] [7]

IEEE Transactions on Information Theory , volume=

Compressed sensing , author=. IEEE Transactions on Information Theory , volume=. 2006 , publisher=

work page 2006

[8] [8]

IEEE Transactions on Big Data , volume=

Billion-scale similarity search with GPUs , author=. IEEE Transactions on Big Data , volume=. 2019 , publisher=

work page 2019

[9] [9]

IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=

Product Quantization for Nearest Neighbor Search , author=. IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=. 2011 , publisher=

work page 2011

[10] [10]

IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=

Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs , author=. IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=. 2020 , publisher=

work page 2020

[11] [11]

DINOv3

Dinov3 , author=. arXiv preprint arXiv:2508.10104 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[12] [12]

Laurent and P

B. Laurent and P. Massart , title =. The Annals of Statistics , number =. 2000 , doi =

work page 2000

[13] [13]

Lower bounds in communication complexity and learning theory via analytic methods , author=

work page

[14] [14]

Geometrical realization of set systems and probabilistic communication complexity , booktitle =

Noga Alon and Peter Frankl and Vojt. Geometrical realization of set systems and probabilistic communication complexity , booktitle =. 1985 , publisher =

work page 1985

[15] [15]

Estimating the Optimal Margins of Embeddings in Euclidean Half Spaces , journal =

J. Estimating the Optimal Margins of Embeddings in Euclidean Half Spaces , journal =. 2003 , doi =

work page 2003

[16] [16]

Machine Learning , volume =

Corinna Cortes and Vladimir Vapnik , title =. Machine Learning , volume =. 1995 , doi =

work page 1995

[17] [17]

Bartlett , title =

Peter L. Bartlett , title =. IEEE Transactions on Information Theory , volume =

work page

[18] [18]

2018 , issue_date =

Barman, Siddharth , title =. 2018 , issue_date =. doi:10.1137/15M1050574 , journal =

work page doi:10.1137/15m1050574 2018

[19] [19]

Conference on Learning Theory , pages=

Approximate is good enough: Probabilistic variants of dimensional and margin complexity , author=. Conference on Learning Theory , pages=. 2020 , organization=

work page 2020

[20] [20]

Schapire and Yoav Freund and Peter Bartlett and Wee Sun Lee , title =

Robert E. Schapire and Yoav Freund and Peter Bartlett and Wee Sun Lee , title =. The Annals of Statistics , volume =. 1998 , doi =

work page 1998

[21] [21]

, title =

Freund, Yoav and Schapire, Robert E. , title =. Machine Learning , volume =. 1999 , doi =

work page 1999

[22] [22]

Annals of the Institute of Statistical Mathematics , volume=

Some geometric applications of the beta distribution , author=. Annals of the Institute of Statistical Mathematics , volume=. 1990 , publisher=

work page 1990

[23] [23]

Bartlett and Dylan J

Peter L. Bartlett and Dylan J. Foster and Matus J. Telgarsky , title =. Advances in Neural Information Processing Systems 30 , editor =

work page

[24] [24]

Contemporary mathematics , volume=

Extensions of Lipschitz mappings into a Hilbert space , author=. Contemporary mathematics , volume=

work page

[25] [25]

Journal of Machine Learning Research , volume=

Limitations of learning via embeddings in Euclidean half spaces , author=. Journal of Machine Learning Research , volume=

work page

[26] [26]

Conference on Learning Theory , pages=

Sign rank versus VC dimension , author=. Conference on Learning Theory , pages=. 2016 , organization=

work page 2016

[27] [27]

1996 , isbn =

Kushilevitz, Eyal and Nisan, Noam , title =. 1996 , isbn =

work page 1996

[28] [28]

Journal of Machine Learning Research , volume =

Daniel Soudry and Elad Hoffer and Mor Shpigel Nacson and Suriya Gunasekar and Nathan Srebro , title =. Journal of Machine Learning Research , volume =. 2018 , url =

work page 2018

[29] [29]

Robust Large Margin Deep Neural Networks , journal =

Jure Sokoli. Robust Large Margin Deep Neural Networks , journal =. 2017 , doi =

work page 2017

[30] [30]

Proceedings of the 55th Annual ACM Symposium on Theory of Computing , pages=

Local and global expansion in random geometric graphs , author=. Proceedings of the 55th Annual ACM Symposium on Theory of Computing , pages=

work page

[31] [31]

An exponential improvement for Ramsey lower bounds

An exponential improvement for Ramsey lower bounds , author=. arXiv preprint arXiv:2507.12926 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[32] [32]

Communications on Pure and Applied Mathematics , volume=

Explicit Construction of RIP Matrices Is Ramsey-Hard , author=. Communications on Pure and Applied Mathematics , volume=. 2020 , publisher=

work page 2020

[33] [33]

Elsayed and Dilip Krishnan and Hossein Mobahi and Kevin Regan and Samy Bengio , title =

Gamaleldin F. Elsayed and Dilip Krishnan and Hossein Mobahi and Kevin Regan and Samy Bengio , title =. Advances in Neural Information Processing Systems 31 , editor =

work page

[34] [34]

Machine Learning , volume =

Maria-Florina Balcan and Avrim Blum and Nathan Srebro , title =. Machine Learning , volume =. 2008 , doi =

work page 2008

[35] [35]

Weinberger and Lawrence K

Kilian Q. Weinberger and Lawrence K. Saul , title =. Journal of Machine Learning Research , volume =. 2009 , url =

work page 2009

[36] [36]

2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) , volume =

Raia Hadsell and Sumit Chopra and Yann LeCun , title =. 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) , volume =. 2006 , doi =

work page 2006

[37] [37]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pages =

Florian Schroff and Dmitry Kalenichenko and James Philbin , title =. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , pages =. 2015 , doi =

work page 2015

[38] [38]

Proceedings of the 33rd International Conference on Machine Learning , series =

Weiyang Liu and Yandong Wen and Zhiding Yu and Meng Yang , title =. Proceedings of the 33rd International Conference on Machine Learning , series =

work page

[39] [39]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

Jiankang Deng and Jia Guo and Niannan Xue and Stefanos Zafeiriou , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , year =

work page

[40] [40]

arXiv preprint arXiv:2408.00995 , year =

Kiril Bangachev and Guy Bresler , title =. arXiv preprint arXiv:2408.00995 , year =

work page arXiv

[41] [41]

J. H. Conway and N. J. A. Sloane , title =. 1999 , edition =

work page 1999

[42] [42]

Emmanuel J. Cand. Decoding by Linear Programming , journal =

work page

[43] [43]

2013 , publisher =

Simon Foucart and Holger Rauhut , title =. 2013 , publisher =

work page 2013

[44] [44]

Fiduccia and Edward R

Charles M. Fiduccia and Edward R. Scheinerman and Ann N. Trenk and Jennifer S. Zito , title =. Discrete Mathematics , volume =

work page

[45] [45]

A Linear Lower Bound on the Unbounded Error Probabilistic Communication Complexity , journal =

J. A Linear Lower Bound on the Unbounded Error Probabilistic Communication Complexity , journal =

work page

[46] [46]

Dense Passage Retrieval for Open-Domain Question Answering , booktitle =

Vladimir Karpukhin and Barlas O. Dense Passage Retrieval for Open-Domain Question Answering , booktitle =

work page

[47] [47]

Journal of Functional Analysis , volume =

Bo'az Klartag and Shahar Mendelson , title =. Journal of Functional Analysis , volume =

work page

[48] [48]

What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank

Stefan Neumann and Rainer Gemulla and Pauli Miettinen , title =. arXiv preprint arXiv:1609.05034 , year =

work page internal anchor Pith review Pith/arXiv arXiv

[49] [50]

Transactions on Machine Learning Research , year =

Maxime Oquab and others , title =. Transactions on Machine Learning Research , year =

work page

[50] [51]

Proceedings of the 41st International Conference on Machine Learning (ICML) , pages =

Kiho Park and Yo Joong Choe and Victor Veitch , title =. Proceedings of the 41st International Conference on Machine Learning (ICML) , pages =

work page

[51] [52]

Journal of Computer and System Sciences , volume =

Ramamohan Paturi and Janos Simon , title =. Journal of Computer and System Sciences , volume =

work page

[52] [53]

Proceedings of the 38th International Conference on Machine Learning (ICML) , pages =

Alec Radford and Jong Wook Kim and Chris Hallacy and Aditya Ramesh and Gabriel Goh and Sandhini Agarwal and Girish Sastry and Amanda Askell and Pamela Mishkin and Jack Clark and Gretchen Krueger and Ilya Sutskever , title =. Proceedings of the 38th International Conference on Machine Learning (ICML) , pages =

work page

[53] [54]

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP-IJCNLP) , pages =

Nils Reimers and Iryna Gurevych , title =. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP-IJCNLP) , pages =

work page 2019

[54] [55]

Sherstov , title =

Alexander A. Sherstov , title =. Computational Complexity , volume =

work page

[55] [56]

Constructive approximation , volume=

A simple proof of the restricted isometry property for random matrices , author=. Constructive approximation , volume=. 2008 , publisher=

work page 2008

[56] [57]

Representation Learning with Contrastive Predictive Coding

Representation learning with contrastive predictive coding , author=. arXiv preprint arXiv:1807.03748 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[57] [58]

Roman Vershynin , title =

work page

[58] [59]

arXiv preprint arXiv:2601.20844 , year =

Zihao Wang and Hang Yin and Lihui Liu and Hanghang Tong and Yangqiu Song and Ginny Wong and Simon See , title =. arXiv preprint arXiv:2601.20844 , year =

work page arXiv

[59] [60]

Young and Edward R

Stephen J. Young and Edward R. Scheinerman , title =. Lecture Notes in Computer Science , volume =

work page

[60] [61]

Sigmoid Loss for Language Image Pre-Training

Xiaohua Zhai and Basil Mustafa and Alexander Kolesnikov and Lucas Beyer , title =. arXiv preprint arXiv:2303.15343 , year =

work page internal anchor Pith review Pith/arXiv arXiv

[61] [62]

Proceedings of the 57th annual meeting of the association for computational linguistics , pages=

Latent retrieval for weakly supervised open domain question answering , author=. Proceedings of the 57th annual meeting of the association for computational linguistics , pages=

work page

[62] [63]

Proceedings of the 37th International Conference on Machine Learning , articleno =

Wang, Tongzhou and Isola, Phillip , title =. Proceedings of the 37th International Conference on Machine Learning , articleno =. 2020 , publisher =

work page 2020

[63] [64]

Journal of Approximation Theory , volume=

Optimal recovery and n-widths for convex classes of functions , author=. Journal of Approximation Theory , volume=. 1995 , publisher=

work page 1995

[64] [65]

Xi Chen and Xiao Wang and Soravit Changpinyo and AJ Piergiovanni and Piotr Padlewski and Daniel Salz and Sebastian Goodman and Adam Grycner and Basil Mustafa and Lucas Beyer and Alexander Kolesnikov and Joan Puigcerver and Nan Ding and Keran Rong and Hassan Akbari and Gaurav Mishra and Linting Xue and Ashish V Thapliyal and James Bradbury and Weicheng Kuo...

work page 2023

[65] [66]

Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles

Noroozi, Mehdi and Favaro, Paolo. Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles. Computer Vision -- ECCV 2016. 2016

work page 2016

[66] [67]

Wyner, A. D. , title =. Bell System Technical Journal , volume =. doi:https://doi.org/10.1002/j.1538-7305.1968.tb00062.x , year =

work page doi:10.1002/j.1538-7305.1968.tb00062.x 1968

[67] [68]

Proceedings of the Glasgow Mathematical Association , author=

The Closest Packing of Spherical Caps in n Dimensions , volume=. Proceedings of the Glasgow Mathematical Association , author=. 1955 , pages=. doi:10.1017/S2040618500033219 , number=

work page doi:10.1017/s2040618500033219 1955

[68] [69]

, title =

Shannon, Claude E. , title =. Bell System Technical Journal , volume =. doi:https://doi.org/10.1002/j.1538-7305.1959.tb03905.x , year =

work page doi:10.1002/j.1538-7305.1959.tb03905.x 1959

[69] [70]

On the origin of number and arrangement of the places of exit on the surface of pollen-grains

Tammes, Pieter Merkus Lambertus. On the origin of number and arrangement of the places of exit on the surface of pollen-grains. 1930

work page 1930

[70] [71]

Sphere packings, lattices and groups , volume =

Conway, J.H and Sloane, N.J.A and Bannai, E and Borcherds, R.E and Leech, J and Norton, S.P and Odlyzko, A.M and Parker, R.A and Queen, L and Venkov, B.B , edition =. Sphere packings, lattices and groups , volume =

work page

[71] [72]

Zhai, Xiaohua and Mustafa, Basil and Kolesnikov, Alexander and Beyer, Lucas , year =

work page

[72] [73]

A Theoretical Analysis of Contrastive Unsupervised Representation Learning , booktitle =

Nikunj Saunshi and Orestis Plevrakis and Sanjeev Arora and Mikhail Khodak and Hrishikesh Khandeparkar , editor =. A Theoretical Analysis of Contrastive Unsupervised Representation Learning , booktitle =. 2019 , url =

work page 2019

[73] [74]

CoRR , volume =

Anna Van Elst and Debarghya Ghoshdastidar , title =. CoRR , volume =. 2024 , url =. doi:10.48550/ARXIV.2412.03486 , eprinttype =. 2412.03486 , timestamp =

work page doi:10.48550/arxiv.2412.03486 2024

[74] [75]

Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval , pages =

Srinivasan, Krishna and Raman, Karthik and Chen, Jiecao and Bendersky, Michael and Najork, Marc , title =. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval , pages =. 2021 , isbn =. doi:10.1145/3404835.3463257 , abstract =

work page doi:10.1145/3404835.3463257 2021

[75] [76]

International Conference on Learning Representations , year=

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval , author=. International Conference on Learning Representations , year=

work page

[76] [77]

2019 , eprint=

Representation Learning with Contrastive Predictive Coding , author=. 2019 , eprint=

work page 2019

[77] [78]

Yiwei Lu and Guojun Zhang and Sun Sun and Hongyu Guo and Yaoliang Yu , journal=. \ f\ -. 2023 , url=

work page 2023

[78] [79]

2020 , isbn =

Khattab, Omar and Zaharia, Matei , title =. 2020 , isbn =. doi:10.1145/3397271.3401075 , booktitle =

work page doi:10.1145/3397271.3401075 2020

[79] [80]

Journal of Computer and System Sciences , volume=

Probabilistic communication complexity , author=. Journal of Computer and System Sciences , volume=. 1986 , publisher=

work page 1986

[80] [81]

Random Structures & Algorithms , volume=

Testing for high-dimensional geometry in random graphs , author=. Random Structures & Algorithms , volume=. 2016 , publisher=

work page 2016