Asymmetric Random Projections
Pith reviewed 2026-05-25 17:55 UTC · model grok-4.3
The pith
By using statistics from a known data set one can build asymmetric random projections that better estimate inner products between two different vector sources than standard oblivious projections.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a light preprocessing step can compute statistics from a given data set and produce asymmetric random projections whose rows are chosen to respect the empirical second-moment structure of each source separately. These projections then yield unbiased inner-product estimators whose variance is strictly smaller than that of oblivious random projections whenever the two sources differ in their second moments.
What carries the argument
Asymmetric random projection matrix whose rows are scaled and signed according to per-source second-moment estimates extracted before projection.
If this is right
- Matrix multiplication can be approximated faster or more accurately by first projecting the two factors with source-specific projections.
- Linear regression or classification on data from two sources can use lower-dimensional representations while retaining more accurate inner-product geometry.
- Any algorithm whose cost scales with inner-product computations between two distinct collections benefits from the reduced variance.
- The same preprocessing can be reused across multiple projection dimensions or multiple downstream tasks on the same pair of sources.
Where Pith is reading between the lines
- The same idea of precomputing source statistics could be applied to other linear sketching methods that currently treat all vectors identically.
- In streaming or distributed settings where one source arrives first, its statistics could be stored and later combined with a second source.
- The approach suggests testing whether similar per-source adjustments improve other geometry-preserving maps such as Johnson-Lindenstrauss embeddings when sources are known to be heterogeneous.
Load-bearing premise
The full data set is available ahead of time so that source-specific statistics can be computed before the projection is chosen.
What would settle it
A controlled experiment on synthetic or real pairs of matrices in which the mean squared error of inner-product estimates using the data-dependent projections is not smaller than the error using standard oblivious projections of the same dimension.
Figures
read the original abstract
Random projections (RP) are a popular tool for reducing dimensionality while preserving local geometry. In many applications the data set to be projected is given to us in advance, yet the current RP techniques do not make use of information about the data. In this paper, we provide a computationally light way to extract statistics from the data that allows designing a data dependent RP with superior performance compared to data-oblivious RP. We tackle scenarios such as matrix multiplication and linear regression/classification in which we wish to estimate inner products between pairs of vectors from two possibly different sources. Our technique takes advantage of the difference between the sources and is provably superior to oblivious RPs. Additionally, we provide extensive experiments comparing RPs with our approach showing significant performance lifts in fast matrix multiplication, regression and classification problems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes asymmetric random projections that extract statistics from two data sources available in advance to construct data-dependent projections for inner-product estimation. It claims a computationally light method that is provably superior to oblivious random projections, with applications to fast matrix multiplication and linear regression/classification, supported by theoretical analysis and extensive experiments showing performance improvements.
Significance. If the central construction and superiority claims hold, the work provides a practical enhancement to random projections by exploiting data availability and source differences, with potential utility in paired-source estimation tasks. The explicit data-dependent construction and reported experimental gains constitute strengths.
minor comments (2)
- [Abstract] Abstract: the phrase 'provably superior' would benefit from a one-sentence qualifier on the precise metric (e.g., variance of the estimator) and the assumptions under which the guarantee holds.
- The experimental section would be clearer if it included a brief statement of the random-projection dimension relative to the original data dimension and the number of trials used for averaging.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work on asymmetric random projections and for recommending minor revision. No major comments were provided in the report.
Circularity Check
No significant circularity
full rationale
The paper describes a data-dependent asymmetric random projection constructed by extracting statistics from a pre-available data set, then using source differences to achieve provably superior inner-product estimation compared to oblivious RPs. The abstract presents this as an independent construction and empirical method without any derivation step that reduces by definition or self-citation to its own inputs. No fitted parameter is renamed as a prediction, no uniqueness theorem is imported from the authors' prior work, and the superiority claim is scoped to the setting where full data is known in advance. The central argument remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
The fast johnson--lindenstrauss transform and approximate nearest neighbors
Nir Ailon and Bernard Chazelle. The fast johnson--lindenstrauss transform and approximate nearest neighbors. SIAM Journal on computing , 39(1):302--322, 2009
work page 2009
-
[2]
Database-friendly random projections: Johnson-lindenstrauss with binary coins
Dimitris Achlioptas. Database-friendly random projections: Johnson-lindenstrauss with binary coins. Journal of computer and System Sciences , 66(4):671--687, 2003
work page 2003
-
[3]
Reuters rcv1 rcv2 multilingual, multiview text categorization test collection data set
Massih-Reza Amini and Cyril Goutte. Reuters rcv1 rcv2 multilingual, multiview text categorization test collection data set. UCI Machine Learning Repository , 2013
work page 2013
-
[4]
Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions
Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In Foundations of Computer Science, 2006. FOCS'06. 47th Annual IEEE Symposium on , pages 459--468. IEEE, 2006
work page 2006
-
[5]
The space complexity of approximating the frequency moments
Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and system sciences , 58(1):137--147, 1999
work page 1999
-
[6]
Dimensionality reduction for k-means clustering and low rank approximation
Michael B Cohen, Sam Elder, Cameron Musco, Christopher Musco, and Madalina Persu. Dimensionality reduction for k-means clustering and low rank approximation. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing , pages 163--172. ACM, 2015
work page 2015
-
[7]
Ron Cole and Mark Fanty. Isolet data set. UCI Machine Learning Repository , 1994
work page 1994
-
[8]
Linear dimensionality reduction: Survey, insights, and generalizations
John P Cunningham and Zoubin Ghahramani. Linear dimensionality reduction: Survey, insights, and generalizations. The Journal of Machine Learning Research , 16(1):2859--2900, 2015
work page 2015
-
[9]
Numerical linear algebra in the streaming model
Kenneth L Clarkson and David P Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the forty-first annual ACM symposium on Theory of computing , pages 205--214. ACM, 2009
work page 2009
-
[10]
An elementary proof of a theorem of johnson and lindenstrauss
Sanjoy Dasgupta and Anupam Gupta. An elementary proof of a theorem of johnson and lindenstrauss. Random Structures & Algorithms , 22(1):60--65, 2003
work page 2003
-
[11]
Adaptive subgradient methods for online learning and stochastic optimization
John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research , 12(Jul):2121--2159, 2011
work page 2011
-
[12]
Repeat consumption matrices data set
Moshe Lichman Dimitrios Kotzias and Padhraic Smyth. Repeat consumption matrices data set. UCI Machine Learning Repository , 2018
work page 2018
-
[13]
M. Schubert S. Poelsterl A. Cavallaro F. Graf, H.P. Kriegel. Relative location of ct slices on axial axis data set. UCI Machine Learning Repository , 2011
work page 2011
-
[14]
Experiments with random projections for machine learning
Dmitriy Fradkin and David Madigan. Experiments with random projections for machine learning. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining , pages 517--522. ACM, 2003
work page 2003
-
[15]
Isabelle Guyon. Arcene data set. UCI Machine Learning Repository , 2008
work page 2008
-
[16]
Extensions of lipschitz mappings into a hilbert space
William B Johnson and Joram Lindenstrauss. Extensions of lipschitz mappings into a hilbert space. Contemporary mathematics , 26(189-206):1, 1984
work page 1984
-
[17]
Control variates as a variance reduction technique for random projections
Keegan Kang and Giles Hooker. Control variates as a variance reduction technique for random projections. In International Conference on Pattern Recognition Applications and Methods , pages 1--20. Springer, 2017
work page 2017
-
[18]
Online pca with spectral bounds
Zohar Karnin and Edo Liberty. Online pca with spectral bounds. In Conference on Learning Theory , pages 1129--1140, 2015
work page 2015
-
[19]
Alex Krizhevsky. The cifar-10 dataset. Alex Krizhevsky's Personal Webpage , 2009
work page 2009
-
[20]
Improving random projections using marginal information
Ping Li, Trevor J Hastie, and Kenneth W Church. Improving random projections using marginal information. In International Conference on Computational Learning Theory , pages 635--649. Springer, 2006
work page 2006
-
[21]
Linear regression with random projections
Odalric-Ambrym Maillard and R \'e mi Munos. Linear regression with random projections. Journal of Machine Learning Research , 13(Sep):2735--2772, 2012
work page 2012
-
[22]
Osnap: Faster numerical linear algebra algorithms via sparser subspace embeddings
Jelani Nelson and Huy L Nguy \^e n. Osnap: Faster numerical linear algebra algorithms via sparser subspace embeddings. In Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on , pages 117--126. IEEE, 2013
work page 2013
-
[23]
Random features for large-scale kernel machines
Ali Rahimi and Benjamin Recht. Random features for large-scale kernel machines. In Advances in neural information processing systems , pages 1177--1184, 2008
work page 2008
-
[24]
Improved approximation algorithms for large matrices via random projections
Tamas Sarlos. Improved approximation algorithms for large matrices via random projections. FOCS , pages 143---152, 2006
work page 2006
-
[25]
Informed weighted random projection for dimension reduction
Jaydeep Sen and Harish Karnick. Informed weighted random projection for dimension reduction. In International Conference on Advanced Data Mining and Applications , pages 433--442. Springer, 2013
work page 2013
- [26]
- [27]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.