Oblivious Subspace Injection Is Not Enough for Relative Error

Alex Townsend; Chris Wang

arxiv: 2604.10215 · v1 · submitted 2026-04-11 · 🧮 math.NA · cs.DS· cs.NA

Oblivious Subspace Injection Is Not Enough for Relative Error

Alex Townsend , Chris Wang This is my paper

Pith reviewed 2026-05-10 15:29 UTC · model grok-4.3

classification 🧮 math.NA cs.DScs.NA

keywords oblivious subspace injectionrelative errorsketchingleast squaresrandomized SVDlow-rank approximationoblivious subspace embedding

0 comments

The pith

Oblivious subspace injection alone cannot guarantee relative-error bounds for sketching in least squares or low-rank approximation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Oblivious subspace injection (OSI) was introduced as a weaker sketching condition than oblivious subspace embedding (OSE) that still delivers constant-factor approximation guarantees for randomized low-rank approximation and sketch-and-solve least-squares regression. A natural follow-up question is whether OSI sketches also deliver the stronger relative-error guarantees typical of OSE, with failure probability controlled solely by the OSI failure parameter. The paper constructs explicit counterexamples showing that OSI alone permits sketches with large optimal residuals, so relative-error bounds fail even when the OSI property holds with high probability. Adding an explicit upper bound on the optimal residual or tail component recovers a near-relative-error guarantee. The same separation appears for both sketch-and-solve least squares and randomized SVD measured in the Frobenius norm; an ℓ_p analogue of OSI is also shown to yield constant-factor sketch-and-solve bounds.

Core claim

From a theoretical standpoint, OSI alone does not yield OSE-style relative-error guarantees whose failure probability is controlled solely by the OSI failure parameter. Counterexamples are given for sketch-and-solve least squares and for randomized SVD in the Frobenius norm; the missing ingredient is upper control on the optimal residual or tail component, and when this additional property is ensured the near-relative-error bound is recovered.

What carries the argument

Oblivious subspace injection (OSI), a sketching property weaker than oblivious subspace embedding that still controls the range of the sketched matrix but does not automatically bound the optimal residual.

If this is right

OSI sketches suffice for constant-factor but not relative-error guarantees in sketch-and-solve least squares.
OSI sketches suffice for constant-factor but not relative-error guarantees in randomized SVD measured in the Frobenius norm.
Adding an upper bound on the optimal residual or tail component to an OSI sketch recovers a near-relative-error guarantee.
A natural ℓ_p analogue of OSI yields constant-factor sketch-and-solve bounds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Practical random sketches that satisfy OSI may also satisfy the missing residual bound with high probability, explaining their observed good performance.
The separation between OSI and relative error may extend to other sketching tasks that rely on tail control rather than range preservation alone.
Designing sketches that enforce both OSI and residual control simultaneously could be a concrete way to obtain relative-error guarantees with weaker assumptions than full OSE.

Load-bearing premise

There exist matrices and random sketches that satisfy the OSI definition yet produce large optimal residuals that prevent relative-error bounds.

What would settle it

A concrete matrix A and distribution over sketches S such that the OSI property holds with probability at least 1-δ, but the sketched least-squares or SVD residual exceeds (1+ε) times the optimal residual with probability not bounded by any function of δ alone.

read the original abstract

Oblivious subspace injection (OSI) was introduced by Cama\~no, Epperly, Meyer, and Tropp in 2025 as a much weaker sketching property than oblivious subspace embedding (OSE) that still yields constant-factor guarantees for randomized low-rank approximation and sketch-and-solve least-squares regression. At the Simons Institute in Berkeley during a workshop in October 2025, it was asked whether OSIs also imply relative error bounds rather than just constant-factor guarantees. We show that, from a theoretical standpoint, OSI alone does not yield OSE-style relative-error guarantees whose failure probability is controlled solely by the OSI failure parameter, even though OSI sketches often perform extremely well in practice. We provide counterexamples showing this for sketch-and-solve least squares and for randomized SVD in the Frobenius norm. The missing ingredient from a sketch satisfying only OSI is upper control on the optimal residual or tail component, and when one ensures the sketch has this additional property, a near-relative-error bound is recovered. We also show that there is a natural $\ell_p$ analogue of OSI giving constant-factor sketch-and-solve bounds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

OSI alone does not deliver relative-error sketching bounds, and the counterexamples make the separation clear.

read the letter

The main takeaway is that oblivious subspace injection gives constant-factor guarantees for sketch-and-solve least squares and randomized SVD but does not extend to relative-error bounds whose failure probability depends only on the OSI parameter. The authors back this with explicit counterexamples and note that adding residual control recovers a near-relative bound. They also give an ℓ_p version of OSI for constant-factor results. This sharpens the boundary between the two properties introduced in the 2025 OSI paper. The counterexamples target the right gap by constructing sketches that inject the relevant subspace yet leave large optimal residuals, so the sketched error stays far from relative. If the constructions keep the OSI success probability independent of residual size, the negative claim holds. The paper does a clean job of isolating the missing ingredient and showing how to fix it. A minor soft spot is that the counterexamples may be somewhat specialized; it is not obvious how often they arise in matrices where OSI is typically applied. The proofs will need checking for any hidden dependence between the injection property and the tail norm. This work is aimed at people who design or analyze sketching algorithms in randomized linear algebra. It is not broad but it clarifies a practical distinction that affects when to reach for stronger embeddings. Readers who care about the exact conditions for relative-error guarantees will find it useful. I would send it to peer review. The negative result is worth recording with concrete examples, and the overall argument looks worth referee time even if revisions are needed on the constructions.

Referee Report

2 major / 3 minor

Summary. The manuscript shows that oblivious subspace injection (OSI), a weaker sketching property than oblivious subspace embedding (OSE), yields only constant-factor guarantees for sketch-and-solve least squares and randomized low-rank approximation but does not imply OSE-style relative-error bounds whose failure probability depends solely on the OSI failure parameter. Explicit counterexamples are given for both the least-squares and Frobenius-norm SVD settings; the paper identifies the missing ingredient as upper control on the optimal residual (or tail component) and proves that adding this control recovers near-relative-error guarantees. An ℓ_p analogue of OSI is also introduced that suffices for constant-factor sketch-and-solve bounds.

Significance. If the counterexamples hold, the result usefully separates OSI from stronger embedding properties and explains the gap between the constant-factor theory and the relative-error performance often observed in practice. The explicit constructions, the recovery theorem, and the ℓ_p extension are concrete contributions that sharpen the theoretical toolkit for sketching algorithms in numerical linear algebra.

major comments (2)

[§3] §3 (least-squares counterexample): the construction must be checked to confirm that the OSI success probability remains a function only of the OSI parameter δ and is independent of the residual size; the provided matrix and sketch distribution appear to satisfy this, but an explicit sentence stating the independence would eliminate any residual doubt about circularity.
[§4] §4 (Frobenius SVD counterexample): the argument that the sketched residual exceeds any fixed relative-error target with probability bounded away from zero relies on the tail component being large; a short calculation showing that this probability is at least 1/2 (or any positive constant) independent of δ would make the separation from relative-error claims fully rigorous.

minor comments (3)

[Introduction] The OSI definition (Eq. (2) or wherever first stated) should include an explicit remark that the injection property is required only on the column space of A (or row space, as appropriate) rather than on the whole ambient space.
[Figure 1] Figure 1 (or the schematic comparing OSI and OSE) would benefit from a caption that directly references the counterexample matrices rather than generic notation.
[Conclusion] A brief comparison table listing the failure-probability dependence for OSI, OSI+residual-control, and OSE would help readers quickly see the hierarchy of guarantees.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading of the manuscript and for the positive assessment. The two major comments identify opportunities for minor clarifications that will strengthen the rigor of the counterexamples; we agree with both suggestions and will incorporate them.

read point-by-point responses

Referee: [§3] §3 (least-squares counterexample): the construction must be checked to confirm that the OSI success probability remains a function only of the OSI parameter δ and is independent of the residual size; the provided matrix and sketch distribution appear to satisfy this, but an explicit sentence stating the independence would eliminate any residual doubt about circularity.

Authors: We confirm that the OSI success probability in the least-squares counterexample depends only on the parameter δ and is independent of the residual size. The matrix and sketch distribution were constructed precisely so that the failure event is governed solely by the OSI definition. We will add an explicit sentence in §3 stating this independence to eliminate any residual doubt about circularity. revision: yes
Referee: [§4] §4 (Frobenius SVD counterexample): the argument that the sketched residual exceeds any fixed relative-error target with probability bounded away from zero relies on the tail component being large; a short calculation showing that this probability is at least 1/2 (or any positive constant) independent of δ would make the separation from relative-error claims fully rigorous.

Authors: We agree that an explicit lower bound on the failure probability will make the separation fully rigorous. We will insert a short calculation in §4 showing that the probability that the sketched residual exceeds any fixed relative-error target is at least 1/2, and that this bound is independent of δ. revision: yes

Circularity Check

0 steps flagged

Negative result via counterexamples is self-contained against external OSI definition

full rationale

The paper takes the OSI property as an externally defined notion from prior work (Camaño et al., 2025) and constructs explicit matrices and sketch distributions that satisfy the OSI injection condition with failure probability controlled solely by the OSI parameter, yet produce large optimal residuals that violate relative-error bounds for sketch-and-solve and randomized SVD. No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or redefinition of the input; the separation is exhibited by direct construction rather than by algebraic equivalence or ansatz smuggling. The central claim is therefore independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard definitions of matrix norms, sketching operators, and probability over random sketches. No free parameters or new invented entities are introduced.

axioms (2)

standard math Standard properties of matrix norms, subspaces, and sketching operators in linear algebra
Invoked throughout the analysis of approximation guarantees and counterexamples.
domain assumption Existence of matrices and sketches satisfying OSI but violating relative-error bounds
Central to the negative result stated in the abstract.

pith-pipeline@v0.9.0 · 5497 in / 1254 out tokens · 76886 ms · 2026-05-10T15:29:11.092130+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

[1]

Ailon and B

N. Ailon and B. Chazelle , The fast Johnson–Lindenstrauss transform and approximate near- est neighbors , SIAM J. Comput., 39 (2009), pp. 302–322

work page 2009
[2]

Amsel, Y

N. Amsel, Y. Baumann, P. Beckman, P. B ¨urgisser, C. Cama ˜no, T. Chen, E. Chow, A. Damle, M. Derezi ´nski, M. Embree, E. N. Epperly, R. F algout, M. Fornace, A. Greenbaum, C. Greif, D. Halikias, Z. Huang, E. Jarlebring, Y. K outis, D. Kress- ner, R. Kyng, J. Liesen, J. Lok, R. A. Meyer, Y. Nakatsukasa, K. Pe arce, R. Peng, D. Persson, E. Rebrova, R. Schn...

work page arXiv 2026
[3]

A vron, P

H. A vron, P. Maymounkov, and S. Toledo , Blendenpik: Supercharging LAPACK’s least- squares solver , SIAM J. Sci. Comput., 32 (2010), pp. 1217–1236

work page 2010
[4]

Cama ˜no, E

C. Cama ˜no, E. N. Epperly, R. A. Meyer, and J. A. Tropp , Faster linear algebra algorithms with structured random matrices , arXiv:2508.21189, (2025)

work page arXiv 2025
[5]

Cazeaux, M.-S

P. Cazeaux, M.-S. Dupuy, and R. F. Justiniano , Linear-scaling tensor train sketching , arXiv:2603.11009, (2026)

work page arXiv 2026
[6]

K. L. Clarkson and D. P. Woodruff , Low-rank approximation and regression in input sparsity time, in Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC), 2013, pp. 81–90

work page 2013
[7]

M. B. Cohen and R. Peng , ℓp row sampling by Lewis weights , in Proceedings of the Forty- Seventh Annual ACM Symposium on Theory of Computing, 2015, p p. 183–192

work page 2015
[8]

Drineas, M

P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarl ´os, Faster least squares approximation, Numer. Math., 117 (2011), pp. 219–249

work page 2011
[9]

Halko, P.-G

N. Halko, P.-G. Martinsson, and J. A. Tropp , Finding structure with randomness: Prob- abilistic algorithms for constructing approximate matrix decompositions, SIAM Rev., 53 (2011), pp. 217–288

work page 2011
[10]

M. W. Mahoney , Randomized algorithms for matrices and data , Found. Trends Mach. Learn., 3 (2011), pp. 123–224

work page 2011
[11]

Martinsson and J

P.-G. Martinsson and J. A. Tropp , Randomized numerical linear algebra: Foundations and algorithms, Acta Numer., 29 (2020), pp. 403–572. 1Since the OSI deﬁnition is for every ﬁxed r-dimensional subspace, one technically needs to extend range( A) to a r-dimensional subspace V if rank( A) < r . The injectivity event on V implies the same inequality on range...

work page 2020
[12]

Meng and M

X. Meng and M. W. Mahoney , Low-distortion subspace embeddings in input-sparsity tim e and applications to robust linear regression , in Proceedings of the 45th Annual ACM Sym- posium on Theory of Computing (STOC), 2013, pp. 91–100

work page 2013
[13]

Nelson and H

J. Nelson and H. L. Nguy ˜ ˆen, OSNAP: Faster numerical linear algebra algorithms via spar ser subspace embeddings, in Proceedings of the 2013 IEEE 54th Annual Symposium on Fou n- dations of Computer Science, 2013, pp. 117–126

work page 2013
[14]

Pham and R

N. Pham and R. Pagh , Fast and scalable polynomial kernels via explicit feature m aps, in Proceedings of the 19th ACM SIGKDD International Conferenc e on Knowledge Discovery and Data Mining, 2013, pp. 239–247

work page 2013
[15]

Rokhlin, A

V. Rokhlin, A. Szlam, and M. Tygert , A randomized algorithm for principal component analysis, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 1100–1124

work page 2009
[16]

Rokhlin and M

V. Rokhlin and M. Tygert , A fast randomized algorithm for overdetermined linear leas t- squares regression, Proc. Natl. Acad. Sci. USA, 105 (2008), pp. 13212–13217

work page 2008
[17]

T. Sarl ´os, Improved approximation algorithms for large matrices via r andom projections, in Proceedings of the 47th Annual IEEE Symposium on Foundation s of Computer Science (FOCS), 2006, pp. 143–152

work page 2006
[18]

Sohler and D

C. Sohler and D. P. Woodruff , Subspace embeddings for the ℓ1-norm with applications , in Proceedings of the 43rd Annual ACM Symposium on Theory of Com puting (STOC), 2011, pp. 755–764

work page 2011
[19]

J. A. Tropp , Improved analysis of the subsampled randomized Hadamard tr ansform, Adv. Adapt. Data Anal., 3 (2011), pp. 115–126

work page 2011
[20]

J. A. Tropp, A. Yurtsever, M. Udell, and V. Cevher , Practical sketching algorithms for low-rank matrix approximation , SIAM J. Matrix Anal. Appl., 38 (2017), pp. 1454–1485

work page 2017
[21]

W ang and A

C. W ang and A. Townsend , Beyond singular value gaps in randomized subspace approxim a- tion, arXiv:2603.01191, (2026)

work page arXiv 2026
[22]

D. P. Woodruff , Sketching as a tool for numerical linear algebra , Found. Trends Theor. Comput. Sci., 10 (2014), pp. 1–157

work page 2014
[23]

D. P. Woodruff and Q. Zhang , Subspace embeddings and ℓp-regression using exponential ran- dom variables , in Proceedings of the 26th Annual Conference on Learning Th eory (COLT), vol. 30 of Proceedings of Machine Learning Research, PMLR, 2 013, pp. 546–567

work page

[1] [1]

Ailon and B

N. Ailon and B. Chazelle , The fast Johnson–Lindenstrauss transform and approximate near- est neighbors , SIAM J. Comput., 39 (2009), pp. 302–322

work page 2009

[2] [2]

Amsel, Y

N. Amsel, Y. Baumann, P. Beckman, P. B ¨urgisser, C. Cama ˜no, T. Chen, E. Chow, A. Damle, M. Derezi ´nski, M. Embree, E. N. Epperly, R. F algout, M. Fornace, A. Greenbaum, C. Greif, D. Halikias, Z. Huang, E. Jarlebring, Y. K outis, D. Kress- ner, R. Kyng, J. Liesen, J. Lok, R. A. Meyer, Y. Nakatsukasa, K. Pe arce, R. Peng, D. Persson, E. Rebrova, R. Schn...

work page arXiv 2026

[3] [3]

A vron, P

H. A vron, P. Maymounkov, and S. Toledo , Blendenpik: Supercharging LAPACK’s least- squares solver , SIAM J. Sci. Comput., 32 (2010), pp. 1217–1236

work page 2010

[4] [4]

Cama ˜no, E

C. Cama ˜no, E. N. Epperly, R. A. Meyer, and J. A. Tropp , Faster linear algebra algorithms with structured random matrices , arXiv:2508.21189, (2025)

work page arXiv 2025

[5] [5]

Cazeaux, M.-S

P. Cazeaux, M.-S. Dupuy, and R. F. Justiniano , Linear-scaling tensor train sketching , arXiv:2603.11009, (2026)

work page arXiv 2026

[6] [6]

K. L. Clarkson and D. P. Woodruff , Low-rank approximation and regression in input sparsity time, in Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC), 2013, pp. 81–90

work page 2013

[7] [7]

M. B. Cohen and R. Peng , ℓp row sampling by Lewis weights , in Proceedings of the Forty- Seventh Annual ACM Symposium on Theory of Computing, 2015, p p. 183–192

work page 2015

[8] [8]

Drineas, M

P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarl ´os, Faster least squares approximation, Numer. Math., 117 (2011), pp. 219–249

work page 2011

[9] [9]

Halko, P.-G

N. Halko, P.-G. Martinsson, and J. A. Tropp , Finding structure with randomness: Prob- abilistic algorithms for constructing approximate matrix decompositions, SIAM Rev., 53 (2011), pp. 217–288

work page 2011

[10] [10]

M. W. Mahoney , Randomized algorithms for matrices and data , Found. Trends Mach. Learn., 3 (2011), pp. 123–224

work page 2011

[11] [11]

Martinsson and J

P.-G. Martinsson and J. A. Tropp , Randomized numerical linear algebra: Foundations and algorithms, Acta Numer., 29 (2020), pp. 403–572. 1Since the OSI deﬁnition is for every ﬁxed r-dimensional subspace, one technically needs to extend range( A) to a r-dimensional subspace V if rank( A) < r . The injectivity event on V implies the same inequality on range...

work page 2020

[12] [12]

Meng and M

X. Meng and M. W. Mahoney , Low-distortion subspace embeddings in input-sparsity tim e and applications to robust linear regression , in Proceedings of the 45th Annual ACM Sym- posium on Theory of Computing (STOC), 2013, pp. 91–100

work page 2013

[13] [13]

Nelson and H

J. Nelson and H. L. Nguy ˜ ˆen, OSNAP: Faster numerical linear algebra algorithms via spar ser subspace embeddings, in Proceedings of the 2013 IEEE 54th Annual Symposium on Fou n- dations of Computer Science, 2013, pp. 117–126

work page 2013

[14] [14]

Pham and R

N. Pham and R. Pagh , Fast and scalable polynomial kernels via explicit feature m aps, in Proceedings of the 19th ACM SIGKDD International Conferenc e on Knowledge Discovery and Data Mining, 2013, pp. 239–247

work page 2013

[15] [15]

Rokhlin, A

V. Rokhlin, A. Szlam, and M. Tygert , A randomized algorithm for principal component analysis, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 1100–1124

work page 2009

[16] [16]

Rokhlin and M

V. Rokhlin and M. Tygert , A fast randomized algorithm for overdetermined linear leas t- squares regression, Proc. Natl. Acad. Sci. USA, 105 (2008), pp. 13212–13217

work page 2008

[17] [17]

T. Sarl ´os, Improved approximation algorithms for large matrices via r andom projections, in Proceedings of the 47th Annual IEEE Symposium on Foundation s of Computer Science (FOCS), 2006, pp. 143–152

work page 2006

[18] [18]

Sohler and D

C. Sohler and D. P. Woodruff , Subspace embeddings for the ℓ1-norm with applications , in Proceedings of the 43rd Annual ACM Symposium on Theory of Com puting (STOC), 2011, pp. 755–764

work page 2011

[19] [19]

J. A. Tropp , Improved analysis of the subsampled randomized Hadamard tr ansform, Adv. Adapt. Data Anal., 3 (2011), pp. 115–126

work page 2011

[20] [20]

J. A. Tropp, A. Yurtsever, M. Udell, and V. Cevher , Practical sketching algorithms for low-rank matrix approximation , SIAM J. Matrix Anal. Appl., 38 (2017), pp. 1454–1485

work page 2017

[21] [21]

W ang and A

C. W ang and A. Townsend , Beyond singular value gaps in randomized subspace approxim a- tion, arXiv:2603.01191, (2026)

work page arXiv 2026

[22] [22]

D. P. Woodruff , Sketching as a tool for numerical linear algebra , Found. Trends Theor. Comput. Sci., 10 (2014), pp. 1–157

work page 2014

[23] [23]

D. P. Woodruff and Q. Zhang , Subspace embeddings and ℓp-regression using exponential ran- dom variables , in Proceedings of the 26th Annual Conference on Learning Th eory (COLT), vol. 30 of Proceedings of Machine Learning Research, PMLR, 2 013, pp. 546–567

work page