pith. sign in

arxiv: 2403.05529 · v2 · pith:HPOT5QEOnew · submitted 2024-03-08 · 💻 cs.LG · stat.ML

Computational-Statistical Gaps in Gaussian Single-Index Models

classification 💻 cs.LG stat.ML
keywords starstatisticalclasscomplexitygenerativehigh-dimensionalmodelssample
0
0 comments X
read the original abstract

Single-Index Models are high-dimensional regression problems with planted structure, whereby labels depend on an unknown one-dimensional projection of the input via a generic, non-linear, and potentially non-deterministic transformation. As such, they encompass a broad class of statistical inference tasks, and provide a rich template to study statistical and computational trade-offs in the high-dimensional regime. While the information-theoretic sample complexity to recover the hidden direction is linear in the dimension $d$, we show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples, where $k^\star$ is a "generative" exponent associated with the model that we explicitly characterize. Moreover, we show that this sample complexity is also sufficient, by establishing matching upper bounds using a partial-trace algorithm. Therefore, our results provide evidence of a sharp computational-to-statistical gap (under both the SQ and LDP class) whenever $k^\star>2$. To complete the study, we provide examples of smooth and Lipschitz deterministic target functions with arbitrarily large generative exponents $k^\star$.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Statistically and Computationally Optimal Estimation and Inference of Common Subspaces

    math.ST 2026-06 unverdicted novelty 7.0

    Establishes statistical and computational optimality thresholds for common subspace estimation and inference under varying SNR regimes, including an impossibility result for adaptive confidence intervals below strong ...

  2. The Mechanism of Weak-to-Strong Generalization: Feature Elicitation from Latent Knowledge

    stat.ML 2026-05 unverdicted novelty 7.0

    In two-layer networks, weak-to-strong training elicits the target feature direction from pre-trained subspaces and preserves correlated off-target features, unlike standard fine-tuning.

  3. How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability

    cs.CL 2026-01 unverdicted novelty 7.0

    Transformer weights at early training stages are closed-form compositions of bigram, token-interchangeability, and context mappings that directly reflect text-corpus statistics and explain the emergence of semantic as...

  4. A Fourier perspective on the learning dynamics of neural networks: from sample complexities to mechanistic insights

    stat.ML 2026-05 conditional novelty 6.0

    Neural networks prioritize amplitude over phase in Fourier space during training on translation-invariant data; power-law spectra accelerate phase learning despite not aiding classification.

  5. Escape dynamics and implicit bias of one-pass SGD in overparameterized quadratic networks

    cond-mat.dis-nn 2026-04 unverdicted novelty 6.0

    In overparameterized quadratic networks, one-pass SGD escapes generalization plateaus only modestly faster and selects the initialization-closest zero-loss solution due to a conserved quantity in the overlap ODEs.

  6. There Will Be a Scientific Theory of Deep Learning

    stat.ML 2026-04 unverdicted novelty 2.0

    A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions across idealized settings, limits, laws, hyperparameters, and universa...