Title resolution pending

6 Pith papers cite this work. Polarity classification is still indexing.

Citing papers by year: 2026 (6)

representative citing papers

Equivalence of Coarse and Fine-Grained Models for Learning with Distribution Shift

cs.DS · 2026-05-07 · unverdicted · novelty 8.0 · 2 refs

An efficient black-box reduction from PQ to TDS learning for any Boolean concept class in the distribution-free setting implies hardness for TDS learning of halfspaces, while membership queries enable efficient PQ learning of halfspaces via iterative Forster transforms.

Sharp Spectral Thresholds for Multi-View Spiked Wigner Models

math.PR · 2026-05-19 · unverdicted · novelty 7.0

The spectral weak-recovery threshold for linearized AMP in the multi-view spiked Wigner model is SNR(λ,B)=1, where SNR is the largest eigenvalue of Diag(√λ)(B⊙B)Diag(√λ), and this coincides with the information-theoretic threshold for a broad class of spike priors.

On efficient robust regression with subquadratic samples

cs.DS · 2026-05-18 · unverdicted · novelty 6.0

A near-linear-time algorithm for robust regression under Gaussian covariates achieves O(√(εκ)) error with Õ(d/ε⁴) samples when εκ ≲ 1; the paper also gives SQ and low-degree lower bounds.

citing papers explorer

Showing 6 of 6 citing papers.

  • Equivalence of Coarse and Fine-Grained Models for Learning with Distribution Shift cs.DS · 2026-05-07 · unverdicted · none · ref 45 · 2 links

    An efficient black-box reduction from PQ to TDS learning for any Boolean concept class in the distribution-free setting implies hardness for TDS learning of halfspaces, while membership queries enable efficient PQ learning of halfspaces via iterative Forster transforms.

  • Sharp Spectral Thresholds for Multi-View Spiked Wigner Models math.PR · 2026-05-19 · unverdicted · none · ref 160

    The spectral weak-recovery threshold for linearized AMP in the multi-view spiked Wigner model is SNR(λ,B)=1, where SNR is the largest eigenvalue of Diag(√λ)(B⊙B)Diag(√λ), and this coincides with the information-theoretic threshold for a broad class of spike priors.

  • Feature Learning in Linear-Width Two-Layer Networks: Two vs. One Step of Gradient Descent stat.ML · 2026-05-18 · unverdicted · none · ref 164 · 2 links

    Two steps of gradient descent on first-layer weights in linear-width two-layer networks produce a spiked random matrix with ⌊α₂/(1/2 − α₁)⌋ outliers, each a learned direction, and batch reuse allows capturing directions with information exponent exceeding one.

  • Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model stat.ML · 2026-05-14 · accept · none · ref 26

    A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.

  • On efficient robust regression with subquadratic samples cs.DS · 2026-05-18 · unverdicted · none · ref 33

    A near-linear-time algorithm for robust regression under Gaussian covariates achieves O(√(εκ)) error with Õ(d/ε⁴) samples when εκ ≲ 1; the paper also gives SQ and low-degree lower bounds.

  • On the Blessing of Pre-training in Weak-to-Strong Generalization cs.LG · 2026-05-07 · unverdicted · none · ref 144

    Pre-training provides a geometric warm start in a single-index model that enables weak-to-strong generalization up to a supervisor-limited bound, with empirical phase-transition evidence in LLMs.