SIAM Journal on Mathematics of Data Science

· 2020 · arXiv 1903.07571

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Double Descent in Quantum Kernel Ridge Regression

quant-ph · 2026-04-19 · unverdicted · novelty 6.0

Quantum kernel ridge regression shows double descent in test risk, with the interpolation peak suppressible by regularization, via random matrix theory asymptotics in the high-dimensional limit.

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

cs.LG · 2024-01-02 · unverdicted · novelty 6.0

SPIN strengthens a weak LLM by having it generate training data from its own previous versions and then training it to prefer human-annotated responses over those self-generated outputs; on benchmarks it outperforms DPO even when DPO is supplemented with extra GPT-4 data.

Asymmetric Scaling Laws from Sparse Features

stat.ML · 2026-05-22 · unverdicted · novelty 5.0

A sparse-activation model predicts double-descent loss with distinct under- and over-parameterized scaling exponents set by sparsity, plus a compute-optimal frontier favoring dataset growth.

citing papers explorer

Showing 3 of 3 citing papers.

Double Descent in Quantum Kernel Ridge Regression · quant-ph · 2026-04-19 · unverdicted · none · ref 3
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models · cs.LG · 2024-01-02 · unverdicted · none · ref 102
Asymmetric Scaling Laws from Sparse Features · stat.ML · 2026-05-22 · unverdicted · none · ref 47
