The Mechanism of Weak-to-Strong Generalization: Feature Elicitation from Latent Knowledge
Pith reviewed 2026-05-14 19:02 UTC · model grok-4.3
The pith
A strong neural network learns a target task from weak-model outputs by eliciting its own pre-trained feature direction rather than overwriting it.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the setting of reward-model learning with two-layer neural networks, a strong model whose pre-trained representations lie in low-dimensional subspaces V_k acquires the target feature direction for task κ through multi-step SGD under weak-model supervision, thereby learning the task while retaining general capabilities and preserving off-target features, even when those features are correlated with the target.
What carries the argument
Low-dimensional subspaces V_k organizing the strong model's pre-trained representations, which weak-to-strong training uses to elicit the target feature direction for task κ.
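A minimal numerical sketch of this geometric picture, with all names and dimensions (d, K, r, m, kappa) chosen for illustration rather than taken from the paper: pre-trained neurons whose weight vectors lie in disjoint low-dimensional subspaces V_k, and a target task that depends only on a direction inside V_κ.

```python
# Toy instance of the assumed geometry (illustrative, not the paper's model).
import numpy as np

rng = np.random.default_rng(0)
d, K, r, m = 128, 4, 8, 64   # ambient dim, number of tasks, subspace dim, width

# Orthonormal bases for K disjoint subspaces V_k (here, blocks of coordinates).
V = [np.eye(d)[:, k * r:(k + 1) * r] for k in range(K)]

# Pre-trained neurons: each weight vector lies in exactly one subspace V_k.
assignment = rng.integers(0, K, size=m)
W_pre = np.stack([V[k] @ rng.standard_normal(r) for k in assignment])  # (m, d)
W_pre /= np.linalg.norm(W_pre, axis=1, keepdims=True)

# Target task: a single-index function of a direction beta inside V_kappa.
kappa = 0
beta = V[kappa] @ rng.standard_normal(r)
beta /= np.linalg.norm(beta)
target = lambda X: np.maximum(X @ beta, 0.0)  # e.g. a ReLU single-index target
```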
If this is right
- The strong model acquires the target feature direction through W2S training rather than receiving it a priori; a minimal training-loop sketch follows this list.
- W2S training preserves pre-trained off-target features even when they correlate with the target direction.
- Standard supervised fine-tuning produces catastrophic forgetting of correlated off-target features.
- W2S generalization holds in the feature-learning regime for two-layer networks under reward-model learning.
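To make the first bullet concrete, here is a hedged sketch of the W2S protocol under the same toy assumptions as above: a low-capacity weak teacher is fit on gold labels for the target task, and the strong student is then fine-tuned by multi-step SGD on the weak teacher's outputs rather than on gold labels. The architecture, loss, and hyperparameters are illustrative stand-ins, not the paper's exact construction.

```python
# Toy W2S loop (illustrative protocol and hyperparameters, not the paper's).
import numpy as np

rng = np.random.default_rng(1)
d, n, m, lr, steps = 128, 4096, 64, 0.5, 500

beta = np.zeros(d); beta[0] = 1.0                    # target direction in V_kappa
X = rng.standard_normal((n, d))
y_gold = np.maximum(X @ beta, 0.0)                   # gold labels for task kappa

# Weak teacher: a coarse linear probe fit on gold labels, then clipped.
w_weak = np.linalg.lstsq(X, y_gold, rcond=None)[0]
y_weak = np.maximum(X @ w_weak, 0.0)                 # imperfect weak supervision

# Strong student: two-layer ReLU network; first layer trained, readout fixed.
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = np.full(m, 1.0 / m)

for _ in range(steps):                               # multi-step SGD on weak labels
    idx = rng.integers(0, n, size=64)
    Xb, yb = X[idx], y_weak[idx]
    H = np.maximum(Xb @ W.T, 0.0)                    # hidden activations (batch, m)
    resid = H @ a - yb                               # error against weak labels
    grad_W = ((resid[:, None] * a) * (Xb @ W.T > 0)).T @ Xb / len(idx)
    W -= lr * grad_W

# Did any neuron align with the target direction during training?
print("max |cos(w_j, beta)|:", np.abs(W @ beta / np.linalg.norm(W, axis=1)).max())
```

The quantity printed at the end is the elicitation signal: under the paper's claim, it should grow over the course of W2S training rather than be present at initialization.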
Where Pith is reading between the lines
- If deeper networks maintain comparable subspace organization, the same elicitation mechanism could scale beyond two-layer models.
- Alignment procedures that rely on weak supervisors may reduce capability loss by eliciting rather than overwriting latent features.
- Experiments that deliberately entangle feature directions would test whether the low-dimensional subspace assumption is necessary for the observed preservation effect; one such construction is sketched below.
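One way to build that entanglement probe (our construction, not the paper's): generate an off-target direction with a controlled correlation rho to the target, then vary rho and measure how much of the off-target direction survives fine-tuning.

```python
# Controlled-correlation construction for the entanglement probe (our design).
import numpy as np

def correlated_direction(beta, rho, rng):
    """Unit vector u with <u, beta> = rho (beta is assumed unit-norm)."""
    noise = rng.standard_normal(beta.shape)
    noise -= (noise @ beta) * beta          # remove the component along beta
    noise /= np.linalg.norm(noise)
    return rho * beta + np.sqrt(1.0 - rho**2) * noise

rng = np.random.default_rng(2)
d = 128
beta = np.zeros(d); beta[0] = 1.0
for rho in (0.0, 0.5, 0.9):
    u = correlated_direction(beta, rho, rng)
    print(f"rho={rho:.1f}  cos(u, beta)={u @ beta:+.3f}  |u|={np.linalg.norm(u):.3f}")
```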
Load-bearing premise
The strong model's pre-trained representations are organized into distinct low-dimensional subspaces separating target and off-target features.
What would settle it
A simulation in which the strong model's representations lack low-dimensional subspace structure would show either failure to acquire the target feature or loss of off-target capabilities under the same weak-to-strong training.
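A sketch of that settling simulation under the same toy assumptions as above (an assumption-breaking control of our own design, not an experiment from the paper): destroy the subspace organization by applying a random orthogonal rotation to the pre-trained weights, so that no neuron lies in any task-defining subspace V_k, then re-run the W2S training unchanged.

```python
# Assumption-breaking control: scramble the subspace structure, then re-run W2S.
import numpy as np

rng = np.random.default_rng(3)
d, K, r, m = 128, 4, 8, 64
V = [np.eye(d)[:, k * r:(k + 1) * r] for k in range(K)]
W_pre = np.stack([V[k] @ rng.standard_normal(r)
                  for k in rng.integers(0, K, size=m)])  # subspace-organized

Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal rotation
W_scrambled = W_pre @ Q  # neurons no longer lie in any task-defining V_k

# Prediction to test: running the same W2S training from W_scrambled should
# either fail to acquire the target direction or erase off-target features,
# if the low-dimensional subspace structure is truly load-bearing.
```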
Original abstract
Weak-to-strong (W2S) generalization, in which a strong model is fine-tuned on outputs of a weaker, task-specialized model, has been proposed as an approach to aligning superhuman AI systems. Existing theoretical analyses either fix the student's representations or operate in restricted settings. Whether multi-step SGD can succeed in feature learning while preserving diverse pre-trained capabilities remains open. We study W2S in the setting of reward-model learning with two-layer neural networks. The strong model has pre-trained representations organized into low-dimensional subspaces $V_k$, and is fine-tuned under the supervision of a weak model specialized on task $\kappa$. We prove that the strong model efficiently learns task $\kappa$, eliciting its pre-trained knowledge while retaining general capabilities. This establishes W2S generalization in the feature-learning regime, in the sense that the strong model acquires the target feature direction through W2S training, rather than having it given a priori. Moreover, W2S preserves pre-trained off-target features, whereas standard supervised fine-tuning causes catastrophic forgetting when off-target feature directions are correlated with the target's. Numerical experiments on synthetic data confirm our theoretical results.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to prove that, in reward-model learning with two-layer neural networks, a strong model whose pre-trained representations are organized into low-dimensional subspaces V_k can be fine-tuned via multi-step SGD under supervision from a weak model specialized on task κ. The strong model thereby acquires the target feature direction from its latent knowledge (establishing W2S generalization in the feature-learning regime) while preserving off-target directions; standard supervised fine-tuning, by contrast, produces catastrophic forgetting when off-target directions are correlated with the target. Synthetic experiments are said to confirm the theoretical predictions.
Significance. If the central derivation holds, the work supplies a concrete, parameter-free mechanism explaining how W2S training elicits task-relevant features from pre-organized subspaces without a priori provision of the target direction, while simultaneously protecting general capabilities—an issue left open by prior analyses that either fix representations or restrict the setting. The explicit contrast with catastrophic forgetting under standard SFT is a clear strength, and the restriction to two-layer networks and reward-model learning is stated up front, making the scope transparent. The result therefore offers a useful foundation for understanding alignment of superhuman models, provided the two-layer analysis can be lifted.
minor comments (3)
- [Abstract] The abstract and introduction should state the two-layer and reward-model assumptions more explicitly at the outset, as these delimit the entire analysis.
- [Experiments] The synthetic experiments are referenced as confirmation but lack sufficient detail on data generation, exact subspace construction, and quantitative metrics (e.g., cosine similarity to the target direction or off-target retention; sketched after this list); adding these would strengthen reproducibility.
- [Setup] Notation for the subspaces V_k and the specialization of the weak model on κ could be accompanied by a small illustrative diagram in the setup section to aid readers unfamiliar with the geometric picture.
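In sketch form, the two metrics the experiments comment asks for, under our own definitions rather than necessarily the paper's: alignment of the learned weights with the target direction, and retention of an off-target direction relative to the pre-trained weights.

```python
# Sketch metric definitions (ours; the paper may define them differently).
import numpy as np

def target_alignment(W, beta):
    """Largest |cosine similarity| between any neuron row of W and beta."""
    cos = (W @ beta) / (np.linalg.norm(W, axis=1) * np.linalg.norm(beta))
    return np.abs(cos).max()

def off_target_retention(W_before, W_after, u):
    """Weight energy along off-target direction u, after vs. before tuning."""
    return np.linalg.norm(W_after @ u) / max(np.linalg.norm(W_before @ u), 1e-12)
```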
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the manuscript and for recommending minor revision. The referee's summary correctly captures our central claims regarding weak-to-strong generalization in the feature-learning regime for two-layer networks. We will incorporate the three minor suggestions in revision. One direction remains out of scope for the present work:
- Extending the two-layer analysis to deeper networks or general architectures, as the proofs rely on the specific low-dimensional subspace structure and update dynamics available only in the two-layer setting.
Circularity Check
Derivation is self-contained with no circular reductions
full rationale
The paper's central result is a proof that multi-step SGD on a two-layer network under weak supervision elicits the target feature direction from explicitly assumed, pre-organized low-dimensional subspaces V_k while preserving off-target directions. The derivation begins from the stated model architecture, weak-model specialization on task κ, and reward-model learning dynamics, then produces the feature-acquisition guarantee directly from those inputs. No step reduces, by the paper's own equations, to a fitted quantity renamed as a prediction; no load-bearing self-citation chain supports the existence or uniqueness claim; and the argument stays scoped to its stated assumptions, with no self-definition or ansatz smuggling. This is the normal case of an internally consistent theoretical derivation.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Pre-trained representations of the strong model are organized into low-dimensional subspaces V_k
- domain assumption The weak model is specialized on task κ
Reference graph
Works this paper leans on
- [1] Emmanuel Abbe, Enric Boix Adsera, and Theodor Misiakiewicz. SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics. In Conference on Learning Theory (COLT), volume 195, pages 2552--2623. PMLR, 2023.
- [2] Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. GPT-4 technical report, 2023. arXiv:2303.08774.
- [3] Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, and Ludovic Stephan. Repetita iuvant: Data repetition allows SGD to learn high-dimensional multi-index functions. In High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning, 2024.
- [4] Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, and Andrej Risteski. A latent variable model approach to PMI-based word embeddings. Transactions of the Association for Computational Linguistics, 4:385--399, 2016.
- [5] Jimmy Ba, Murat A Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, and Greg Yang. High-dimensional asymptotics of feature learning: How one gradient step improves the representation. In Advances in Neural Information Processing Systems (NeurIPS), volume 35, pages 37932--37946, 2022.
- [6] Jimmy Ba, Murat A Erdogdu, Taiji Suzuki, Zhichao Wang, and Denny Wu. Learning in the presence of low-dimensional structure: A spiked random matrix perspective. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
- [7] Daniel Beaglehole, Adityanarayanan Radhakrishnan, Enric Boix-Adserà, and Mikhail Belkin. Toward universal steering and monitoring of AI models. Science, 391(6787):787--792, 2026.
- [8] Gérard Ben Arous, Reza Gheissari, and Aukosh Jagannath. Online stochastic gradient descent on non-convex losses from high-dimensional inference. Journal of Machine Learning Research, 22(106):1--51, 2021.
- [9] Gérard Ben Arous, Murat A Erdogdu, Nuri Mert Vural, and Denny Wu. Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
- [10] Raphaël Berthier, Andrea Montanari, and Kangjie Zhou. Learning time-scales in two-layers neural networks. Foundations of Computational Mathematics, pages 1--84, 2024.
- [11] Alberto Bietti, Joan Bruna, Clayton Sanford, and Min Jae Song. Learning single-index models with shallow neural networks. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
- [12] Nader H Bshouty and Vitaly Feldman. On using extended statistical queries to avoid membership queries. Journal of Machine Learning Research, 2(Feb):359--395, 2002.
- [13] Collin Burns, Pavel Izmailov, Jan Hendrik Kirchner, Bowen Baker, Leo Gao, Leopold Aschenbrenner, Yining Chen, Adrien Ecoffet, Manas Joglekar, Jan Leike, Ilya Sutskever, and Jeffrey Wu. Weak-to-strong generalization: Eliciting strong capabilities with weak supervision, 2023. arXiv:2312.09390.
- [14] Seok-Ho Chang, Pamela C Cosman, and Laurence B Milstein. Chernoff-type bounds for the Gaussian error function. IEEE Transactions on Communications, 59(11):2939--2944, 2011.
- [15] Moses Charikar, Chirag Pabbaraju, and Kirankumar Shiragur. Quantifying the gain in weak-to-strong generalization. In Advances in Neural Information Processing Systems (NeurIPS), volume 37, pages 126474--126499, 2024.
- [16] Sitan Chen and Raghu Meka. Learning polynomials in few relevant dimensions. In Conference on Learning Theory (COLT), volume 125, pages 1161--1227. PMLR, 2020.
- [17] Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, and Furu Wei. Knowledge neurons in pretrained transformers. In Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
- [18] Alex Damian, Eshaan Nichani, Rong Ge, and Jason D Lee. Smoothing the landscape boosts the signal for SGD: Optimal sample complexity for learning single index models. In Advances in Neural Information Processing Systems (NeurIPS), volume 36, 2024a.
- [19] Alex Damian, Loucas Pillaud-Vivien, Jason Lee, and Joan Bruna. Computational-statistical gaps in Gaussian single-index models (extended abstract). In Conference on Learning Theory (COLT), volume 247 of Proceedings of Machine Learning Research, pages 1262--1262, 2024b. Full version available at arXiv:2403.05529.
- [20] Alexandru Damian, Jason D. Lee, and Mahdi Soltanolkotabi. Neural networks can learn representations with gradient descent. In Conference on Learning Theory (COLT), volume 178, pages 5413--5452. PMLR, 2022.
- [21] Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, and Florent Krzakala. The benefits of reusing batches for gradient descent in two-layer networks: Breaking the curse of information and leap exponents. In International Conference on Machine Learning (ICML), 2024.
- [22] Yijun Dong, Yicheng Li, Yunai Li, Jason D. Lee, and Qi Lei. Discrepancies are virtue: Weak-to-strong generalization through lens of intrinsic dimension. In International Conference on Machine Learning (ICML), 2025.
- [23] Rishabh Dudeja and Daniel Hsu. Learning single-index models in Gaussian space. In Conference on Learning Theory (COLT), volume 75, pages 1887--1930, 2018.
- [24] Rishabh Dudeja and Daniel Hsu. Statistical-computational trade-offs in tensor PCA and related problems via communication complexity. The Annals of Statistics, 52(1):131--156, 2024.
- [25] Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, and Christopher Olah. Toy models of superposition, 2022. arXiv:2209.10652.
- [26] Margalit Glasgow, Denny Wu, and Joan Bruna. Propagation of chaos in one-hidden-layer neural networks beyond logarithmic time, 2025. arXiv:2504.13110.
- [27] Wes Gurnee and Max Tegmark. Language models represent space and time. In International Conference on Learning Representations (ICLR), 2024.
- [28] Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, and David Bau. Linearity of relation decoding in transformer language models. In International Conference on Learning Representations (ICLR), 2024.
- [29] Naoki Hiratani. Disentangling and mitigating the impact of task similarity for continual learning. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
- [30] Muhammed Emrullah Ildiz, Halil Alperen Gozeten, Ege Onur Taga, Marco Mondelli, and Samet Oymak. High-dimensional analysis of knowledge distillation: Weak-to-strong generalization and scaling laws. In International Conference on Learning Representations (ICLR), 2025.
- [31] Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Tianyi Qiu, Juntao Dai, and Yaodong Yang. Aligner: Efficient alignment by learning to correct. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
- [32] Nirmit Joshi, Theodor Misiakiewicz, and Nathan Srebro. On the complexity of learning sparse functions with statistical and gradient queries. In Advances in Neural Information Processing Systems (NeurIPS), 2024.
- [33] Suhas Kotha, Jacob Mitchell Springer, and Aditi Raghunathan. Understanding catastrophic forgetting in language models via implicit inference. In International Conference on Learning Representations (ICLR), 2024.
- [34] Hunter Lang, David Sontag, and Aravindan Vijayaraghavan. Theoretical analysis of weak-to-strong generalization. In Advances in Neural Information Processing Systems (NeurIPS), volume 37, pages 46837--46880, 2024.
- [35] Jason D. Lee, Kazusato Oko, Taiji Suzuki, and Denny Wu. Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit. In Advances in Neural Information Processing Systems (NeurIPS), volume 37, pages 58716--58756, 2024. doi:10.52202/079017-1872.
- [36] Yun Luo, Zhen Yang, Fandong Meng, Yafu Li, Jie Zhou, and Yue Zhang. An empirical study of catastrophic forgetting in large language models during continual fine-tuning. IEEE Transactions on Audio, Speech and Language Processing, 33:3776--3786, 2025.
- [37] Arvind Mahankali, Haochen Zhang, Kefan Dong, Margalit Glasgow, and Tengyu Ma. Beyond NTK with vanilla gradient descent: A mean-field analysis of neural networks with polynomial width, samples, and time. In Advances in Neural Information Processing Systems (NeurIPS), volume 36, 2023.
- [38] Marko Medvedev, Kaifeng Lyu, Dingli Yu, Sanjeev Arora, Zhiyuan Li, and Nathan Srebro. Weak-to-strong generalization even in random feature networks, provably. In International Conference on Machine Learning (ICML), 2025.
- [39] Jack Merullo, Carsten Eickhoff, and Ellie Pavlick. Language models implement simple word2vec-style vector arithmetic. In Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
- [40] Tomás Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic regularities in continuous space word representations. In Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2013.
- [41] Behrad Moniri and Hamed Hassani. On the mechanisms of weak-to-strong generalization: A theoretical perspective. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
- [42] Behrad Moniri, Donghwan Lee, Hamed Hassani, and Edgar Dobriban. A theory of non-linear feature learning with one gradient step in two-layer neural networks. In International Conference on Machine Learning (ICML), volume 235, pages 36106--36159. PMLR, 2024.
- [43] Alireza Mousavi-Hosseini, Sejun Park, Manuela Girotti, Ioannis Mitliagkas, and Murat A Erdogdu. Neural networks efficiently learn low-dimensional representations with SGD. In International Conference on Learning Representations (ICLR), 2022.
- [44] Alireza Mousavi-Hosseini, Denny Wu, Taiji Suzuki, and Murat A. Erdogdu. Gradient-based feature learning under structured data. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
- [45] Neel Nanda, Andrew Lee, and Martin Wattenberg. Emergent linear representations in world models of self-supervised sequence models. In BlackboxNLP Workshop at Empirical Methods in Natural Language Processing (BlackboxNLP@EMNLP), 2023.
- [46] Naoki Nishikawa, Yujin Song, Kazusato Oko, Denny Wu, and Taiji Suzuki. Nonlinear transformers can perform inference-time feature learning. In International Conference on Machine Learning (ICML), 2025.
- [47] Junsoo Oh, Jerry Song, and Chulhee Yun. From linear to nonlinear: Provable weak-to-strong generalization through feature learning. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
- [48] Kazusato Oko, Yujin Song, Taiji Suzuki, and Denny Wu. Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations. In Conference on Learning Theory (COLT), volume 247, pages 4009--4081, 2024a.
- [49] Kazusato Oko, Yujin Song, Taiji Suzuki, and Denny Wu. Pretrained transformer efficiently learns low-dimensional target functions in-context. In Advances in Neural Information Processing Systems (NeurIPS), 2024b.
- [50] Abhishek Panigrahi, Nikunj Saunshi, Haoyu Zhao, and Sanjeev Arora. Task-specific skill localization in fine-tuned language models. In International Conference on Learning Representations (ICLR), 2023.
- [51] Kiho Park, Yo Joong Choe, and Victor Veitch. The linear representation hypothesis and the geometry of large language models. In International Conference on Machine Learning (ICML), 2024.
- [52] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019. URL https://storage.prod.researchhub.com/uploads/papers/2020/06/01/language-models.pdf.
- [53] Yunwei Ren, Eshaan Nichani, Denny Wu, and Jason D. Lee. Emergence and scaling laws in SGD learning of shallow neural networks. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
- [54] Changho Shin, John Cooper, and Frederic Sala. Weak-to-strong generalization through the data-centric lens. In International Conference on Learning Representations (ICLR), 2025.
- [55] Berfin Simsek, Amire Bendjeddou, and Daniel Hsu. Learning Gaussian multi-index models with gradient flow: Time complexity and directional convergence. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2025.
- [56] Leitian Tao and Yixuan Li. Your weak LLM is secretly a strong teacher for alignment. In International Conference on Learning Representations (ICLR), 2025.
- [57] Alexander Matt Turner, Lisa Thiergart, Gavin Leech, David Udell, Juan J Vazquez, Ulisse Mini, and Monte MacDiarmid. Steering language models with activation engineering, 2023. arXiv:2308.10248.
- [58] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2018.
- [59] Yihan Wang, Si Si, Daliang Li, Michal Lukasik, Felix X. Yu, Cho-Jui Hsieh, Inderjit S. Dhillon, and Sanjiv Kumar. Two-stage LLM fine-tuning with less specialization and more generalization. In International Conference on Learning Representations (ICLR), 2024.
- [60] David Xing Wu and Anant Sahai. Provable weak-to-strong generalization via benign overfitting. In International Conference on Learning Representations (ICLR), 2025.
- [61] Yihao Xue, Jiping Li, and Baharan Mirzasoleiman. Representations shape weak-to-strong generalization: Theoretical insights and empirical predictions. In International Conference on Machine Learning (ICML), 2025.
- [62] Y. Yu, T. Wang, and R. J. Samworth. A useful variant of the Davis-Kahan theorem for statisticians. Biometrika, 102(2):315--323, 2015.