Benign overfitting in linear regression. Proceedings of the National Academy of Sciences, 117(48):30063–30070, 2020.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- LoRA vs. Full Fine-Tuning: A Theoretical Perspective
In linear regression, LoRA can achieve lower excess risk than full fine-tuning when the pretraining-downstream difference is low-rank, and small LoRA ranks can improve generalization by acting as regularization.
- Unveiling Memorization-Generalization Coexistence: A Case Study on Arithmetic Tasks with Label Noise
Experiments on modular arithmetic with heavy label noise show that over-parameterized networks form a distributed internal generalization structure that can be extracted via frequency methods to achieve high accuracy despite 80% noise.