LoRA learns less and forgets less. Transactions on Machine Learning Research

Dan Biderman, Jacob Portes, Jose Javier Gonzalez Ortiz, Mansheej Paul, Philip Greengard, Connor Jennings, Daniel King, Sam Havens, Vitaliy Chiley, Jonathan Frankle, Cody Blakeney, John Patrick Cunningham · 2024

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Reasoning-Trace Collapse: Evaluating the Loss of Explicit Reasoning During Fine-Tuning

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

Fine-tuning reasoning models on answer-only data induces reasoning-trace collapse where valid traces disappear while answer performance stays high, and simple loss-masking can mitigate it.

LoRA vs. Full Fine-Tuning: A Theoretical Perspective

cs.LG · 2026-05-18 · unverdicted · novelty 5.0

In linear regression, LoRA can achieve lower excess risk than full fine-tuning when the pretraining-downstream difference is low-rank, and small LoRA ranks can improve generalization by acting as regularization.

citing papers explorer

Showing 2 of 2 citing papers.

Reasoning-Trace Collapse: Evaluating the Loss of Explicit Reasoning During Fine-Tuning · cs.LG · 2026-05-20 · unverdicted · none · ref 3
Fine-tuning reasoning models on answer-only data induces reasoning-trace collapse where valid traces disappear while answer performance stays high, and simple loss-masking can mitigate it.

LoRA vs. Full Fine-Tuning: A Theoretical Perspective · cs.LG · 2026-05-18 · unverdicted · none · ref 3
In linear regression, LoRA can achieve lower excess risk than full fine-tuning when the pretraining-downstream difference is low-rank, and small LoRA ranks can improve generalization by acting as regularization.
