arxiv: 2605.11416 · 2 revisions
Freeze Deep, Train Shallow: Interpretable Layer Allocation for Continued Pre-Training