arxiv: 2210.11466 · v3 · pith:AE3UUO5Nnew · submitted 2022-10-20 · 💻 cs.LG · cs.AI

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

keywords: fine-tuning, distribution, layers, shift, information, learned, shifts, subset

A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task. This paper shows that in such settings, selectively fine-tuning a subset of layers (which we term surgical fine-tuning) matches or outperforms commonly used fine-tuning approaches. Moreover, the type of distribution shift influences which subset is more effective to tune: for example, for image corruptions, fine-tuning only the first few layers works best. We validate our findings systematically across seven real-world data tasks spanning three types of distribution shifts. Theoretically, we prove that for two-layer neural networks in an idealized setting, first-layer tuning can outperform fine-tuning all layers. Intuitively, fine-tuning more parameters on a small target dataset can cause information learned during pre-training to be forgotten, and the relevant information depends on the type of shift.
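The core mechanism, updating only a chosen subset of layers while the rest stay frozen at their pre-trained values, can be illustrated with a toy sketch. This is not the paper's implementation: the scalar two-layer linear model, the names `surgical_step` and `trainable`, the learning rate, and the toy data are all assumptions made for illustration, and the example does not reproduce any of the paper's results.

```python
# Minimal sketch of surgical fine-tuning on a toy two-layer linear model.
# Hypothetical names and data; not the authors' code.

def forward(params, x):
    """Two-layer scalar linear model: y = layer2 * (layer1 * x)."""
    h = params["layer1"] * x
    return params["layer2"] * h

def gradients(params, x, y):
    """Gradients of 0.5 * (pred - y)^2 with respect to each layer's weight."""
    h = params["layer1"] * x
    err = params["layer2"] * h - y
    return {
        "layer1": err * params["layer2"] * x,
        "layer2": err * h,
    }

def surgical_step(params, data, trainable, lr=0.1):
    """One pass of SGD that updates only the layers named in `trainable`."""
    new_params = dict(params)  # copy so the pre-trained weights are untouched
    for x, y in data:
        grads = gradients(new_params, x, y)
        for name in trainable:
            new_params[name] -= lr * grads[name]
    return new_params

def loss(params, data):
    """Mean squared-error loss over the dataset."""
    return sum(0.5 * (forward(params, x) - y) ** 2 for x, y in data) / len(data)

# Assumed "pre-trained" weights and a small target dataset where y = x.
pretrained = {"layer1": 1.0, "layer2": 2.0}
target_data = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0)]

# Tune only the first layer; the second layer keeps its pre-trained value.
tuned = pretrained
for _ in range(50):
    tuned = surgical_step(tuned, target_data, trainable={"layer1"})

print(tuned, loss(pretrained, target_data), loss(tuned, target_data))
```

Passing `trainable={"layer1", "layer2"}` instead would correspond to full fine-tuning in this toy setting, and `trainable={"layer2"}` to tuning only the last layer.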

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Sample-wise Targeted Adversarial Attacks on Test-time Adaptation

    cs.LG 2026-05 unverdicted novelty 6.0

    Proposes a meta-learning attack with priority-aware gradient alignment for sample-wise targeted attacks on TTA that keep the label distribution consistent with the no-attack baseline.

  2. Selective, Regularized, and Calibrated: Harnessing Vision Foundation Models for Cross-Domain Few-Shot Semantic Segmentation

    cs.CV 2026-05 unverdicted novelty 6.0

    HERA is a select-regularize-calibrate framework adapting frozen vision foundation models for cross-domain few-shot semantic segmentation via hierarchical layer selection with ETR, prior-guided regularization, and pixe...

  3. Intermediate Representations are Strong AI-Generated Image Detectors

    cs.CV 2026-05 unverdicted novelty 6.0

    Intermediate layer embedding sensitivity to perturbations distinguishes AI-generated images from real ones, yielding higher AUROC on GenImage and Forensics Small benchmarks than prior methods.

  4. Task-agnostic Low-rank Residual Adaptation for Efficient Federated Continual Fine-Tuning

    cs.LG 2025-05 unverdicted novelty 6.0

    Fed-TaLoRA uses task-agnostic low-rank residual adaptation with post-aggregation calibration to enable efficient federated continual fine-tuning across sequential tasks under non-IID conditions.

  5. Generalizable Deepfake Detection Based on Forgery-aware Layer Masking and Multi-artifact Subspace Decomposition

    cs.CV 2026-01 unverdicted novelty 5.0

    FMSD improves cross-dataset generalization in deepfake detection by using gradient-based layer masking to select forgery-sensitive weights and SVD to split them into preserved semantic and multiple learnable artifact ...