BitFit: Simple parameter-efficient fine-tuning for transformer-based masked language-models

Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg · 2022

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 2

citation-polarity summary

baseline 2

representative citing papers

ChunkFT: Byte-Streamed Optimization for Memory-Efficient Full Fine-Tuning

cs.LG · 2026-05-20 · conditional · novelty 6.0

ChunkFT enables full-parameter fine-tuning of Llama 3-8B on one 24 GB GPU and Llama 3-70B on two 80 GB GPUs by streaming gradients over dynamically activated sub-tensors.

S2FT: Parameter-Efficient Fine-Tuning in Sparse Spectrum Domain

cs.CV · 2026-05-09 · unverdicted · novelty 6.0

S2FT replaces the sparse-spectrum assumption of prior Fourier PEFT with a learned rearrangement that maps a pre-estimated weight change into a domain where few spectral coefficients suffice.

Pretraining Induces a Reusable Spectral Basis for Downstream Task Adaptation

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Pretraining induces stable leading singular vectors that form a reusable spectral basis inherited by downstream tasks, enabling competitive performance with 0.2% trainable parameters on GLUE.

SMoA: Spectrum Modulation Adapter for Parameter-Efficient Fine-Tuning

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

SMoA is a new PEFT adapter that uses block-wise Hadamard-modulated low-rank branches on spectral partitions to cover more pretrained spectral directions than standard LoRA under a smaller parameter budget.

citing papers explorer

Showing 4 of 4 citing papers.

ChunkFT: Byte-Streamed Optimization for Memory-Efficient Full Fine-Tuning

cs.LG · 2026-05-20 · conditional · none · ref 9

S2FT: Parameter-Efficient Fine-Tuning in Sparse Spectrum Domain

cs.CV · 2026-05-09 · unverdicted · none · ref 47

Pretraining Induces a Reusable Spectral Basis for Downstream Task Adaptation

cs.LG · 2026-05-08 · unverdicted · none · ref 13

SMoA: Spectrum Modulation Adapter for Parameter-Efficient Fine-Tuning

cs.LG · 2026-05-20 · unverdicted · none · ref 46
