ChunkFT enables full-parameter fine-tuning of Llama 3-8B on one 24 GB GPU and Llama 3-70B on two 80 GB GPUs by streaming gradients over dynamically activated sub-tensors.
Bitfit: Simple parameter-efficient fine-tuning for transformer-based masked language-models
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4roles
baseline 2polarities
baseline 2representative citing papers
S2FT replaces the sparse-spectrum assumption of prior Fourier PEFT with a learned rearrangement that maps a pre-estimated weight change into a domain where few spectral coefficients suffice.
Pretraining induces stable leading singular vectors that form a reusable spectral basis inherited by downstream tasks, enabling competitive performance with 0.2% trainable parameters on GLUE.
SMoA is a new PEFT adapter that uses block-wise Hadamard-modulated low-rank branches on spectral partitions to cover more pretrained spectral directions than standard LoRA under a smaller parameter budget.
citing papers explorer
-
ChunkFT: Byte-Streamed Optimization for Memory-Efficient Full Fine-Tuning
ChunkFT enables full-parameter fine-tuning of Llama 3-8B on one 24 GB GPU and Llama 3-70B on two 80 GB GPUs by streaming gradients over dynamically activated sub-tensors.
-
S2FT: Parameter-Efficient Fine-Tuning in Sparse Spectrum Domain
S2FT replaces the sparse-spectrum assumption of prior Fourier PEFT with a learned rearrangement that maps a pre-estimated weight change into a domain where few spectral coefficients suffice.
-
Pretraining Induces a Reusable Spectral Basis for Downstream Task Adaptation
Pretraining induces stable leading singular vectors that form a reusable spectral basis inherited by downstream tasks, enabling competitive performance with 0.2% trainable parameters on GLUE.
-
SMoA: Spectrum Modulation Adapter for Parameter-Efficient Fine-Tuning
SMoA is a new PEFT adapter that uses block-wise Hadamard-modulated low-rank branches on spectral partitions to cover more pretrained spectral directions than standard LoRA under a smaller parameter budget.