Sparse MeZO: Less parameters for better performance in zeroth-order LLM fine-tuning
1 Pith paper cites this work. Field: cs.LG (1). Year: 2025 (1). Verdict: UNVERDICTED (1).
Representative citing paper:
On-Device Fine-Tuning via Backprop-Free Zeroth-Order Optimization
MeZO enables larger models for on-device fine-tuning by estimating gradients via forward passes only, with theoretical size estimates and numerical results showing accuracy benefits when wall-clock time is sufficient.
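
To make "estimating gradients via forward passes only" concrete, here is a minimal sketch of a two-point (SPSA-style) zeroth-order update, which is our understanding of the general idea behind MeZO. It is not taken from either paper's code. The function name `zo_gradient_step`, the toy `quadratic_loss`, and the values chosen for `lr` and `eps` are all illustrative assumptions. The random direction is regenerated from a seed rather than stored, so the sketch keeps no copy of the perturbation.

```python
import random


def zo_gradient_step(params, loss_fn, lr=0.01, eps=1e-3, seed=0):
    """One zeroth-order SGD step on a list of floats, updated in place.

    Hypothetical helper for illustration: only loss_fn evaluations
    (forward passes) are used, and no backpropagation takes place.
    """
    def perturb(scale):
        # Re-create the same Gaussian direction z from the seed each time,
        # so z never has to be kept in memory.
        rng = random.Random(seed)
        for i in range(len(params)):
            params[i] += scale * eps * rng.gauss(0.0, 1.0)

    perturb(+1.0)                  # params + eps * z
    loss_plus = loss_fn(params)
    perturb(-2.0)                  # params - eps * z
    loss_minus = loss_fn(params)
    perturb(+1.0)                  # back to the original params

    # Finite-difference estimate of the directional derivative along z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)

    # Descend along z, scaled by the estimated directional derivative.
    rng = random.Random(seed)
    for i in range(len(params)):
        params[i] -= lr * projected_grad * rng.gauss(0.0, 1.0)
    return projected_grad


def quadratic_loss(p):
    # Toy stand-in for a model's loss (an assumption for this sketch).
    return sum(x * x for x in p)


params = [1.0, -2.0, 0.5]
initial_loss = quadratic_loss(params)
for step in range(500):
    zo_gradient_step(params, quadratic_loss, seed=step)
final_loss = quadratic_loss(params)
```

Each step costs two loss evaluations and no stored activations. That is the trade suggested by the summary above: lower memory use in exchange for more steps, and therefore more wall-clock time.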