Enhancing zeroth-order fine-tuning for language models with low-rank structures. arXiv preprint arXiv:2410.07698

Yiming Chen, Yuan Zhang, Liyuan Cao, Kun Yuan, Zaiwen Wen · 2024 · arXiv 2410.07698

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Accelerating Zeroth-Order Spectral Optimization with Partial Orthogonalization from Power Iteration

cs.LG · 2026-05-09 · conditional · novelty 6.0 · 2 refs

ZO-MOPI accelerates zeroth-order LLM fine-tuning by applying partial spectral orthogonalization from power iteration inside a momentum-projected subspace to reduce variance and exploit dominant directions.

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure

cs.LG · 2025-09-23 · unverdicted · novelty 6.0

CR-Net uses cross-layer low-rank residuals in a dual-path network plus specialized recomputation to outperform prior low-rank methods on 60M-7B model pre-training while using less compute and memory.

Low-rank surrogate modeling and stochastic zero-order optimization for training of neural networks with black-box layers

cs.LG · 2025-09-18 · unverdicted · novelty 6.0

A framework combining stochastic zeroth-order optimization and dynamic low-rank surrogate modeling with an implicit projector-splitting integrator enables end-to-end training of hybrid neural networks containing black-box physical layers and reaches near-digital accuracy on vision, audio, and text tasks.

AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments

cs.LG · 2026-05-01 · unverdicted · novelty 5.0

AdaMeZO adapts Adam moment estimates to zeroth-order LLM fine-tuning without extra memory storage, outperforming MeZO with up to 70% fewer forward passes.

citing papers explorer

Showing 4 of 4 citing papers.

Accelerating Zeroth-Order Spectral Optimization with Partial Orthogonalization from Power Iteration
cs.LG · 2026-05-09 · conditional · none · ref 1 · 2 links
ZO-MOPI accelerates zeroth-order LLM fine-tuning by applying partial spectral orthogonalization from power iteration inside a momentum-projected subspace to reduce variance and exploit dominant directions.

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure
cs.LG · 2025-09-23 · unverdicted · none · ref 10
CR-Net uses cross-layer low-rank residuals in a dual-path network plus specialized recomputation to outperform prior low-rank methods on 60M-7B model pre-training while using less compute and memory.

Low-rank surrogate modeling and stochastic zero-order optimization for training of neural networks with black-box layers
cs.LG · 2025-09-18 · unverdicted · none · ref 18
A framework combining stochastic zeroth-order optimization and dynamic low-rank surrogate modeling with an implicit projector-splitting integrator enables end-to-end training of hybrid neural networks containing black-box physical layers and reaches near-digital accuracy on vision, audio, and text tasks.

AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments
cs.LG · 2026-05-01 · unverdicted · none · ref 55
AdaMeZO adapts Adam moment estimates to zeroth-order LLM fine-tuning without extra memory storage, outperforming MeZO with up to 70% fewer forward passes.
