arXiv preprint arXiv:2209.13569

5 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.LG 5

representative citing papers

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

cs.LG · 2024-03-06 · conditional · novelty 7.0

GaLore performs full-parameter LLM training with up to 65.5% less optimizer memory by projecting gradients onto a low-rank subspace at each step, matching full-rank performance on LLaMA pre-training and RoBERTa fine-tuning.

DLR: Zero-Inference-Cost Latent Residuals for Low-Rank Pre-Training

cs.LG · 2026-06-27 · unverdicted · novelty 5.0

DLR augments low-rank factorization with a fixed structured residual during training that is absorbed post-training, improving C4 perplexity for LLaMA models from 60M to 7B while preserving exact low-rank inference cost.

citing papers explorer

Showing 4 of 4 citing papers after filters.