arXiv preprint arXiv:2209.13569

Exploring low rank training of deep neural networks · 2022 · arXiv 2209.13569

5 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

cs.LG · 2024-03-06 · conditional · novelty 7.0

GaLore performs full-parameter LLM training with up to 65.5% less optimizer memory by projecting gradients onto a low-rank subspace at each step, matching full-rank performance on LLaMA pre-training and RoBERTa fine-tuning.

Dr. Post-Training: A Data Regularization Perspective on LLM Post-Training

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

Dr. Post-Training reframes general data as a data-induced regularizer for LLM post-training updates, yielding a family of methods that outperform data-selection baselines on SFT, RLHF, and RLVR tasks.

BOOST: BOttleneck-Optimized Scalable Training Framework for Low-Rank Large Language Models

cs.LG · 2025-12-13 · unverdicted · novelty 6.0

BOOST delivers 1.46-2.27x end-to-end speedups for low-rank bottleneck LLMs by redesigning tensor parallelism around the bottleneck structure plus supporting optimizations.

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure

cs.LG · 2025-09-23 · unverdicted · novelty 6.0

CR-Net uses cross-layer low-rank residuals in a dual-path network plus specialized recomputation to outperform prior low-rank methods on 60M-7B model pre-training while using less compute and memory.

DLR: Zero-Inference-Cost Latent Residuals for Low-Rank Pre-Training

cs.LG · 2026-06-27 · unverdicted · novelty 5.0

DLR augments low-rank factorization with a fixed structured residual during training that is absorbed post-training, improving C4 perplexity for LLaMA models from 60M to 7B while preserving exact low-rank inference cost.

citing papers explorer


Dr. Post-Training: A Data Regularization Perspective on LLM Post-Training · cs.LG · 2026-05-08 · unverdicted · none · ref 55
BOOST: BOttleneck-Optimized Scalable Training Framework for Low-Rank Large Language Models · cs.LG · 2025-12-13 · unverdicted · none · ref 7
CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure · cs.LG · 2025-09-23 · unverdicted · none · ref 32
DLR: Zero-Inference-Cost Latent Residuals for Low-Rank Pre-Training · cs.LG · 2026-06-27 · unverdicted · none · ref 26
