Han et al. · 2025 · arXiv 2508.02668

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Beyond Perplexity: A Geometric and Spectral Study of Low-Rank Pre-Training

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

Low-rank pre-training methods converge to geometrically and spectrally distinct basins from full-rank training and from each other, even at similar validation perplexity.

Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction

cs.LG · 2026-04-01 · conditional · novelty 6.0

SCT pre-trains LLMs by keeping weights as compact SVD factors with Stiefel QR retraction, delivering up to 199x memory reduction per layer and allowing 70B-parameter training on a Steam Deck.

BOOST: BOttleneck-Optimized Scalable Training Framework for Low-Rank Large Language Models

cs.LG · 2025-12-13 · unverdicted · novelty 6.0

BOOST delivers 1.46-2.27x end-to-end speedups for low-rank bottleneck LLMs by redesigning tensor parallelism around the bottleneck structure plus supporting optimizations.

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure

cs.LG · 2025-09-23 · unverdicted · novelty 6.0

CR-Net uses cross-layer low-rank residuals in a dual-path network plus specialized recomputation to outperform prior low-rank methods on 60M-7B model pre-training while using less compute and memory.

Low-Rank Adaptation Redux for Large Models

cs.LG · 2026-04-23 · unverdicted · novelty 3.0

An overview revisits LoRA variants by categorizing advances in architectural design, efficient optimization, and applications while linking them to classical signal processing tools for principled fine-tuning.

citing papers explorer

Beyond Perplexity: A Geometric and Spectral Study of Low-Rank Pre-Training

cs.LG · 2026-05-13 · unverdicted · none · ref 13

Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction

cs.LG · 2026-04-01 · conditional · none · ref 2

BOOST: BOttleneck-Optimized Scalable Training Framework for Low-Rank Large Language Models

cs.LG · 2025-12-13 · unverdicted · none · ref 12

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure

cs.LG · 2025-09-23 · unverdicted · none · ref 36

Low-Rank Adaptation Redux for Large Models

cs.LG · 2026-04-23 · unverdicted · none · ref 107
