arXiv preprint arXiv:2512.23165
6 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (6)
Verdicts: unverdicted (6)
Roles: background (1)
Polarities: background (1)
Citing papers
- ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate
  ARCA assigns token credit in LoRA-based LLM RL from the norm of adapter-induced hidden state changes, yielding non-degenerate distributions and competitive performance on MATH tasks with Qwen3-1.7B under GRPO.
- BoostLoRA: Growing Effective Rank by Boosting Adapters
  BoostLoRA grows effective adapter rank linearly via iterative boosting on hard examples with orthogonal low-rank updates, outperforming both single-shot ultra-low-rank adapters and full fine-tuning on math and code tasks with zero added inference overhead.
- FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning
  FuRA uses block tensor-train factorization with a fixed pretrained SVD basis to achieve full-rank spectral preconditioning, outperforming full fine-tuning by +1.37 on LLaMA-3-8B commonsense reasoning and surpassing QLoRA in quantized settings.
- The Hidden Power of Scaling Factor in LoRA Optimization
  Tuning alpha in LoRA outperforms learning-rate scaling, follows a square-root law with rank, and enables a minimalist LoRA-alpha method that improves performance across tasks.
- On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters
  PEFT adapters are positioned as persistent personal state on foundation models, organized via Scale Up, Scale Down, and Scale Out axes, with MinT as an infrastructure example for managing them.
- Teaching LLMs Brazilian Healthcare: Injecting Knowledge from Official Clinical Guidelines
  A 14B model trained on synthetic data from Brazilian clinical guidelines outperforms larger LLMs on new benchmarks for Brazilian healthcare protocols.