Language models can use a two-stage sleep process of upward distillation for memory consolidation and RL-based dreaming for unsupervised self-improvement to enable continual learning.
Merging loras like playing lego: Pushing the modularity of lora to extremes through rank-wise clustering
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3representative citing papers
LoRA-Mixer routes modular LoRA experts into attention projection matrices with an adaptive Routing Specialization Loss to improve multi-task performance while using fewer trainable parameters than prior LoRA-MoE methods.
SOLAR introduces a self-optimizing agent using meta-learning on model weights and RL-driven strategy discovery for lifelong adaptation in LLMs, claiming superior performance on reasoning tasks across domains.
citing papers explorer
-
Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories
Language models can use a two-stage sleep process of upward distillation for memory consolidation and RL-based dreaming for unsupervised self-improvement to enable continual learning.
-
SOLAR: A Self-Optimizing Open-Ended Autonomous Agent for Lifelong Learning and Continual Adaptation
SOLAR introduces a self-optimizing agent using meta-learning on model weights and RL-driven strategy discovery for lifelong adaptation in LLMs, claiming superior performance on reasoning tasks across domains.