PARAMΔ upcycles dense models to MoE for per-language experts and grafts post-training deltas to enable data-efficient language expansion while preserving original capabilities.
Advances in Neural Information Processing Systems , volume=
6 Pith papers cite this work. Polarity classification is still indexing.
years
2026 6representative citing papers
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
ENMP prunes negative LoRA modules via evolutionary search to boost merging performance to new state-of-the-art levels across language and vision tasks.
MVProbe is a multi-perspective probing framework for weight-space learning that combines first-order and Gram-based views and outperforms ProbeX on the Model Jungle benchmark.
Post-processing via random selection or linear combination of differentially private models allows meeting arbitrary target privacy parameters without additional training.
FedProxy replaces weak adapters with a proxy SLM for federated LLM fine-tuning, outperforming prior methods and approaching centralized performance via compression, heterogeneity-aware aggregation, and training-free fusion.
citing papers explorer
-
Evolutionary Negative Module Pruning for Better LoRA Merging
ENMP prunes negative LoRA modules via evolutionary search to boost merging performance to new state-of-the-art levels across language and vision tasks.