Model merging is cast as PoE inference with EBM experts, revealing Gaussian assumptions in prior work and proposing convergent Cauchy experts that improve empirical performance.
Advances in Neural Information Processing Systems , volume=
5 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 5years
2026 5verdicts
UNVERDICTED 5representative citing papers
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
DKPS-based methods predict new model benchmark scores using cached responses, matching baseline mean absolute error with substantially fewer queries and an offline query selection approach.
Post-processing via random selection or linear combination of differentially private models allows meeting arbitrary target privacy parameters without additional training.
FedProxy replaces weak adapters with a proxy SLM for federated LLM fine-tuning, outperforming prior methods and approaching centralized performance via compression, heterogeneity-aware aggregation, and training-free fusion.
citing papers explorer
-
Model Merging as Probabilistic Inference in Fine-Tuning Parameter Space
Model merging is cast as PoE inference with EBM experts, revealing Gaussian assumptions in prior work and proposing convergent Cauchy experts that improve empirical performance.
-
Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
-
Query-efficient model evaluation using cached responses
DKPS-based methods predict new model benchmark scores using cached responses, matching baseline mean absolute error with substantially fewer queries and an offline query selection approach.
-
Differentially Private Model Merging
Post-processing via random selection or linear combination of differentially private models allows meeting arbitrary target privacy parameters without additional training.
-
FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion
FedProxy replaces weak adapters with a proxy SLM for federated LLM fine-tuning, outperforming prior methods and approaching centralized performance via compression, heterogeneity-aware aggregation, and training-free fusion.