arxiv: 2502.10436 · v4 · pith:BEZMONTNnew · submitted 2025-02-09 · 💻 cs.NE · cs.AI · cs.LG

MERGE³: Efficient Evolutionary Merging on Consumer-grade GPUs

classification 💻 cs.NE cs.AI cs.LG
keywords merging evolutionary merge model efficient enables performance abilities

Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging feasible on a single GPU by reducing fitness computation costs 50$\times$ while preserving performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
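
As a rough illustration of the Estimate step described in the abstract, the sketch below fits a single latent ability under a two-parameter logistic (2PL) IRT model from responses on a small item subset, then uses that ability to predict expected accuracy on the full item pool. The 2PL form, the synthetic item parameters, and the function names `p_correct`, `estimate_ability`, and `estimate_accuracy` are all assumptions made for illustration; the abstract does not specify which IRT variant or estimator MERGE$^3$ uses, and this is not the paper's library API.

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL IRT: probability of a correct answer given ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def estimate_ability(responses, a, b, steps=500, lr=0.1):
    """Maximum-likelihood ability estimate from 0/1 responses,
    found by gradient ascent on the (concave) 2PL log-likelihood."""
    theta = 0.0
    for _ in range(steps):
        p = p_correct(theta, a, b)
        grad = np.sum(a * (responses - p))  # d log-likelihood / d theta
        theta += lr * grad / len(responses)
        # Keep theta bounded when every response is identical.
        theta = float(np.clip(theta, -6.0, 6.0))
    return theta

def estimate_accuracy(theta, a_full, b_full):
    """Expected accuracy on the full item pool under the fitted ability."""
    return float(np.mean(p_correct(theta, a_full, b_full)))

# Synthetic item pool (hypothetical parameters, not taken from the paper).
rng = np.random.default_rng(0)
n_items = 1000
a_full = rng.uniform(0.5, 2.0, n_items)
b_full = rng.normal(0.0, 1.0, n_items)

# Simulate a model with a known ability answering every item.
true_theta = 0.8
full_responses = (
    rng.random(n_items) < p_correct(true_theta, a_full, b_full)
).astype(float)

# Evaluate on 20 of 1000 items only, i.e. a 50x smaller evaluation set.
subset = rng.choice(n_items, size=20, replace=False)
theta_hat = estimate_ability(full_responses[subset], a_full[subset], b_full[subset])
est_acc = estimate_accuracy(theta_hat, a_full, b_full)

print(f"estimated ability: {theta_hat:.2f}")
print(f"estimated full-set accuracy: {est_acc:.3f}")
print(f"observed full-set accuracy:  {full_responses.mean():.3f}")
```

In an evolutionary loop, a value like `est_acc` would stand in for the fitness of a candidate merge, so each candidate is scored on the small subset rather than the full benchmark. How closely the estimate tracks the true accuracy depends on the subset size and on how well the item parameters were calibrated.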

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Model Merging: Foundations and Algorithms

    cs.LG 2026-05 unverdicted novelty 6.0

    New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.