Demystifying Mergeability: Interpretable Properties to Predict Model Merging Success
Pith reviewed 2026-05-16 09:47 UTC · model grok-4.3
The pith
Gradient alignment between fine-tuned models is the strongest predictor of successful merging across methods and tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Mergeability depends on both the merging method and the partner tasks rather than being intrinsic to the models. L1-regularized linear optimization over interpretable pairwise metrics, such as gradient L2 distance, reveals that gradient alignment metrics are the most reliable signals of compatibility and consistently correlate with post-merge accuracy. Architecture- and method-specific variation exists, with an average 64.0 percent top-5 metric overlap and 79.3 percent sign agreement, yet TIES and similar methods show distinct fingerprints while gradient alignment remains the shared core driver.
What carries the argument
L1-regularized linear optimization applied to a set of interpretable pairwise metrics (gradient L2 distance and similar quantities) to predict normalized post-merge accuracy.
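The pipeline can be sketched end to end: compute a small set of pairwise metrics for each model pair, then fit an L1-regularized linear model against normalized post-merge accuracy and read the drivers off the nonzero coefficients. The metric set and data below are illustrative stand-ins (the paper's full metric list is not reproduced here), with scikit-learn's Lasso playing the role of the L1-regularized optimizer.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

def pairwise_metrics(grad_a, grad_b):
    """Two illustrative pairwise metrics between task gradients:
    gradient L2 distance and gradient cosine alignment."""
    l2_dist = float(np.linalg.norm(grad_a - grad_b))
    cos = float(grad_a @ grad_b
                / (np.linalg.norm(grad_a) * np.linalg.norm(grad_b)))
    return np.array([l2_dist, cos])

# Synthetic stand-in: 200 model pairs with 64-dim "gradients", where
# normalized post-merge accuracy is driven by alignment by construction.
X, y = [], []
for _ in range(200):
    ga, gb = rng.normal(size=64), rng.normal(size=64)
    m = pairwise_metrics(ga, gb)
    X.append(m)
    y.append(0.5 + 0.4 * m[1] + 0.01 * rng.normal())
X, y = np.array(X), np.array(y)

# L1-regularized linear fit; nonzero coefficients name the drivers.
model = Lasso(alpha=1e-3).fit(X, y)
print("coefficients [l2_dist, cos_align]:", model.coef_)
```

On this constructed data the alignment coefficient dominates, mirroring the paper's reported pattern; on real models the metric values would come from actual task gradients.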
If this is right
- Gradient alignment can serve as a pre-merge diagnostic to select compatible model pairs.
- Different merging methods such as TIES possess distinct sets of predictive metrics.
- Task pairs can be evaluated for merge potential using gradient-based metrics before any merging is attempted.
- The framework supports development of fine-tuning procedures that deliberately improve gradient alignment.
- Architecture-specific differences in metric importance can guide method selection for particular model families.
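The first implication, using gradient alignment as a pre-merge diagnostic, could look like the following sketch: rank candidate partners by cosine similarity of their flattened task gradients against a reference model, without running any merge. The helper names and toy gradients are hypothetical, not from the paper.

```python
import numpy as np

def gradient_alignment(g_ref, g_cand):
    """Cosine similarity between flattened task gradients; higher
    suggests the candidate should merge more cleanly with the reference."""
    return float(g_ref @ g_cand
                 / (np.linalg.norm(g_ref) * np.linalg.norm(g_cand)))

def rank_merge_partners(g_ref, candidates):
    """Rank candidates (name -> gradient vector) by alignment with a
    reference model, best first. Hypothetical helper for illustration."""
    scores = {n: gradient_alignment(g_ref, g) for n, g in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage: one aligned, one unrelated, one opposed candidate.
rng = np.random.default_rng(1)
g_ref = rng.normal(size=128)
candidates = {
    "aligned":  g_ref + 0.3 * rng.normal(size=128),
    "neutral":  rng.normal(size=128),
    "opposed": -g_ref + 0.3 * rng.normal(size=128),
}
ranking = rank_merge_partners(g_ref, candidates)
print([name for name, _ in ranking])
```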
Where Pith is reading between the lines
- Practitioners could compute gradient metrics on candidate models to rank merge partners without running full merge experiments.
- Fine-tuning objectives might be modified to encourage better gradient alignment between related tasks.
- The same metrics could help decide which merging method to apply to a given pair of models.
Load-bearing premise
The chosen collection of pairwise metrics plus L1-regularized linear optimization is sufficient to reveal the true drivers of merge success without overlooking important unmeasured factors.
What would settle it
Finding model pairs that exhibit strong gradient alignment yet produce low post-merge accuracy, or weak alignment yet high accuracy, would falsify the claim that gradient alignment is the fundamental signal.
Original abstract
Model merging combines knowledge from separately fine-tuned models, yet the factors driving its success remain poorly understood. While recent work treats mergeability as an intrinsic property of the models, we show with an architecture-agnostic framework that it fundamentally depends on both the merging method and the partner tasks. Using L1-regularized linear optimization over a set of interpretable pairwise metrics (e.g., gradient L2 distance), we uncover properties correlating with post-merge normalized accuracy across five merging methods. We find architecture- and method-specific variation in success drivers (64.0% average top-5 metric overlap; 79.3% sign agreement), with certain methods, notably TIES, exhibiting distinct "fingerprints" that diverge from the broader consensus. Crucially, however, gradient alignment metrics consistently emerge as the most fundamental signals of compatibility. These findings provide a diagnostic foundation for understanding mergeability and motivate future merge-aware fine-tuning strategies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that model mergeability depends on both the merging method and partner tasks rather than being an intrinsic model property. Using an architecture-agnostic framework, it applies L1-regularized linear regression to a set of interpretable pairwise metrics (e.g., gradient L2 distance) to predict post-merge normalized accuracy across five merging methods. The analysis reveals method-specific variation in top predictors (64% average top-5 overlap, 79.3% sign agreement) but identifies gradient alignment metrics as the most consistent signals of compatibility.
Significance. If the empirical correlations hold under fuller validation, the work supplies a diagnostic toolkit of interpretable metrics that could guide merge-aware fine-tuning and method selection. The emphasis on architecture-agnostic pairwise metrics and the observation of method-specific fingerprints are constructive contributions that move beyond black-box merge success prediction.
major comments (3)
- [§4] §4 (Experimental Setup): No details are supplied on the datasets, number of fine-tuned models, validation splits, number of random seeds, or statistical significance testing for the reported correlations. Without these, it is impossible to determine whether the claimed superiority of gradient alignment metrics is robust or sensitive to unstated post-hoc choices.
- [§5.1] §5.1 (Metric Selection and Regression): The 64% average top-5 metric overlap across methods already signals instability in the selected drivers. This variability, combined with the exclusive use of L1-linear regression, raises the possibility that non-linear interactions or omitted variables (e.g., curvature or task-specific loss landscape terms) are the true drivers; the manuscript provides no ablation or comparison against non-linear models to rule this out.
- [§5.3] §5.3 (Gradient Alignment Claim): The central assertion that gradient alignment metrics are the most fundamental signals rests on their consistent ranking under L1 regularization. Because the paper does not test whether this ranking survives removal of the linearity assumption or addition of interaction terms, the claim remains conditional on an unverified modeling choice.
minor comments (2)
- [§3] Notation for normalized accuracy and the precise definition of each pairwise metric should be collected in a single table for easy reference.
- [Figures] Figure captions should explicitly state what the error bars represent and whether results are averaged over multiple random seeds.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and describe the revisions we will make to improve clarity and robustness.
Point-by-point responses
Referee: [§4] §4 (Experimental Setup): No details are supplied on the datasets, number of fine-tuned models, validation splits, number of random seeds, or statistical significance testing for the reported correlations. Without these, it is impossible to determine whether the claimed superiority of gradient alignment metrics is robust or sensitive to unstated post-hoc choices.
Authors: We agree that §4 requires expanded details for full reproducibility and to substantiate the robustness of the gradient alignment findings. In the revised manuscript we will add: the exact datasets and tasks used, the total number of fine-tuned models, validation split ratios, the number of random seeds (five seeds were employed), and statistical significance tests (including p-values and confidence intervals for the reported correlations). These additions will allow readers to evaluate sensitivity to experimental choices. revision: yes
Referee: [§5.1] §5.1 (Metric Selection and Regression): The 64% average top-5 metric overlap across methods already signals instability in the selected drivers. This variability, combined with the exclusive use of L1-linear regression, raises the possibility that non-linear interactions or omitted variables (e.g., curvature or task-specific loss landscape terms) are the true drivers; the manuscript provides no ablation or comparison against non-linear models to rule this out.
Authors: We disagree that the reported 64% top-5 overlap indicates instability; it instead quantifies the method-specific variation that constitutes one of the paper’s central contributions, as further supported by the 79.3% sign agreement and the distinct fingerprints observed for methods such as TIES. The deliberate choice of L1-regularized linear regression prioritizes interpretability of the pairwise metrics, which is essential to the diagnostic toolkit we aim to provide. While non-linear models might capture additional interactions, they would sacrifice the transparency needed to identify consistent signals such as gradient alignment. In the revision we will add an explicit discussion of this design choice and the associated trade-off. revision: partial
Referee: [§5.3] §5.3 (Gradient Alignment Claim): The central assertion that gradient alignment metrics are the most fundamental signals rests on their consistent ranking under L1 regularization. Because the paper does not test whether this ranking survives removal of the linearity assumption or addition of interaction terms, the claim remains conditional on an unverified modeling choice.
Authors: The claim is explicitly conditioned on the linear regression framework we adopted for interpretability. Within that framework, gradient alignment metrics rank consistently across all five merging methods, providing the most reliable signal we observe. We acknowledge that relaxing the linearity assumption could be informative; however, doing so would move away from the interpretable properties that are the paper’s focus. In the revision we will clarify the scope of the claim and list non-linear extensions as a direction for future work. revision: no
Circularity Check
No significant circularity detected in derivation chain
Full rationale
The paper defines interpretable pairwise metrics (e.g., gradient L2 distance) independently of merge outcomes and applies L1-regularized linear regression to identify correlations with post-merge normalized accuracy. This is a standard empirical correlation analysis with no reduction of the claimed result to a fitted quantity by construction, no self-definitional loops, and no load-bearing self-citations or ansatzes invoked in the provided text. The finding that gradient alignment metrics emerge as key signals is data-driven rather than tautological.
Axiom & Free-Parameter Ledger
free parameters (1)
- L1 regularization strength
axioms (1)
- domain assumption: Linear relationship between the selected pairwise metrics and normalized post-merge accuracy
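The single free parameter, the L1 regularization strength, need not be fixed by hand: it can be chosen by cross-validation. A minimal sketch using scikit-learn's LassoCV on a synthetic stand-in for the metric table (the metrics and coefficients here are invented, not the paper's):

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Synthetic metric table: 120 model pairs, 6 pairwise metrics, where only
# the first metric (a stand-in for gradient alignment) drives accuracy.
rng = np.random.default_rng(2)
X = rng.normal(size=(120, 6))
y = 0.6 * X[:, 0] + 0.05 * rng.normal(size=120)

# Cross-validate the regularization strength instead of fixing it.
model = LassoCV(cv=5).fit(X, y)
print("chosen alpha:", model.alpha_)
print("surviving metric indices:", np.flatnonzero(np.abs(model.coef_) > 1e-6))
```

Checking that the surviving metrics are stable across the cross-validated alpha path would be one way to probe the instability the referee raises.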
Forward citations
Cited by 2 Pith papers
- Beyond Perplexity: A Geometric and Spectral Study of Low-Rank Pre-Training
  Low-rank pre-training methods converge to geometrically and spectrally distinct basins from full-rank training and from each other, even at similar validation perplexity.
- Model Merging: Foundations and Algorithms
  New cycle-consistent optimization, task vector theory, singular vector decompositions, adaptive routing, and efficient evolutionary search provide foundations for merging neural network weights across tasks.