Ties-merging: Resolving interference when merging models.Advances in neural information processing systems, 36:7093–7115

Prateek Yadav, Derek Tam, Leshem Choshen, Colin A Raffel, Mohit Bansal · 2023

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

browse 5 citing papers

citation-role summary

baseline 2 method 1

citation-polarity summary

baseline 2 use method 1

representative citing papers

Discovering Physical Directions in Weight Space: Composing Neural PDE Experts

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

Fine-tuning neural PDE operators to regime endpoints reveals a physical direction in weight space that CCM uses to compose accurate merged models for new or extrapolated regimes from metadata or short prefixes.

DiM\textsuperscript{3}: Bridging Multilingual and Multimodal Models via Direction- and Magnitude-Aware Merging

cs.CL · 2026-05-13 · conditional · novelty 6.0 · 2 refs

DiM3 is a direction- and magnitude-aware merging method that composes heterogeneous multilingual and multimodal updates in LLM backbones, outperforming baselines on 57-language benchmarks while retaining multimodal performance.

Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Forgetting in LLM continual post-training is a geometry conflict between task-induced covariance structures and the evolving model state, controlled by gating Wasserstein barycenter merging on measured conflict.

Decoupling Endpoint and Semantic Transition Learning for Zero-Shot Composed Image Retrieval

cs.CV · 2026-05-08 · unverdicted · novelty 6.0 · 2 refs

DeCIR improves projection-based zero-shot composed image retrieval by decoupling endpoint and semantic transition alignment with separate low-rank adapters merged by LRDM, showing gains on CIRR, CIRCO, FashionIQ, and GeneCIS.

M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

M2A uses null-space model merging to combine mathematical and agentic reasoning in LLMs, raising SWE-Bench Verified performance from 44.0% to 51.2% on Qwen3-8B without retraining.

citing papers explorer

Showing 5 of 5 citing papers.

Discovering Physical Directions in Weight Space: Composing Neural PDE Experts cs.LG · 2026-05-14 · unverdicted · none · ref 38
Fine-tuning neural PDE operators to regime endpoints reveals a physical direction in weight space that CCM uses to compose accurate merged models for new or extrapolated regimes from metadata or short prefixes.
DiM\textsuperscript{3}: Bridging Multilingual and Multimodal Models via Direction- and Magnitude-Aware Merging cs.CL · 2026-05-13 · conditional · none · ref 21 · 2 links
DiM3 is a direction- and magnitude-aware merging method that composes heterogeneous multilingual and multimodal updates in LLM backbones, outperforming baselines on 57-language benchmarks while retaining multimodal performance.
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training cs.LG · 2026-05-10 · unverdicted · none · ref 60
Forgetting in LLM continual post-training is a geometry conflict between task-induced covariance structures and the evolving model state, controlled by gating Wasserstein barycenter merging on measured conflict.
Decoupling Endpoint and Semantic Transition Learning for Zero-Shot Composed Image Retrieval cs.CV · 2026-05-08 · unverdicted · none · ref 46 · 2 links
DeCIR improves projection-based zero-shot composed image retrieval by decoupling endpoint and semantic transition alignment with separate low-rank adapters merged by LRDM, showing gains on CIRR, CIRCO, FashionIQ, and GeneCIS.
M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models cs.AI · 2026-05-11 · unverdicted · none · ref 41
M2A uses null-space model merging to combine mathematical and agentic reasoning in LLMs, raising SWE-Bench Verified performance from 44.0% to 51.2% on Qwen3-8B without retraining.

Ties-merging: Resolving interference when merging models.Advances in neural information processing systems, 36:7093–7115

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer