Ties-merging: Resolving interference when merging models. Advances in Neural Information Processing Systems, 36:7093–7115
5 Pith papers cite this work, all from 2026.
citing papers explorer
- Discovering Physical Directions in Weight Space: Composing Neural PDE Experts
  Fine-tuning neural PDE operators to regime endpoints reveals a physical direction in weight space that CCM uses to compose accurate merged models for new or extrapolated regimes from metadata or short prefixes.
- DiM3: Bridging Multilingual and Multimodal Models via Direction- and Magnitude-Aware Merging
  DiM3 is a direction- and magnitude-aware merging method that composes heterogeneous multilingual and multimodal updates in LLM backbones, outperforming baselines on 57-language benchmarks while retaining multimodal performance.
- Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training
  Forgetting in LLM continual post-training is a geometry conflict between task-induced covariance structures and the evolving model state, controlled by gating Wasserstein barycenter merging on measured conflict.
- Decoupling Endpoint and Semantic Transition Learning for Zero-Shot Composed Image Retrieval
  DeCIR improves projection-based zero-shot composed image retrieval by decoupling endpoint and semantic transition alignment with separate low-rank adapters merged by LRDM, showing gains on CIRR, CIRCO, FashionIQ, and GeneCIS.
- M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models
  M2A uses null-space model merging to combine mathematical and agentic reasoning in LLMs, raising SWE-Bench Verified performance from 44.0% to 51.2% on Qwen3-8B without retraining.