AdaMerging: Adaptive Model Merging for Multi-Task Learning

Enneng Yang, Zhenyi Wang, Li Shen, Shiwei Liu, Guibing Guo, Xingwei Wang, Dacheng Tao · 2023 · arXiv 2310.02575

7 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Discovering Physical Directions in Weight Space: Composing Neural PDE Experts

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

Fine-tuning neural PDE operators to regime endpoints reveals a physical direction in weight space that CCM uses to compose accurate merged models for new or extrapolated regimes from metadata or short prefixes.

CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing

cs.SE · 2026-05-13 · unverdicted · novelty 7.0

CRANE merges Instruct and Thinking model checkpoints via constrained nullspace editing to improve code agent reasoning and benchmark performance without retraining.

PivotMerge: Bridging Heterogeneous Multimodal Pre-training via Post-Alignment Model Merging

cs.CV · 2026-04-18 · unverdicted · novelty 6.0

PivotMerge merges heterogeneous multimodal pre-trained models, using shared-space decomposition to filter conflicts and layer-wise weighting based on alignment contributions, and outperforms baselines on multimodal benchmarks.

Model Merging Scaling Laws in Large Language Models

cs.AI · 2025-09-29 · unverdicted · novelty 6.0

Empirical scaling laws for LLM merging show a size-dependent floor and a 1/k-like tail in cross-entropy loss that hold across architectures and merging methods.

M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

M2A uses null-space model merging to combine mathematical and agentic reasoning in LLMs, raising SWE-Bench Verified performance from 44.0% to 51.2% on Qwen3-8B without retraining.

Differentially Private Model Merging

cs.LG · 2026-04-22 · unverdicted · novelty 5.0

Post-processing via random selection or linear combination of differentially private models allows meeting arbitrary target privacy parameters without additional training.

Retrofit: Continual Learning with Controlled Forgetting for Binary Security Detection and Analysis

cs.LG · 2025-11-14 · unverdicted · novelty 5.0

RETROFIT enables continual learning for malware detection and binary summarization by retrospective-free parameter merging with low-rank sparse updates and confidence-guided arbitration, improving retention and generalization without historical data.

citing papers explorer


Discovering Physical Directions in Weight Space: Composing Neural PDE Experts · cs.LG · 2026-05-14 · unverdicted · none · ref 39
CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing · cs.SE · 2026-05-13 · unverdicted · none · ref 12
PivotMerge: Bridging Heterogeneous Multimodal Pre-training via Post-Alignment Model Merging · cs.CV · 2026-04-18 · unverdicted · none · ref 5
Model Merging Scaling Laws in Large Language Models · cs.AI · 2025-09-29 · unverdicted · none · ref 19
M2A: Synergizing Mathematical and Agentic Reasoning in Large Language Models · cs.AI · 2026-05-11 · unverdicted · none · ref 43
Differentially Private Model Merging · cs.LG · 2026-04-22 · unverdicted · none · ref 9
Retrofit: Continual Learning with Controlled Forgetting for Binary Security Detection and Analysis · cs.LG · 2025-11-14 · unverdicted · none · ref 53
