Sewa: Selective weight average via probabilistic masking

Under review as a conference paper · Peng Wang, Shengchao Hu, Zerui Tao, Guoxia Wang, Dianhai Yu, Li Shen, Quan Zheng, Dacheng Tao · arXiv 2502.10119

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Model Merging Scaling Laws in Large Language Models

cs.AI · 2025-09-29 · unverdicted · novelty 6.0

Empirical scaling laws for LLM merging show a size-dependent floor and 1/k-like tail in cross-entropy loss that holds across architectures and merging methods.

citing papers explorer

Showing 1 of 1 citing paper.

Model Merging Scaling Laws in Large Language Models

cs.AI · 2025-09-29 · unverdicted · none · ref 16

Empirical scaling laws for LLM merging show a size-dependent floor and 1/k-like tail in cross-entropy loss that holds across architectures and merging methods.
