TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model families.
Unlocking the Potential of Model Merging for Low-Resource Languages
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 3representative citing papers
Merging fine-tuned models for multilingual translation fails because fine-tuning redistributes language-specific neurons rather than sharpening them, increasing representational divergence in output-generating layers.
SSU mitigates catastrophic forgetting in low-resource LLM target-language adaptation by scoring and column-wise freezing source-critical parameters, reducing source degradation to ~3% versus ~20% for full fine-tuning while matching target performance.
citing papers explorer
-
Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion
TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model families.
-
One Model to Translate Them All? A Journey to Mount Doom for Multilingual Model Merging
Merging fine-tuned models for multilingual translation fails because fine-tuning redistributes language-specific neurons rather than sharpening them, increasing representational divergence in output-generating layers.
-
Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates
SSU mitigates catastrophic forgetting in low-resource LLM target-language adaptation by scoring and column-wise freezing source-critical parameters, reducing source degradation to ~3% versus ~20% for full fine-tuning while matching target performance.