RL with chrF reward trains LLMs to better utilize in-context linguistic knowledge for zero-shot translation of unseen languages, outperforming ICL and SFT.
BLOOM +1: Adding language support to BLOOM for zero-shot prompting
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CL 4roles
background 1polarities
unclear 1representative citing papers
TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model families.
SSU mitigates catastrophic forgetting in low-resource LLM target-language adaptation by scoring and column-wise freezing source-critical parameters, reducing source degradation to ~3% versus ~20% for full fine-tuning while matching target performance.
Continued pretraining of Code Llama on Proof-Pile-2 yields Llemma, an open math-specialized LLM that beats known open base models on MATH and supports tool use plus formal proving out of the box.
citing papers explorer
-
Llemma: An Open Language Model For Mathematics
Continued pretraining of Code Llama on Proof-Pile-2 yields Llemma, an open math-specialized LLM that beats known open base models on MATH and supports tool use plus formal proving out of the box.