A controlled formal language task reveals fine-tuning outperforms in-context learning on in-distribution generalization but equals it on out-of-distribution, with ICL showing greater sensitivity to model size and tokenization.
BUFFET : Benchmarking Large Language Models for Few-shot Cross-lingual Transfer
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CL 3years
2026 3roles
background 1polarities
background 1representative citing papers
A Bayesian framework decomposes mLLM variance, showing language features explain 79-92% of language identity variance and that model identity vs. benchmark-model interactions dominate differently for understanding versus reasoning tasks.
Parameter alignment strategies substantially reduce forgetting in family-based continual pretraining of multilingual LLMs across 32 languages with minimal impact on language acquisition.
citing papers explorer
-
Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective
A controlled formal language task reveals fine-tuning outperforms in-context learning on in-distribution generalization but equals it on out-of-distribution, with ICL showing greater sensitivity to model size and tokenization.
-
DEPART: DEcomposing PARiTy across Multilingual LLMs
A Bayesian framework decomposes mLLM variance, showing language features explain 79-92% of language identity variance and that model identity vs. benchmark-model interactions dominate differently for understanding versus reasoning tasks.
-
Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models
Parameter alignment strategies substantially reduce forgetting in family-based continual pretraining of multilingual LLMs across 32 languages with minimal impact on language acquisition.