InfoLaw models pretraining as information accumulation where quality sets information density and repetition causes scale-dependent diminishing returns, predicting loss with low error on unseen mixtures and larger scales up to 7B models and 425B tokens.
Few-shot Learning with Multilingual Generative Language Models
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CL 4roles
dataset 1polarities
use dataset 1representative citing papers
LANG combines language-adaptive hint guidance, progressive decay, and difficulty-tailored learning horizons in RL to boost non-English reasoning performance while preserving language consistency.
BM25 retrieval makes many-shot ICL for low-resource MT roughly 5x more data-efficient, with 50 examples matching 250 random ones and 250 matching 1000.
The survey identifies a key tension in multilingual vision-language models between language neutrality via contrastive learning and cultural awareness via diverse data, with most benchmarks relying on translation-based evaluation.
citing papers explorer
-
InfoLaw: Information Scaling Laws for Large Language Models with Quality-Weighted Mixture Data and Repetition
InfoLaw models pretraining as information accumulation where quality sets information density and repetition causes scale-dependent diminishing returns, predicting loss with low error on unseen mixtures and larger scales up to 7B models and 425B tokens.
-
LANG: Reinforcement Learning for Multilingual Reasoning with Language-Adaptive Hint Guidance
LANG combines language-adaptive hint guidance, progressive decay, and difficulty-tailored learning horizons in RL to boost non-English reasoning performance while preserving language consistency.
-
An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages
BM25 retrieval makes many-shot ICL for low-resource MT roughly 5x more data-efficient, with 50 examples matching 250 random ones and 250 matching 1000.
-
Multilingual Vision-Language Models, A Survey
The survey identifies a key tension in multilingual vision-language models between language neutrality via contrastive learning and cultural awareness via diverse data, with most benchmarks relying on translation-based evaluation.