arXiv e-prints (2024), arXiv–2407

The Llama 3 Herd of Models · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training

cs.LG · 2025-07-21 · unverdicted · novelty 7.0

An RL agent learns domain re-weighting policies from evaluation feedback to improve balanced performance in continual pre-training of LLMs across source and target domains.
