A 7B Qwen-2.5 LLM trained with a new RL framework on only 9 ML tasks achieves performance comparable to much larger proprietary LLM agents at lower computational cost, while also generalizing across tasks.
The learning rate was changed to 0.05 and the number of boosting stages was increased to 200, but performance decreased slightly.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering