Mmlu-pro: A more robust and challenging multi-task language understanding benchmark

Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, Wenhu Chen · 2024

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling

cs.LG · 2025-07-02 · unverdicted · novelty 7.0

Prefix-RFT blends SFT and RFT via prefix sampling from demonstrations to outperform standalone SFT, RFT, and mixed-policy baselines on math reasoning problems.

citing papers explorer

Showing 1 of 1 citing paper.

Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling cs.LG · 2025-07-02 · unverdicted · none · ref 37
Prefix-RFT blends SFT and RFT via prefix sampling from demonstrations to outperform standalone SFT, RFT, and mixed-policy baselines on math reasoning problems.

Mmlu-pro: A more robust and challenging multi-task language understanding benchmark

fields

years

verdicts

representative citing papers

citing papers explorer