LiveCodeBench 05/23-05/24 subset has 511 problems released between May 2023 and May 2024, whereas the 06/24-01/25 subset has 369 problems released between May 2024 and Jan

LiveCodeBench: a benchmark of real-world programming tasks that evaluates a model’s ability to generate, execute, verify, and iteratively repair solutions using unit-test feedback · 2023

1 Pith paper cites this work. Polarity classification is still indexing.


citation-role summary

dataset 1

citation-polarity summary

use dataset 1

representative citing papers

OpenThoughts: Data Recipes for Reasoning Models

cs.LG · 2025-06-04 · conditional · novelty 7.0

OpenThoughts3-7B, trained on a 1.2M-example public dataset via systematic pipeline optimizations, achieves 53% on AIME 2025, 51% on LiveCodeBench, and 54% on GPQA Diamond.

citing papers explorer


OpenThoughts: Data Recipes for Reasoning Models cs.LG · 2025-06-04 · conditional · none · ref 8
OpenThoughts3-7B, trained on a 1.2M-example public dataset via systematic pipeline optimizations, achieves 53% on AIME 2025, 51% on LiveCodeBench, and 54% on GPQA Diamond.
