1·(4−1) = 3, 2·(4−2) = 4, 3·(4−3) = 3, 4·(4−4) = 0. Sum: 3+4+3+0 = 10
- For the input [4,3,2,1], the sorted form is [1,2,3,4], and the output is 20 (same as above)

**Analyze the Input/Output Pairs:**
- For the input [1,2,3,4], the sorted form is the same, and the output is 20
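The term-by-term arithmetic in the excerpt can be checked with a short sketch. The helper name `position_weighted_terms` is hypothetical, and so is its reading of the formula: it assumes each term is a sorted value multiplied by the list length minus its 1-based position. The excerpt only shows the case [1,2,3,4], where values and positions coincide, so other readings are possible.

```python
def position_weighted_terms(values):
    """Hypothetical helper: sort the input, then compute v * (n - pos)
    for each value v at 1-based position pos (an assumed reading of
    the terms 1·(4−1), 2·(4−2), ... in the excerpt)."""
    ordered = sorted(values)
    n = len(ordered)
    return [v * (n - pos) for pos, v in enumerate(ordered, start=1)]


if __name__ == "__main__":
    for sample in ([1, 2, 3, 4], [4, 3, 2, 1]):
        terms = position_weighted_terms(sample)
        # Both inputs sort to [1, 2, 3, 4], so the terms and sum match.
        print(sample, terms, sum(terms))
```

Under this reading both inputs give a sum of 10, while the excerpt reports an output of 20, which is twice that sum. The excerpt does not say how the sum and the output are related, so the sketch stops at the sum.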

1 Pith paper cites this work. Polarity classification is still indexing.

citation-role summary

other 1

citation-polarity summary

unclear 1

representative citing papers

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

cs.LG · 2025-05-06 · conditional · novelty 7.0

A model trained only by proposing and solving its own verifiable code tasks achieves state-of-the-art results on math and coding benchmarks without external data.

Absolute Zero: Reinforced Self-play Reasoning with Zero Data cs.LG · 2025-05-06 · conditional · none · ref 26