Title resolution pending

Self-Revision: For each correct initial response, prompt the model to generate 3 rephrased responses y_revised. For each incorrect initial response, prompt the model to generate 3 co

1 Pith paper cites this work. Polarity classification is still indexing.


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision

cs.CL · 2026-04-13 · unverdicted · novelty 6.0

SD-Zero converts binary rewards into dense self-supervision by having a model revise its own outputs and distill the improvements back into generation, yielding at least 10% gains on math and code benchmarks.

citing papers explorer

Showing 1 of 1 citing paper.

Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision

cs.CL · 2026-04-13 · unverdicted · none · ref 3

SD-Zero converts binary rewards into dense self-supervision by having a model revise its own outputs and distill the improvements back into generation, yielding at least 10% gains on math and code benchmarks.
