Title resolution pending

KodCode: A Diverse, Challenging, Verifiable Synthetic Dataset for Coding · 2025

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Self-Supervised On-Policy Distillation for Reasoning Language Models

cs.LG · 2026-05-17 · unverdicted · novelty 6.0

SSOPD converts intra-group correct-wrong contrast into process supervision by distilling a teacher distribution from the shortest correct completion into prefixes of the longest wrong completion, improving GRPO on AIME and HMMT benchmarks.

DACA-GRPO: Denoising-Aware Credit Assignment for Reinforcement Learning in Diffusion Language Models

cs.LG · 2026-05-08 · unverdicted · novelty 5.0

DACA-GRPO adds denoising-aware credit assignment and bias-reduced likelihood estimation to GRPO, delivering consistent gains up to 36.3pp on math, code, constraint, and schema benchmarks for diffusion LLMs.

citing papers explorer

Showing 2 of 2 citing papers.

Self-Supervised On-Policy Distillation for Reasoning Language Models · cs.LG · 2026-05-17 · unverdicted · none · ref 101
DACA-GRPO: Denoising-Aware Credit Assignment for Reinforcement Learning in Diffusion Language Models · cs.LG · 2026-05-08 · unverdicted · none · ref 13
