OpenCodeInstruct: A large-scale instruction tuning dataset for code LLMs, 2025

NVIDIA · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Domain-Adaptable Reinforcement Learning for Code Generation with Dense Rewards

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

A PPO-based RL framework with execution-aware dense rewards and token-level mapping improves pass@1 by 19% on MBPP and reduces execution failures by 51% on RoboEval for LLM code generation.

citing papers explorer

Domain-Adaptable Reinforcement Learning for Code Generation with Dense Rewards · cs.LG · 2026-05-20 · unverdicted · none · ref 21
