International Conference on Machine Learning

LEVER: Learning to Verify Language-to-Code Generation with Execution · 2023

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Memory-Guided Tree Search with Cross-Branch Knowledge Transfer for LLM Solver Synthesis

cs.AI · 2026-05-17 · unverdicted · novelty 7.0

MEMOIR adds branch-local and global memory with a reflection step to tree search for LLM solver synthesis, reaching 96.7% solution validity and 7.3-point score gains over baselines on seven CO problems with lower run-to-run variance.

StepCodeReasoner: Aligning Code Reasoning with Stepwise Execution Traces via Reinforcement Learning

cs.SE · 2026-05-12 · unverdicted · novelty 7.0

StepCodeReasoner aligns code reasoning with verifiable stepwise execution traces via print anchors and bi-level GRPO reinforcement learning, reaching SOTA results on CRUXEval (91.1%) and LiveCodeBench (86.5%) for a 7B model.

Structural Verification for Reliable EDA Code Generation without Tool-in-the-Loop Debugging

cs.SE · 2026-04-20 · unverdicted · novelty 7.0

Structural dependency graphs and staged pre-execution verification raise LLM-based EDA code pass rates to 82.5% (single-step) and 70-84% (multi-step) while halving tool calls by catching dependency violations before runtime.

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.

LLM Benchmark Datasets Should Be Contamination-Resistant

cs.LG · 2026-05-19 · unverdicted · novelty 4.0

Authors call for contamination-resistant LLM benchmarks that exploit Transformer training-inference asymmetry and require new mathematical methods for cross-architecture interoperability.

citing papers explorer

Memory-Guided Tree Search with Cross-Branch Knowledge Transfer for LLM Solver Synthesis · cs.AI · 2026-05-17 · unverdicted · none · ref 14
StepCodeReasoner: Aligning Code Reasoning with Stepwise Execution Traces via Reinforcement Learning · cs.SE · 2026-05-12 · unverdicted · none · ref 40
Structural Verification for Reliable EDA Code Generation without Tool-in-the-Loop Debugging · cs.SE · 2026-04-20 · unverdicted · none · ref 21
Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning · cs.CL · 2026-04-11 · unverdicted · none · ref 79
Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs · cs.CL · 2026-04-11 · unverdicted · none · ref 94
LLM Benchmark Datasets Should Be Contamination-Resistant · cs.LG · 2026-05-19 · unverdicted · none · ref 53
