arXiv preprint arXiv:2206.06888 , year=

CERT: Continual Pre-training on Sketches for Library-oriented Code Generation , author= · 2022 · arXiv 2206.06888

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ARIADNE: Agentic Reward-Informed Adaptive Decision Exploration via Blackboard-Driven MCTS for Competitive Program Generation

cs.SE · 2026-05-04 · unverdicted · novelty 7.0

ARIADNE combines blackboard architecture with MCTS to coordinate strategy, code, test, evaluation, and repair stages, yielding higher Pass@1 scores than prior LLM baselines on APPS, CodeContests, and related benchmarks.

RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices

cs.SE · 2026-04-24 · unverdicted · novelty 6.0

RealBench is a new repo-level code generation benchmark that adds UML diagrams to natural language specs, showing LLMs struggle more at full repositories, create modules with errors, and perform best with whole-repo generation on small projects versus module-by-module on complex ones.

Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive

cs.CL · 2024-02-20 · conditional · novelty 6.0

DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.

citing papers explorer

Showing 3 of 3 citing papers.

ARIADNE: Agentic Reward-Informed Adaptive Decision Exploration via Blackboard-Driven MCTS for Competitive Program Generation cs.SE · 2026-05-04 · unverdicted · none · ref 28
ARIADNE combines blackboard architecture with MCTS to coordinate strategy, code, test, evaluation, and repair stages, yielding higher Pass@1 scores than prior LLM baselines on APPS, CodeContests, and related benchmarks.
RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices cs.SE · 2026-04-24 · unverdicted · none · ref 50
RealBench is a new repo-level code generation benchmark that adds UML diagrams to natural language specs, showing LLMs struggle more at full repositories, create modules with errors, and perform best with whole-repo generation on small projects versus module-by-module on complex ones.
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive cs.CL · 2024-02-20 · conditional · none · ref 129
DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.

arXiv preprint arXiv:2206.06888 , year=

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer