Evaluating language models for efficient code generation

Jiawei Liu, Songrun Xie, Junhao Wang, Yuxiang Wei, Yifeng Ding, Lingming Zhang · 2024 · arXiv 2408.06450

8 Pith papers cite this work.

representative citing papers

SkelDPO: A Skeleton-Guided Direct Preference Optimization Framework for Efficient Code Generation

cs.SE · 2026-06-05 · unverdicted · novelty 7.0

SkelDPO improves code generation efficiency by 2-7% over prior DPO methods via joint preference losses on full code and efficiency-critical skeletons.

CppPerf: An Automated Pipeline and Dataset for Performance-Improving C++ Commits

cs.SE · 2026-05-11 · accept · novelty 7.0

CppPerf-Mine produces CppPerf-DB, a benchmark of 347 real-world performance-improving C++ patches (39% multi-file) from 42 repositories to evaluate repository-level repair tools.

Do AI Models Dream of Faster Code? An Empirical Study on LLM-Proposed Performance Improvements in Real-World Software

cs.SE · 2025-10-17 · unverdicted · novelty 7.0

LLMs propose volatile performance improvements on real-world Java tasks that lag human developers on average, showing algorithmic benchmarks overestimate capabilities.

JETO-Bench: A Reproducible Benchmark for Execution Time Improvement Patches in Java

cs.SE · 2026-06-30 · conditional · novelty 6.0

JETO-Mine is a reusable three-phase pipeline that mines 1.8 million Java commits to produce JETO-Bench, a benchmark of 91 verified, executable execution time improvement patches (ETIPs), on which OpenHands achieves a 14.3% success rate.

Chiseling Out Efficiency: Structured Skeleton Supervision for Efficient Code Generation

cs.SE · 2026-06-05 · unverdicted · novelty 6.0

EffiSkel improves LLM-generated code efficiency by supervising on extracted structural efficiency skeletons via multi-task learning of code generation and skeleton prediction.

ConVer: Using Contracts and Loop Invariant Synthesis for Scalable Formal Software Verification

cs.SE · 2026-05-26 · unverdicted · novelty 6.0

ConVer decomposes C program verification top-down by synthesizing contracts with LLMs and refining them in a CEGAR-CEGIS loop, reporting 82-96% success on simple benchmarks and lower rates on harder suites.

MONA: Muon Optimizer with Nesterov Acceleration for Scalable Language Model Training

cs.LG · 2026-05-26 · unverdicted · novelty 6.0

MONA integrates Nesterov acceleration into Muon's orthogonalization framework, reporting better convergence than Muon and AdamW on MoE models up to 68B parameters trained on 1T tokens, as well as state-of-the-art fine-tuning results.

SysLLMatic: Large Language Models are Software System Optimizers

cs.SE · 2025-06-02 · unverdicted · novelty 6.0

SysLLMatic integrates LLMs with performance diagnostics and a 43-pattern catalog to optimize complex software, reporting 1.54x latency and 1.24x energy gains over compilers on large Java systems where prior LLM methods did not scale.
