AdaDec: A Uncertainty-Guided Lookahead Decoding Framework for LLM-Based Code Generation

2025 · cs.SE · arXiv 2506.08980

3 Pith papers cite this work. Polarity classification is still indexing.


abstract

Code generation with large language models (LLMs) is highly sensitive to token selection during decoding, particularly at uncertain decision points that influence program logic. While standard strategies such as greedy decoding treat all tokens uniformly, they overlook code-specific uncertainty patterns, leading to suboptimal performance. This paper presents an empirical study revealing that many generation errors stem from token ranking mistakes at high-uncertainty steps, where the correct token is present but not top-ranked. Motivated by these findings, we propose AdaDec, a lookahead-based uncertainty-guided adaptive decoding framework that integrates a token-level pause-then-rerank mechanism driven by token uncertainty. AdaDec learns model-specific uncertainty thresholds and applies a lookahead-based reranking strategy when uncertainty is high. Experiments on HumanEval+, MBPP+, and DevEval benchmarks show that AdaDec improves Pass@1 accuracy by up to 20.9% in absolute terms over greedy decoding. More importantly, it consistently outperforms both competitive baselines like Beam Search and state-of-the-art adaptive decoding methods such as AdapT, while maintaining high efficiency through selective, uncertainty-triggered pausing. Our results highlight the promise of uncertainty-aware adaptive decoding for improving both the reliability and efficiency of LLM-based code generation.
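
The pause-then-rerank idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Shannon entropy as the uncertainty measure, a fixed threshold (the paper learns model-specific thresholds), and a summed log-probability over a short greedy rollout as the lookahead score. The names `toy_model`, `entropy`, `lookahead_score`, and `choose_token` are hypothetical, and the probability table is made up for illustration only.

```python
import math
from typing import Callable, Dict, List

Dist = Dict[str, float]               # token -> probability
Model = Callable[[List[str]], Dist]   # prefix -> next-token distribution


def toy_model(prefix: List[str]) -> Dist:
    """Hypothetical stand-in for an LLM: a hand-written lookup table."""
    table = {
        (): {"<": 0.55, "<=": 0.45},          # uncertain decision point
        ("<",): {"n": 0.5, "m": 0.5},         # "<" leads to a vague continuation
        ("<=",): {"n": 0.99, "m": 0.01},      # "<=" leads to a confident one
    }
    return table[tuple(prefix)]


def entropy(dist: Dist) -> float:
    """Shannon entropy (in nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def lookahead_score(model: Model, prefix: List[str], steps: int) -> float:
    """Greedily extend the prefix and sum the log-probabilities of the rollout."""
    seq, total = list(prefix), 0.0
    for _ in range(steps):
        dist = model(seq)
        tok = max(dist, key=dist.get)
        total += math.log(dist[tok])
        seq.append(tok)
    return total


def choose_token(model: Model, prefix: List[str],
                 threshold: float = 0.6, top_k: int = 2, steps: int = 1) -> str:
    """Greedy when confident; rerank the top-k candidates by lookahead when not."""
    dist = model(prefix)
    ranked = sorted(dist, key=dist.get, reverse=True)
    if entropy(dist) <= threshold:
        return ranked[0]  # low uncertainty: keep the top-ranked token

    # High uncertainty: pause and score each candidate by its own
    # log-probability plus the log-probability of a short greedy rollout.
    def score(tok: str) -> float:
        return math.log(dist[tok]) + lookahead_score(model, prefix + [tok], steps)

    return max(ranked[:top_k], key=score)


if __name__ == "__main__":
    # Greedy decoding would pick "<" here; reranking prefers "<=".
    print(choose_token(toy_model, []))
```

In this toy table the first step is uncertain (two near-equal candidates), so the sketch pauses and prefers the candidate whose continuation the model is more confident about. Raising `threshold` above the step's entropy makes the same call fall back to plain greedy selection, which is the selective triggering that keeps the extra cost limited to uncertain steps.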

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Bridging Generation and Training: A Systematic Review of Quality Issues in LLMs for Code

cs.SE · 2026-05-06 · accept · novelty 6.0

A review of 114 studies creates taxonomies for code and data quality issues, formalizes 18 propagation mechanisms from training data defects to LLM-generated code defects, and synthesizes detection and mitigation techniques.

Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study

cs.SE · 2025-11-29 · unverdicted · novelty 6.0

APIKG4Syn synthesizes API-oriented training data via knowledge graphs and Monte Carlo search to fine-tune a 7B model that reaches 25% pass@1 on HarmonyOS code generation, beating untuned GPT-4o at 17.59%.

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

cs.CL · 2026-04-17

citing papers explorer

Showing 3 of 3 citing papers.

Bridging Generation and Training: A Systematic Review of Quality Issues in LLMs for Code
cs.SE · 2026-05-06 · accept · none · ref 39 · internal anchor
Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study
cs.SE · 2025-11-29 · unverdicted · none · ref 23 · internal anchor
Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning
cs.CL · 2026-04-17 · unreviewed · ref 9 · internal anchor
