
arxiv: 1703.05698 · v5 · pith:LWOKTXLD · submitted 2017-03-16 · cs.PL · cs.LG

Neural Sketch Learning for Conditional Program Generation

classification: cs.PL, cs.LG
keywords: programs, code, program, generation, calls, conditional, distribution, during

Abstract

We study the problem of generating source code in a strongly typed, Java-like programming language, given a label (for example a set of API calls or types) carrying a small amount of information about the code that is desired. The generated programs are expected to respect a "realistic" relationship between programs and labels, as exemplified by a corpus of labeled programs available during training. Two challenges in such conditional program generation are that the generated programs must satisfy a rich set of syntactic and semantic constraints, and that source code contains many low-level features that impede learning. We address these problems by training a neural generator not on code but on program sketches, or models of program syntax that abstract out names and operations that do not generalize across programs. During generation, we infer a posterior distribution over sketches, then concretize samples from this distribution into type-safe programs using combinatorial techniques. We implement our ideas in a system for generating API-heavy Java code, and show that it can often predict the entire body of a method given just a few API calls or data types that appear in the method.
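The two-stage idea in the abstract (abstract programs into sketches, then concretize a sketch into type-safe code by combinatorial search) can be illustrated with a toy sketch. This is not the paper's system: the names `Call`, `SketchCall`, `abstract`, and `concretize` are hypothetical, the sketch representation (method names plus types, with variable names dropped) is an assumption made for illustration, and the neural step that infers a distribution over sketches is omitted entirely.

```python
from itertools import product
from typing import NamedTuple


class Call(NamedTuple):
    """A concrete API call: receiver variable, method name, argument variables."""
    receiver: str
    method: str
    args: tuple


class SketchCall(NamedTuple):
    """An abstracted call: variable names are replaced by their types."""
    receiver_type: str
    method: str
    arg_types: tuple


def abstract(program, var_types):
    """Map a concrete program to a sketch by dropping variable names.

    `var_types` maps each variable name to its (assumed known) type.
    """
    return [
        SketchCall(var_types[c.receiver], c.method,
                   tuple(var_types[a] for a in c.args))
        for c in program
    ]


def concretize(sketch, scope):
    """Enumerate every type-correct way to fill the holes in a sketch.

    `scope` maps a type name to the variables of that type that are in scope.
    Each hole is filled independently, so the result is a Cartesian product.
    """
    per_call = []
    for s in sketch:
        # One slot for the receiver, then one slot per argument.
        slots = [scope.get(s.receiver_type, [])]
        slots += [scope.get(t, []) for t in s.arg_types]
        per_call.append(
            [Call(combo[0], s.method, tuple(combo[1:]))
             for combo in product(*slots)]
        )
    return [list(candidate) for candidate in product(*per_call)]


# Hypothetical example: read a line from a reader, then close it.
program = [Call("br", "readLine", ()), Call("br", "close", ())]
sketch = abstract(program, {"br": "BufferedReader"})
candidates = concretize(sketch, {"BufferedReader": ["in1", "in2"]})
```

With two `BufferedReader` variables in scope, each of the two holes can be filled two ways, so `candidates` holds four programs. A realistic concretizer would prune or rank such candidates rather than return them all.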

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Gradient-Based Program Synthesis with Neurally Interpreted Languages

    cs.LG 2026-04 unverdicted novelty 8.0

    NLI autonomously discovers a vocabulary of primitive operations and interprets variable-length programs via a neural executor, allowing end-to-end training and gradient-based test-time adaptation that outperforms prio...

  2. Competition-Level Code Generation with AlphaCode

    cs.PL 2022-02 unverdicted novelty 6.0

    AlphaCode generates novel code solutions for competitive programming problems and achieves an average top 54.3% ranking in Codeforces contests with over 5,000 participants.

  3. CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

    cs.SE 2021-02 unverdicted novelty 6.0

    CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.