Towards Synthesizing Complex Programs from Input-Output Examples

Chang Liu; Dawn Song; Xinyun Chen

Not yet reviewed by Pith; the record is open.

arxiv 1706.01284 v4 pith:DSIQOZZV submitted 2017-06-05 cs.LG cs.AI cs.PL

Xinyun Chen, Chang Liu, Dawn Song

classification cs.LG cs.AI cs.PL

keywords learning, machine, non-differentiable, novel, program, programs, examples, input-output

In recent years, deep learning techniques have been developed to improve the performance of program synthesis from input-output examples. Despite significant progress, the programs that state-of-the-art approaches can synthesize are still simple. In this work, we move a significant step forward along this direction by proposing a new class of challenging tasks in the domain of program synthesis from input-output examples: learning a context-free parser from pairs of input programs and their parse trees. We show that this class of tasks is much more challenging than previously studied tasks, and that the test accuracy of existing approaches is almost 0%. We tackle the challenges by developing three novel techniques inspired by three novel observations, which reveal the key ingredients of using deep learning to synthesize a complex program. First, the use of a non-differentiable machine is the key to effectively restricting the search space. Thus, our proposed approach learns a neural program operating a domain-specific non-differentiable machine. Second, recursion is the key to achieving generalizability. Thus, we bake the notion of recursion into the design of our non-differentiable machine. Third, reinforcement learning is the key to learning how to operate the non-differentiable machine, but it is also hard to train the model effectively with existing reinforcement learning algorithms from a cold boot. We develop a novel two-phase reinforcement learning-based search algorithm to overcome this issue. In our evaluation, we show that using our novel approach, neural parsing programs can be learned to achieve 100% test accuracy on test inputs that are 500x longer than the training samples.
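To make the idea of a controller operating a non-differentiable machine with built-in recursion more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's machine or model: the instruction names (`SHIFT`, `REDUCE`, `CALL`, `RETURN`, `HALT`), the toy grammar (sums of identifiers with parentheses), and the hand-written `policy` function are all hypothetical. In the setting the abstract describes, the choice of instruction would be made by a learned neural program rather than by fixed rules.

```python
from typing import List, Optional, Tuple, Union

Tree = Union[str, Tuple]  # a raw token, or a node (label, child, ...)


class StackMachine:
    """Toy machine (hypothetical): discrete instructions mutate a stack of frames."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = list(tokens)
        self.pos = 0
        self.frames: List[List[Tree]] = [[]]  # one frame per active CALL

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def shift(self) -> None:
        # Move the next input token onto the current frame.
        self.frames[-1].append(self.tokens[self.pos])
        self.pos += 1

    def reduce(self, n: int, label: str) -> None:
        # Replace the top n items of the current frame with one tree node.
        frame = self.frames[-1]
        children = frame[-n:]
        del frame[-n:]
        frame.append((label, *children))

    def call(self) -> None:
        # Recursion: start parsing a sub-expression in a fresh frame.
        self.frames.append([])

    def ret(self) -> None:
        # Hand the finished subtree back to the calling frame.
        done = self.frames.pop()
        assert len(done) == 1
        self.frames[-1].append(done[0])


def policy(m: StackMachine) -> tuple:
    """Hand-written stand-in for a learned controller (assumption)."""
    frame, nxt = m.frames[-1], m.peek()
    top = frame[-1] if frame else None
    if top == "(":
        return ("CALL",)
    if top == ")":
        return ("REDUCE", 3, "paren")
    if len(frame) >= 3 and frame[-2] == "+":
        return ("REDUCE", 3, "add")
    if len(frame) == 1 and nxt in (")", None):
        return ("RETURN",) if len(m.frames) > 1 else ("HALT",)
    return ("SHIFT",)


def run(tokens: List[str]) -> Tree:
    """Execute instructions chosen by the policy until it halts."""
    m = StackMachine(tokens)
    while True:
        op = policy(m)
        if op[0] == "HALT":
            return m.frames[0][0]
        if op[0] == "SHIFT":
            m.shift()
        elif op[0] == "REDUCE":
            m.reduce(op[1], op[2])
        elif op[0] == "CALL":
            m.call()
        else:
            m.ret()


print(run("a + ( b + c )".split()))
```

Because the controller only ever inspects the top of the current frame and the next token, the same decisions apply regardless of input length or nesting depth. That is one informal way to read the abstract's claim that restricting the search space with a machine and building in recursion helps generalization to much longer inputs.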

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Neural-based Program Decompiler
cs.PL 2019-06 unverdicted novelty 7.0

Coda is an end-to-end neural decompiler that recovers source code from binaries at 82% accuracy on unseen samples where conventional tools achieve 0%.