Grammar Prompting for Domain-Specific Language Generation with Large Language Models

Bailin Wang; Rif A. Saurous; Xuezhi Wang; Yoon Kim; Yuan Cao; Zi Wang

arxiv: 2305.19234 · v3 · pith:B2I4DBWE · submitted 2023-05-30 · cs.CL · cs.AI

keywords grammar · language · prompting · domain-specific · generation · llms · enable · example

Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples. However, for generating strings from highly structured languages (e.g., semantic parsing to complex domain-specific languages), it is challenging for the LLM to generalize from just a few exemplars. We propose grammar prompting, a simple approach to enable LLMs to use external knowledge and domain-specific constraints, expressed through a grammar in Backus-Naur Form (BNF), during in-context learning. Grammar prompting augments each demonstration example with a specialized grammar that is minimally sufficient for generating the particular output example, where the specialized grammar is a subset of the full DSL grammar. For inference, the LLM first predicts a BNF grammar given a test input, and then generates the output according to the rules of the grammar. Experiments demonstrate that grammar prompting can enable LLMs to perform competitively on a diverse set of DSL generation tasks, including semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and SMILES-based molecule generation.
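
The idea of a specialized grammar that is "minimally sufficient" for one output can be illustrated with a small sketch. The toy calendar DSL, the helper names (`parse`, `specialize`, `to_bnf`, `build_prompt`), and the prompt layout below are all assumptions made for illustration; they are not taken from the paper. The sketch parses a demonstration output against a full grammar, keeps only the rules used in that derivation, and places the resulting BNF between the input and the output of the demonstration, so that a model would be cued to predict a grammar first.

```python
from typing import Dict, List, Optional, Set, Tuple

Rule = Tuple[str, Tuple[str, ...]]
Grammar = Dict[str, List[Tuple[str, ...]]]

# Hypothetical toy DSL (an assumption, not the grammar used in the paper).
# A symbol is a nonterminal if it is a key of the dict, otherwise a terminal.
FULL_GRAMMAR: Grammar = {
    "call": [("create", "(", "args", ")"), ("delete", "(", "args", ")")],
    "args": [("arg", ",", "args"), ("arg",)],
    "arg": [("subject",), ("start",), ("attendee",)],
}


def parse(grammar: Grammar, symbol: str, tokens: List[str],
          pos: int) -> Optional[Tuple[int, List[Rule]]]:
    """Ordered-choice recursive descent; returns (next position, rules used)."""
    if symbol not in grammar:  # terminal: must match the current token
        if pos < len(tokens) and tokens[pos] == symbol:
            return pos + 1, []
        return None
    for rhs in grammar[symbol]:
        cur, used = pos, [(symbol, rhs)]
        for sym in rhs:
            result = parse(grammar, sym, tokens, cur)
            if result is None:
                break  # this alternative failed, try the next one
            cur, sub = result
            used += sub
        else:
            return cur, used
    return None


def specialize(grammar: Grammar, start: str, tokens: List[str]) -> Grammar:
    """Keep only the rules that appear in the derivation of `tokens`."""
    result = parse(grammar, start, tokens, 0)
    if result is None or result[0] != len(tokens):
        raise ValueError("output is not derivable from the full grammar")
    minimal: Grammar = {}
    for lhs, rhs in result[1]:
        if rhs not in minimal.setdefault(lhs, []):
            minimal[lhs].append(rhs)
    return minimal


def to_bnf(grammar: Grammar, nonterminals: Set[str]) -> str:
    """Render rules as BNF text, quoting terminals."""
    lines = []
    for lhs, alternatives in grammar.items():
        rendered = [
            " ".join(s if s in nonterminals else f'"{s}"' for s in rhs)
            for rhs in alternatives
        ]
        lines.append(f"{lhs} ::= " + " | ".join(rendered))
    return "\n".join(lines)


def build_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Assumed layout: input, specialized grammar, output; ends at 'Grammar:'."""
    parts = []
    for utterance, program in demos:
        minimal = specialize(FULL_GRAMMAR, "call", program.split())
        bnf = to_bnf(minimal, set(FULL_GRAMMAR))
        parts.append(f"Input: {utterance}\nGrammar:\n{bnf}\nOutput: {program}\n")
    parts.append(f"Input: {query}\nGrammar:\n")
    return "\n".join(parts)


demo = ("schedule a standup and set its start time", "create ( subject , start )")
prompt = build_prompt([demo], "remove the meeting with my manager")
print(prompt)
```

In this sketch the demonstration's grammar omits the `delete` and `attendee` rules because the demonstration output never uses them. The prompt stops at the `Grammar:` cue for the test input, which mirrors the two-stage inference described in the abstract: a grammar is predicted first, and the output is then generated under it. Calling an actual LLM and constraining its decoding are left out.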

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Sequential Planning via Anchored Robotic Keypoints
cs.RO 2026-06 unverdicted novelty 6.0

SPARK reaches 43.7% success on six LIBERO-PRO cells by LLM-generated typed behavior trees plus multi-prompt perception and recovery, more than doubling CaP-Agent0 and VLA baselines.

Context-Instrumental Data Distillation for Kubernetes Manifest Generation: Method and Experimental Evaluation
cs.LG 2026-05 unverdicted novelty 4.0

Context-instrumental data distillation allows a 1.5B SLM to generate valid Kubernetes manifests at 91.5% pass@1 rate, with strict output formatting proving more impactful than additional training data.