Prefix Parsing is Just Parsing
Pith reviewed 2026-05-09 22:40 UTC · model grok-4.3
The pith
Prefix parsing is solved by a grammar transformation that turns it into standard parsing on a slightly larger grammar.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce the prefix grammar transformation, an efficient reduction of prefix parsing to ordinary parsing. Given a grammar, our method constructs another grammar that generates exactly the prefixes of its original strings.
Load-bearing premise
The transformed grammar generates exactly the prefixes of the strings generated by the original grammar, and the size increase remains small enough for practical use.
Figures
read the original abstract
Prefix parsing asks whether an input prefix can be extended to a complete string generated by a given grammar. In the weighted setting, it also provides prefix probabilities, which are central to context-free language modeling, psycholinguistic analysis, and syntactically constrained generation from large language models. We introduce the prefix grammar transformation, an efficient reduction of prefix parsing to ordinary parsing. Given a grammar, our method constructs another grammar that generates exactly the prefixes of its original strings. Prefix parsing is then solved by applying any ordinary parsing algorithm on the transformed grammar without modification. The reduction is both elegant and practical: the transformed grammar is only a small factor larger than the input, and any optimized implementation can be used directly, eliminating the need for bespoke prefix-parsing algorithms. We also present a strategy-based on algorithmic differentiation-for computing the next-token weight vector, i.e., the prefix weights of all one-token extensions, enabling efficient prediction of the next token. Together, these contributions yield a simple, general, and efficient framework for prefix parsing.
Editorial analysis
A structured set of objections, weighed in public.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Context-free grammars admit transformations that preserve prefix languages while remaining context-free.
Reference graph
Works this paper leans on
-
[1]
Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. https://www.usenix.org/s...
work page 2016
-
[2]
J. K. Baker. 1979. https://compcalc.github.io/public/baker/baker1979trainable.pdf Trainable grammars for speech recognition . In Speech Communication Papers for the Meeting of the Acoustical Society of America
work page 1979
-
[3]
Yehoshua Bar-Hillel, Micha Asher Perles, and Eli Shamir. 1961. https://www.proquest.com/openview/fb41296047fb7453dcb1de182b4aa0b6/1 On formal properties of simple phrase-structure grammars . Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung, 14
work page 1961
-
[4]
Shalev Ben-David. 2022. https://cs.uwaterloo.ca/ s4bendav/files/CS360S22Lec10.pdf Lecture 10: Closure properties for context‐free languages ( CS 360, Winter 2022 )
work page 2022
-
[5]
Taylor L. Booth and Richard A. Thompson. 1973. https://doi.org/10.1109/T-C.1973.223746 Applying probability measures to abstract languages . IEEE Transactions on Computers, 22(5)
-
[6]
James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake Vander P las, Skye Wanderman- M ilne, and Qiao Zhang. 2018. http://github.com/google/jax JAX : Composable transformations of Python + NumPy programs
work page 2018
-
[7]
John Canny, David Hall, and Dan Klein. 2013. https://aclanthology.org/D13-1195/ A multi-teraflop constituency parser using GPU s . In Proceedings of the Conference on Empirical Methods in Natural Language Processing
work page 2013
-
[8]
Zhiyi Chi. 1999. https://aclanthology.org/J99-1004/ Statistical properties of probabilistic context-free grammars . Computational Linguistics, 25(1)
work page 1999
- [9]
-
[10]
Manfred Droste, Werner Kuich, and Heiko Vogler. 2009. https://doi.org/10.1007/978-3-642-01492-5 Handbook of Weighted Automata . Springer
-
[11]
Aaron Joseph Dunlop. 2014. https://digitalcollections.ohsu.edu/record/2702 Efficient Latent-Variable Grammars: Learning and Inference . Ph.D. thesis, Oregon Health & Science University
work page 2014
-
[12]
Jay Earley. 1970. https://doi.org/10.1145/362007.362035 An efficient context-free parsing algorithm . Communications of the Association for Computing Machinery, 13(2)
-
[13]
Jason Eisner. 2016. https://aclanthology.org/W16-5901 Inside-outside and forward-backward algorithms are just backprop (tutorial paper) . In Proceedings of the Workshop on Structured Prediction for NLP
work page 2016
-
[14]
Jason Eisner and John Blatz. 2007. http://cs.jhu.edu/ jason/papers/#eisner-blatz-2007 Program transformations for optimization of parsing algorithms and other weighted logic programs . In Proceedings of the Conference on Formal Grammar
work page 2007
-
[15]
Jason Eisner, Eric Goldlust, and Noah A. Smith. 2005. https://aclanthology.org/H05-1036/ Compiling comp ling: Weighted dynamic programming and the Dyna language . In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing
work page 2005
-
[16]
Javier Esparza, Stefan Kiefer, and Michael Luttenberger. 2007. https://link.springer.com/chapter/10.1007/978-3-540-73208-2_17 An extension of Newton's method to -continuous semirings . In Developments in Language Theory
-
[17]
Javier Esparza, Stefan Kiefer, and Michael Luttenberger. 2008. https://doi.org/10.1007/978-3-540-70583-3_2 Newton 's method for -continuous semirings . In Automata, Languages and Programming
-
[18]
Joshua Goodman. 1999. https://aclanthology.org/J99-4004 Semiring parsing . Computational Linguistics, 25(4)
work page 1999
-
[19]
Andreas Griewank and Andrea Walther. 2008. https://epubs.siam.org/doi/book/10.1137/1.9780898717761 Evaluating Derivatives . SIAM
-
[20]
John Hale. 2001. https://doi.org/10.3115/1073336.1073357 A probabilistic Earley parser as a psycholinguistic model . In Proceedings of the Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies
-
[21]
Keith B. Hall. 2005. https://www.proquest.com/openview/eeb76c0001e6d80db7c72be0ea977440/ Best-first Word-lattice Parsing: Techniques for Integrated Syntactic Language Modeling . Ph.D. thesis, Brown University
work page 2005
-
[22]
Laurent Hasco \" e t and Val \' e rie Pascual. 2013. https://doi.org/10.1145/2450153.2450158 The Tapenade automatic differentiation tool: Principles, model, and specification . Association for Computing Machinery Transactions on Mathematical Software, 39(3)
- [23]
-
[24]
Tadao Kasami. 1965. https://www.ideals.illinois.edu/handle/2142/74304 An efficient recognition and syntax algorithm for context-free languages . Technical Report AF-CRL-65-758 , Air Force Cambridge Research Laboratory
work page 1965
- [25]
-
[26]
Martin Lange and Hans Lei . 2009. https://www.informaticadidactica.de/uploads/Artikel/LangeLeiss2009/LangeLeiss2009.pdf To CNF or not to CNF ? A n efficient yet presentable version of the CYK algorithm . Informatica Didactica, 8
work page 2009
-
[27]
Lew, Tim Vieira, and Timothy J
Jo \ a o Loula, Benjamin LeBrun, Li Du, Ben Lipkin, Clemente Pasti, Gabriel Grand, Tianyu Liu, Yahya Emara, Marjorie Freedman, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Alexander K. Lew, Tim Vieira, and Timothy J. O'Donnell. 2025. https://openreview.net/forum?id=xoXn62FzD0 Syntactic and semantic control of large language models via sequential Monte...
work page 2025
-
[28]
Minh-Thang Luong, Michael C. Frank, and Mark Johnson. 2013. https://doi.org/10.1162/tacl_a_00230 Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning . Transactions of the Association for Computational Linguistics, 1
-
[29]
Mark-Jan Nederhof. 1999. https://aclanthology.org/J99-3002/ The computational complexity of the correct-prefix property for TAG s . Computational Linguistics, 25(3)
work page 1999
-
[30]
Mark-Jan Nederhof and Giorgio Satta. 2008. https://doi.org/10.1007/978-3-540-78291-9_7 Probabilistic Parsing , chapter 7. Springer
-
[31]
Franz Nowak and Ryan Cotterell. 2023. https://aclanthology.org/2023.acl-short.6 A fast algorithm for computing prefix probabilities . In Proceedings of the Annual Meeting of the Association for Computational Linguistics
work page 2023
-
[32]
Andreas Opedal, Anej Svete, Alexandra Butoi, Franz Nowak, and Ryan Cotterell. 2022. https://drive.google.com/file/d/16kzFC7sjhH4TW1tXhd-h8YDoJPkBx_mp/view Lecture 7: Grammar transformations
work page 2022
-
[33]
Andreas Opedal, Ran Zmigrod, Tim Vieira, Ryan Cotterell, and Jason Eisner. 2023. https://aclanthology.org/2023.acl-long.204 Efficient semiring-weighted Earley parsing . In Proceedings of the Annual Meeting of the Association for Computational Linguistics
work page 2023
-
[34]
Clemente Pasti, Andreas Opedal, Tiago Pimentel, Tim Vieira, Jason Eisner, and Ryan Cotterell. 2023. https://aclanthology.org/2023.eacl-main.52 On the intersection of context-free and regular languages . In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics
work page 2023
-
[35]
Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf Automatic differentiation in PyTorch . In Proceedings of the NeurIPS Autodiff Workshop
work page 2017
-
[36]
Gabriel Poesia, Alex Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, and Sumit Gulwani. 2022. https://openreview.net/forum?id=KmtVD97J43e Synchromesh : Reliable code generation from pre-trained language models . In Proceedings of the International Conference on Learning Representations
work page 2022
-
[37]
Alexander Rush. 2020. https://aclanthology.org/2020.acl-demos.38/ Torch-Struct : Deep structured prediction library . In Proceedings of the Annual Meeting of the Association for Computational Linguistics: System Demonstrations
work page 2020
-
[38]
Yves Schabes and Aravind K. Joshi. 1988. https://aclanthology.org/P88-1032/ An Earley -type parsing algorithm for tree adjoining grammars . In Proceedings of the Annual Meeting of the Association for Computational Linguistics
work page 1988
-
[39]
Richard Shin, Christopher Lin, Sam Thomson, Charles Chen, Subhro Roy, Emmanouil Antonios Platanios, Adam Pauls, Dan Klein, Jason Eisner, and Benjamin Van Durme. 2021. https://aclanthology.org/2021.emnlp-main.608 Constrained language models yield few-shot semantic parsers . In Proceedings of the Conference on Empirical Methods in Natural Language Processing
work page 2021
-
[40]
Milo s Stanojevi \'c and Laurent Sartran. 2023. https://aclanthology.org/2023.emnlp-demo.32/ SynJax : Structured probability distributions for JAX . In Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations
work page 2023
-
[41]
Andreas Stolcke. 1995. https://aclanthology.org/J95-2002 An efficient probabilistic context-free parsing algorithm that computes prefix probabilities . Computational Linguistics, 21(2)
work page 1995
-
[42]
Leslie G. Valiant. 1975. https://www.sciencedirect.com/science/article/pii/S0022000075800468 General context-free recognition in less than cubic time . Journal of Computer and System Sciences, 10(2)
work page 1975
-
[43]
Youngmin Yi, Chao-Yue Lai, Slav Petrov, and Kurt Keutzer. 2011. https://aclanthology.org/W11-2921/ Efficient parallel CKY parsing on GPU s . In Proceedings of the International Conference on Parsing Technologies
work page 2011
-
[44]
Daniel H. Younger. 1967. https://www.sciencedirect.com/science/article/pii/S001999586780007X Recognition and parsing of context-free languages in time n^3 . Information and Control, 10(2)
work page 1967
-
[45]
Theoretical Computer Science , volume =
Baur, Walter and Strassen, Volker , title =. Theoretical Computer Science , volume =. 1983 , url =
work page 1983
-
[46]
Martin Lange and Hans Lei. To. Informatica Didactica , volume =. 2009 , url =
work page 2009
-
[47]
Might, Matthew and Darais, David and Spiewak, Daniel , title =. 2011 , volume =
work page 2011
-
[48]
Proceedings of the Conference on Empirical Methods in Natural Language Processing , year =
Deriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing , author =. Proceedings of the Conference on Empirical Methods in Natural Language Processing , year =
-
[49]
Syntactic and Semantic Control of Large Language Models via Sequential
Jo. Syntactic and Semantic Control of Large Language Models via Sequential. Proceedings of the International Conference on Learning Representations , year=
- [50]
-
[51]
D.J. Rosenkrantz and R.E. Stearns , title =. Information and Control , volume =. 1970 , url =
work page 1970
-
[52]
On the translation of languages from left to right , journal =. 1965 , url =
work page 1965
- [53]
-
[54]
Computational Linguistics , volume =
Computation of the Probability of Initial Substring Generation by Stochastic Context-Free Grammars , author =. Computational Linguistics , volume =. 1991 , url =
work page 1991
-
[55]
Baker, J. K. , booktitle =. Trainable grammars for speech recognition , year =
- [56]
-
[57]
Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =
A Fast Algorithm for Computing Prefix Probabilities , author =. Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =
-
[58]
Computational Linguistics , volume =
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities , author =. Computational Linguistics , volume =. 1995 , url =
work page 1995
-
[59]
The International Conference on Computational Linguistics , year =
Finite-state Approximation of Constraint-based Grammars using Left-corner Grammar Transforms , author =. The International Conference on Computational Linguistics , year =
-
[60]
On Formal Properties of Simple Phrase-Structure Grammars , author =. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung , volume = 14, year = 1961, issue = 2, url =
work page 1961
-
[61]
Proceedings of the International Conference on Parsing Technologies , year =
Probabilistic Parsing as Intersection , author =. Proceedings of the International Conference on Parsing Technologies , year =
- [62]
- [63]
-
[64]
Theoretical Computer Science , volume=
Partial derivatives of regular expressions and finite automaton constructions , author=. Theoretical Computer Science , volume=. 1996 , url=
work page 1996
- [65]
-
[66]
Quotient Complexity of Regular Languages , author=. J. Autom. Lang. Comb. , year=
-
[67]
Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =
Relating Probabilistic Grammars and Automata , author =. Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =
- [68]
-
[69]
Advanced Dynamic Programming in Semiring and Hypergraph Frameworks , author =. Coling: Advanced Dynamic Programming in Computational Linguistics: Theory, Algorithms and Applications - Tutorial notes , year =
-
[70]
A Lattice-Theoretical Fixpoint Theorem and its Applications , volume =
Tarski, Alfred , journal =. A Lattice-Theoretical Fixpoint Theorem and its Applications , volume =. 1955 , url =
work page 1955
-
[71]
Proceedings of the Conference on Formal Grammar , year =
Jason Eisner and John Blatz , title =. Proceedings of the Conference on Formal Grammar , year =
-
[72]
Allauzen, Cyril and Mohri, Mehryar , title =. J. Autom. Lang. Comb. , url =. 2003 , volume =
work page 2003
-
[73]
Frederick Jelinek , title =
-
[74]
and Hollenbeck, Celeste and Might, Matthew , title =
Adams, Michael D. and Hollenbeck, Celeste and Might, Matthew , title =. 2016 , url =
work page 2016
-
[75]
Information Processing Letters , volume =
A generalization of. Information Processing Letters , volume =. 1977 , url =
work page 1977
-
[76]
Computational Linguistics , volume =
Training Tree Transducers , author =. Computational Linguistics , volume =. 2008 , url =
work page 2008
-
[77]
Recognition and parsing of context-free languages in time n^3 , journal =. 1967 , url =
work page 1967
-
[78]
Dijkstra, E. W. , journal =. A Note on Two Problems in Connexion with Graphs , url =
- [79]
-
[80]
An Efficient Recognition and Syntax Algorithm for Context-free Languages , author =. 1965 , institution =
work page 1965
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.