pith. sign in

arxiv: 2604.21191 · v1 · submitted 2026-04-23 · 💻 cs.CL · cs.FL

Prefix Parsing is Just Parsing

Pith reviewed 2026-05-09 22:40 UTC · model grok-4.3

classification 💻 cs.CL cs.FL
keywords prefixparsinggrammarefficientgiveninputlanguageordinary
0
0 comments X

The pith

Prefix parsing is solved by a grammar transformation that turns it into standard parsing on a slightly larger grammar.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Prefix parsing checks whether a partial input string can be completed to a full valid sentence according to a grammar, and in weighted cases it also gives the probability of that prefix. The authors create a new grammar whose complete strings are exactly the prefixes of the original grammar's strings. Any standard parser can then be run on this new grammar to answer prefix questions. They also describe how to use differentiation to get the weights for all possible next tokens efficiently.

Core claim

We introduce the prefix grammar transformation, an efficient reduction of prefix parsing to ordinary parsing. Given a grammar, our method constructs another grammar that generates exactly the prefixes of its original strings.

Load-bearing premise

The transformed grammar generates exactly the prefixes of the strings generated by the original grammar, and the size increase remains small enough for practical use.

Figures

Figures reproduced from arXiv: 2604.21191 by Andreas Opedal, Clemente Pasti, Ryan Cotterell, Timothy J. O'Donnell, Tim Vieira.

Figure 1
Figure 1. Figure 1: We compare parsing, prefix parsing, and the next￾token weight vector algorithm on WSJ 5000 (Luong et al., 2013, EarleyX), a large grammar with 35,016 rules, 448 non￾terminals, and a total size of 116,667. The underlying parser is Earley’s (see Alg. 4), paired with its next-token variant (§§ 4 and F.2). Results are shown for 500 strings on a log–log plot; to obtain data across all string lengths, we include… view at source ↗
Figure 2
Figure 2. Figure 2: Left: Our prefix parser vs. EarleyX (Luong et al., 2013). We compare the two implementations on WSJ 5000 and 500 sentences. Our algorithm uses Earley’s with the prefix grammar (Alg. 3), while EarleyX uses Stolcke’s (1995) algorithm to compute prefix probabilities (itself an adaptation of Earley’s). As in [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗
read the original abstract

Prefix parsing asks whether an input prefix can be extended to a complete string generated by a given grammar. In the weighted setting, it also provides prefix probabilities, which are central to context-free language modeling, psycholinguistic analysis, and syntactically constrained generation from large language models. We introduce the prefix grammar transformation, an efficient reduction of prefix parsing to ordinary parsing. Given a grammar, our method constructs another grammar that generates exactly the prefixes of its original strings. Prefix parsing is then solved by applying any ordinary parsing algorithm on the transformed grammar without modification. The reduction is both elegant and practical: the transformed grammar is only a small factor larger than the input, and any optimized implementation can be used directly, eliminating the need for bespoke prefix-parsing algorithms. We also present a strategy-based on algorithmic differentiation-for computing the next-token weight vector, i.e., the prefix weights of all one-token extensions, enabling efficient prediction of the next token. Together, these contributions yield a simple, general, and efficient framework for prefix parsing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on standard properties of context-free grammars and parsing algorithms; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • standard math Context-free grammars admit transformations that preserve prefix languages while remaining context-free.
    Invoked to guarantee the transformed grammar is usable with ordinary parsers.

pith-pipeline@v0.9.0 · 5478 in / 974 out tokens · 19821 ms · 2026-05-09T22:40:40.863197+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

170 extracted references · 170 canonical work pages

  1. [1]

    Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng

    Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. https://www.usenix.org/s...

  2. [2]

    J. K. Baker. 1979. https://compcalc.github.io/public/baker/baker1979trainable.pdf Trainable grammars for speech recognition . In Speech Communication Papers for the Meeting of the Acoustical Society of America

  3. [3]

    Yehoshua Bar-Hillel, Micha Asher Perles, and Eli Shamir. 1961. https://www.proquest.com/openview/fb41296047fb7453dcb1de182b4aa0b6/1 On formal properties of simple phrase-structure grammars . Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung, 14

  4. [4]

    Shalev Ben-David. 2022. https://cs.uwaterloo.ca/ s4bendav/files/CS360S22Lec10.pdf Lecture 10: Closure properties for context‐free languages ( CS 360, Winter 2022 )

  5. [5]

    doi: 10.1109/t-c.1973.223746

    Taylor L. Booth and Richard A. Thompson. 1973. https://doi.org/10.1109/T-C.1973.223746 Applying probability measures to abstract languages . IEEE Transactions on Computers, 22(5)

  6. [6]

    James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake Vander P las, Skye Wanderman- M ilne, and Qiao Zhang. 2018. http://github.com/google/jax JAX : Composable transformations of Python + NumPy programs

  7. [7]

    John Canny, David Hall, and Dan Klein. 2013. https://aclanthology.org/D13-1195/ A multi-teraflop constituency parser using GPU s . In Proceedings of the Conference on Empirical Methods in Natural Language Processing

  8. [8]

    Zhiyi Chi. 1999. https://aclanthology.org/J99-1004/ Statistical properties of probabilistic context-free grammars . Computational Linguistics, 25(1)

  9. [9]

    Schwartz

    John Cocke and Jacob T. Schwartz. 1970. http://www.softwarepreservation.org/projects/FORTRAN/CockeSchwartz_ProgLangCompilers.pdf Programming languages and their compilers: Preliminary notes . Technical report, Courant Institute of Mathematical Sciences, New York University

  10. [10]

    Manfred Droste, Werner Kuich, and Heiko Vogler. 2009. https://doi.org/10.1007/978-3-642-01492-5 Handbook of Weighted Automata . Springer

  11. [11]

    Aaron Joseph Dunlop. 2014. https://digitalcollections.ohsu.edu/record/2702 Efficient Latent-Variable Grammars: Learning and Inference . Ph.D. thesis, Oregon Health & Science University

  12. [12]

    Jay Earley. 1970. https://doi.org/10.1145/362007.362035 An efficient context-free parsing algorithm . Communications of the Association for Computing Machinery, 13(2)

  13. [13]

    Jason Eisner. 2016. https://aclanthology.org/W16-5901 Inside-outside and forward-backward algorithms are just backprop (tutorial paper) . In Proceedings of the Workshop on Structured Prediction for NLP

  14. [14]

    Jason Eisner and John Blatz. 2007. http://cs.jhu.edu/ jason/papers/#eisner-blatz-2007 Program transformations for optimization of parsing algorithms and other weighted logic programs . In Proceedings of the Conference on Formal Grammar

  15. [15]

    Jason Eisner, Eric Goldlust, and Noah A. Smith. 2005. https://aclanthology.org/H05-1036/ Compiling comp ling: Weighted dynamic programming and the Dyna language . In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing

  16. [16]

    Javier Esparza, Stefan Kiefer, and Michael Luttenberger. 2007. https://link.springer.com/chapter/10.1007/978-3-540-73208-2_17 An extension of Newton's method to -continuous semirings . In Developments in Language Theory

  17. [17]

    Javier Esparza, Stefan Kiefer, and Michael Luttenberger. 2008. https://doi.org/10.1007/978-3-540-70583-3_2 Newton 's method for -continuous semirings . In Automata, Languages and Programming

  18. [18]

    Joshua Goodman. 1999. https://aclanthology.org/J99-4004 Semiring parsing . Computational Linguistics, 25(4)

  19. [19]

    Andreas Griewank and Andrea Walther. 2008. https://epubs.siam.org/doi/book/10.1137/1.9780898717761 Evaluating Derivatives . SIAM

  20. [20]

    John Hale. 2001. https://doi.org/10.3115/1073336.1073357 A probabilistic Earley parser as a psycholinguistic model . In Proceedings of the Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies

  21. [21]

    Keith B. Hall. 2005. https://www.proquest.com/openview/eeb76c0001e6d80db7c72be0ea977440/ Best-first Word-lattice Parsing: Techniques for Integrated Syntactic Language Modeling . Ph.D. thesis, Brown University

  22. [22]

    Laurent Hasco \" e t and Val \' e rie Pascual. 2013. https://doi.org/10.1145/2450153.2450158 The Tapenade automatic differentiation tool: Principles, model, and specification . Association for Computing Machinery Transactions on Mathematical Software, 39(3)

  23. [23]

    Lafferty

    Frederick Jelinek and John D. Lafferty. 1991. https://aclanthology.org/J91-3004 Computation of the probability of initial substring generation by stochastic context-free grammars . Computational Linguistics, 17(3)

  24. [24]

    Tadao Kasami. 1965. https://www.ideals.illinois.edu/handle/2142/74304 An efficient recognition and syntax algorithm for context-free languages . Technical Report AF-CRL-65-758 , Air Force Cambridge Research Laboratory

  25. [25]

    Jeffrey Kegler. 2023. https://arxiv.org/abs/1910.08129 Marpa , a practical general parser: the recognizer

  26. [26]

    Martin Lange and Hans Lei . 2009. https://www.informaticadidactica.de/uploads/Artikel/LangeLeiss2009/LangeLeiss2009.pdf To CNF or not to CNF ? A n efficient yet presentable version of the CYK algorithm . Informatica Didactica, 8

  27. [27]

    Lew, Tim Vieira, and Timothy J

    Jo \ a o Loula, Benjamin LeBrun, Li Du, Ben Lipkin, Clemente Pasti, Gabriel Grand, Tianyu Liu, Yahya Emara, Marjorie Freedman, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Alexander K. Lew, Tim Vieira, and Timothy J. O'Donnell. 2025. https://openreview.net/forum?id=xoXn62FzD0 Syntactic and semantic control of large language models via sequential Monte...

  28. [28]

    Frank, and Mark Johnson

    Minh-Thang Luong, Michael C. Frank, and Mark Johnson. 2013. https://doi.org/10.1162/tacl_a_00230 Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning . Transactions of the Association for Computational Linguistics, 1

  29. [29]

    Mark-Jan Nederhof. 1999. https://aclanthology.org/J99-3002/ The computational complexity of the correct-prefix property for TAG s . Computational Linguistics, 25(3)

  30. [30]

    Mark-Jan Nederhof and Giorgio Satta. 2008. https://doi.org/10.1007/978-3-540-78291-9_7 Probabilistic Parsing , chapter 7. Springer

  31. [31]

    Franz Nowak and Ryan Cotterell. 2023. https://aclanthology.org/2023.acl-short.6 A fast algorithm for computing prefix probabilities . In Proceedings of the Annual Meeting of the Association for Computational Linguistics

  32. [32]

    Andreas Opedal, Anej Svete, Alexandra Butoi, Franz Nowak, and Ryan Cotterell. 2022. https://drive.google.com/file/d/16kzFC7sjhH4TW1tXhd-h8YDoJPkBx_mp/view Lecture 7: Grammar transformations

  33. [33]

    Andreas Opedal, Ran Zmigrod, Tim Vieira, Ryan Cotterell, and Jason Eisner. 2023. https://aclanthology.org/2023.acl-long.204 Efficient semiring-weighted Earley parsing . In Proceedings of the Annual Meeting of the Association for Computational Linguistics

  34. [34]

    Clemente Pasti, Andreas Opedal, Tiago Pimentel, Tim Vieira, Jason Eisner, and Ryan Cotterell. 2023. https://aclanthology.org/2023.eacl-main.52 On the intersection of context-free and regular languages . In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics

  35. [35]

    Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf Automatic differentiation in PyTorch . In Proceedings of the NeurIPS Autodiff Workshop

  36. [36]

    Gabriel Poesia, Alex Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, and Sumit Gulwani. 2022. https://openreview.net/forum?id=KmtVD97J43e Synchromesh : Reliable code generation from pre-trained language models . In Proceedings of the International Conference on Learning Representations

  37. [37]

    Alexander Rush. 2020. https://aclanthology.org/2020.acl-demos.38/ Torch-Struct : Deep structured prediction library . In Proceedings of the Annual Meeting of the Association for Computational Linguistics: System Demonstrations

  38. [38]

    Yves Schabes and Aravind K. Joshi. 1988. https://aclanthology.org/P88-1032/ An Earley -type parsing algorithm for tree adjoining grammars . In Proceedings of the Annual Meeting of the Association for Computational Linguistics

  39. [39]

    Richard Shin, Christopher Lin, Sam Thomson, Charles Chen, Subhro Roy, Emmanouil Antonios Platanios, Adam Pauls, Dan Klein, Jason Eisner, and Benjamin Van Durme. 2021. https://aclanthology.org/2021.emnlp-main.608 Constrained language models yield few-shot semantic parsers . In Proceedings of the Conference on Empirical Methods in Natural Language Processing

  40. [40]

    Milo s Stanojevi \'c and Laurent Sartran. 2023. https://aclanthology.org/2023.emnlp-demo.32/ SynJax : Structured probability distributions for JAX . In Proceedings of the Conference on Empirical Methods in Natural Language Processing: System Demonstrations

  41. [41]

    Andreas Stolcke. 1995. https://aclanthology.org/J95-2002 An efficient probabilistic context-free parsing algorithm that computes prefix probabilities . Computational Linguistics, 21(2)

  42. [42]

    Leslie G. Valiant. 1975. https://www.sciencedirect.com/science/article/pii/S0022000075800468 General context-free recognition in less than cubic time . Journal of Computer and System Sciences, 10(2)

  43. [43]

    Youngmin Yi, Chao-Yue Lai, Slav Petrov, and Kurt Keutzer. 2011. https://aclanthology.org/W11-2921/ Efficient parallel CKY parsing on GPU s . In Proceedings of the International Conference on Parsing Technologies

  44. [44]

    Daniel H. Younger. 1967. https://www.sciencedirect.com/science/article/pii/S001999586780007X Recognition and parsing of context-free languages in time n^3 . Information and Control, 10(2)

  45. [45]

    Theoretical Computer Science , volume =

    Baur, Walter and Strassen, Volker , title =. Theoretical Computer Science , volume =. 1983 , url =

  46. [46]

    Martin Lange and Hans Lei. To. Informatica Didactica , volume =. 2009 , url =

  47. [47]

    2011 , volume =

    Might, Matthew and Darais, David and Spiewak, Daniel , title =. 2011 , volume =

  48. [48]

    Proceedings of the Conference on Empirical Methods in Natural Language Processing , year =

    Deriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing , author =. Proceedings of the Conference on Empirical Methods in Natural Language Processing , year =

  49. [49]

    Syntactic and Semantic Control of Large Language Models via Sequential

    Jo. Syntactic and Semantic Control of Large Language Models via Sequential. Proceedings of the International Conference on Learning Representations , year=

  50. [50]

    2011 , url =

    Parr, Terence and Fisher, Kathleen , title =. 2011 , url =

  51. [51]

    Rosenkrantz and R.E

    D.J. Rosenkrantz and R.E. Stearns , title =. Information and Control , volume =. 1970 , url =

  52. [52]

    1965 , url =

    On the translation of languages from left to right , journal =. 1965 , url =

  53. [53]

    , title =

    Brzozowski, Janusz A. , title =. 1964 , volume =

  54. [54]

    Computational Linguistics , volume =

    Computation of the Probability of Initial Substring Generation by Stochastic Context-Free Grammars , author =. Computational Linguistics , volume =. 1991 , url =

  55. [55]

    Baker, J. K. , booktitle =. Trainable grammars for speech recognition , year =

  56. [56]

    1970 , volume =

    Earley, Jay , title =. 1970 , volume =

  57. [57]

    Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =

    A Fast Algorithm for Computing Prefix Probabilities , author =. Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =

  58. [58]

    Computational Linguistics , volume =

    An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities , author =. Computational Linguistics , volume =. 1995 , url =

  59. [59]

    The International Conference on Computational Linguistics , year =

    Finite-state Approximation of Constraint-based Grammars using Left-corner Grammar Transforms , author =. The International Conference on Computational Linguistics , year =

  60. [60]

    Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung , volume = 14, year = 1961, issue = 2, url =

    On Formal Properties of Simple Phrase-Structure Grammars , author =. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung , volume = 14, year = 1961, issue = 2, url =

  61. [61]

    Proceedings of the International Conference on Parsing Technologies , year =

    Probabilistic Parsing as Intersection , author =. Proceedings of the International Conference on Parsing Technologies , year =

  62. [62]

    2022 , url =

    On quotients of formal power series , journal =. 2022 , url =

  63. [63]

    2009 , volume =

    Nederhof, Mark-Jan and Satta, Giorgio , title =. 2009 , volume =

  64. [64]

    Theoretical Computer Science , volume=

    Partial derivatives of regular expressions and finite automaton constructions , author=. Theoretical Computer Science , volume=. 1996 , url=

  65. [65]

    2009 , volume =

    Etessami, Kousha and Yannakakis, Mihalis , title =. 2009 , volume =

  66. [66]

    Quotient Complexity of Regular Languages , author=. J. Autom. Lang. Comb. , year=

  67. [67]

    Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =

    Relating Probabilistic Grammars and Automata , author =. Proceedings of the Annual Meeting of the Association for Computational Linguistics , year =

  68. [68]

    2008 , url =

    Gondran, Michel and Minoux, Michel , title =. 2008 , url =

  69. [69]

    Coling: Advanced Dynamic Programming in Computational Linguistics: Theory, Algorithms and Applications - Tutorial notes , year =

    Advanced Dynamic Programming in Semiring and Hypergraph Frameworks , author =. Coling: Advanced Dynamic Programming in Computational Linguistics: Theory, Algorithms and Applications - Tutorial notes , year =

  70. [70]

    A Lattice-Theoretical Fixpoint Theorem and its Applications , volume =

    Tarski, Alfred , journal =. A Lattice-Theoretical Fixpoint Theorem and its Applications , volume =. 1955 , url =

  71. [71]

    Proceedings of the Conference on Formal Grammar , year =

    Jason Eisner and John Blatz , title =. Proceedings of the Conference on Formal Grammar , year =

  72. [72]

    Allauzen, Cyril and Mohri, Mehryar , title =. J. Autom. Lang. Comb. , url =. 2003 , volume =

  73. [73]

    Frederick Jelinek , title =

  74. [74]

    and Hollenbeck, Celeste and Might, Matthew , title =

    Adams, Michael D. and Hollenbeck, Celeste and Might, Matthew , title =. 2016 , url =

  75. [75]

    Information Processing Letters , volume =

    A generalization of. Information Processing Letters , volume =. 1977 , url =

  76. [76]

    Computational Linguistics , volume =

    Training Tree Transducers , author =. Computational Linguistics , volume =. 2008 , url =

  77. [77]

    1967 , url =

    Recognition and parsing of context-free languages in time n^3 , journal =. 1967 , url =

  78. [78]

    Dijkstra, E. W. , journal =. A Note on Two Problems in Connexion with Graphs , url =

  79. [79]

    1959 , url =

    On certain formal properties of grammars , journal =. 1959 , url =

  80. [80]

    1965 , institution =

    An Efficient Recognition and Syntax Algorithm for Context-free Languages , author =. 1965 , institution =

Showing first 80 references.