pith. sign in

arxiv: 2603.10139 · v2 · pith:LYDR2X43new · submitted 2026-03-10 · 💻 cs.CL · cs.AI· cs.CC· cs.FL

The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory

Pith reviewed 2026-05-21 11:13 UTC · model grok-4.3

classification 💻 cs.CL cs.AIcs.CCcs.FL
keywords generation-recognition asymmetryformal language theorycomputational complexityparsingsurprisalbidirectional modelsnatural language processinglarge language models
0
0 comments X

The pith

Generation and recognition characterize the same languages but differ operationally across six independent dimensions, with parsing always constrained by an input while generation need not be.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that every formal grammar can be used to generate strings, recognize them, or be inferred from examples, yet generation and recognition describe the same set while operating differently. It identifies six dimensions of asymmetry: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. The common claim that generation is easy and parsing is hard is reframed because free generation is trivial but constrained generation can be NP-hard, whereas parsing is always limited by the given input string. Two dimensions, directionality and temporality, are presented as previously unidentified, with the temporal aspect tied to surprisal in predictive models where generators face zero surprise and parsers operate under uncertainty. The work also reviews why bidirectional systems have existed for decades without broad adoption in specialized applications and notes that large language models unify the tasks architecturally while retaining the operational divide.

Core claim

Generation and recognition are extensionally equivalent but operationally asymmetric in six ways. Unconstrained generation is trivial while generation under constraints can be NP-hard. Parsing is invariably constrained because the input string is supplied. Directionality and temporality are newly identified dimensions, with temporality linked to the surprisal framework in which a generator incurs zero surprisal and a parser predicts under uncertainty. Bidirectional systems have been feasible for fifty years yet have not transferred to most domain-specific applications. Large language models architecturally combine generation and recognition while preserving the operational asymmetry.

What carries the argument

The six dimensions of operational asymmetry between generation and recognition in formal grammars.

If this is right

  • Unconstrained generation requires no input and is computationally trivial.
  • Generation under constraints can reach NP-hard complexity.
  • Parsing is always limited by the supplied input string.
  • Temporality formalizes as surprisal equal to zero for generators and greater than zero for parsers.
  • Bidirectional systems have been available for fifty years without widespread transfer to domain-specific applications.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework implies that complexity results in formal language theory should explicitly separate generation from recognition tasks.
  • It suggests designing NLP systems that explicitly manage the constraint difference rather than treating the two directions symmetrically.
  • In cognitive modeling, the asymmetry may clarify differences between human language production and comprehension under uncertainty.
  • Empirical tests of dimension independence could be run on specific grammar families to check whether they behave as orthogonal.

Load-bearing premise

The six listed dimensions are independent and that directionality and temporality are previously unidentified dimensions of the asymmetry.

What would settle it

A formal reduction showing that directionality or temporality collapses into one of the other four dimensions, or a concrete grammar where constrained generation remains tractable while parsing is NP-hard.

Figures

Figures reproduced from arXiv: 2603.10139 by Romain Peyrichou.

Figure 1
Figure 1. Figure 1: Shannon’s communication model and the generation-recognition analogy. The encoder (generator) knows the [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Commutative diagram: the round-trip test. Generation followed by recognition should recover the original [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Complexity heatmaps for generation (top) and recognition (bottom). Rows represent task specification; [PITH_FULL_IMAGE:figures/full_fig_p013_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The complexity gap (semi-log scale). Seven curves corresponding to the complexity levels identified in Figure [PITH_FULL_IMAGE:figures/full_fig_p014_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: The three operations: a hierarchy of difficulty. Canonical case (k = 2, simplest tasks). Full ranges: O(n) to < R — see §4.1. The inequality between recognition and inference is sharper than the one between generation and recognition: Pitt and Warmuth showed that PAC-predicting DFAs is “as difficult as inverting the RSA encryption function” (cited in de la Higuera, 2010). Inference is not merely harder — i… view at source ↗
Figure 6
Figure 6. Figure 6: Surprisal asymmetry on the running example [PITH_FULL_IMAGE:figures/full_fig_p022_6.png] view at source ↗
read the original abstract

Every formal grammar defines a language and can in principle be used in three ways: to generate strings (production), to recognize them (parsing), or -- given only examples -- to infer the grammar itself (grammar induction). Generation and recognition are extensionally equivalent -- they characterize the same set -- but operationally asymmetric in multiple independent ways. Inference is a qualitatively harder problem: it does not have access to a known grammar. Despite the centrality of this triad to compiler design, natural language processing, and formal language theory, no survey has treated it as a unified, multidimensional phenomenon. We identify six dimensions along which generation and recognition diverge: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. We show that the common characterization "generation is easy, parsing is hard" is misleading: unconstrained generation is trivial, but generation under constraints can be NP-hard. The real asymmetry is that parsing is always constrained (the input is given) while generation need not be. Two of these dimensions -- directionality and temporality -- have not previously been identified as dimensions of the generation-recognition asymmetry. We connect the temporal dimension to the surprisal framework of Hale (2001) and Levy (2008), arguing that surprisal formalizes the temporal asymmetry between a generator (surprisal = 0) and a parser that predicts under uncertainty (surprisal > 0). We review bidirectional systems in NLP and observe that bidirectionality has been available for fifty years yet has not transferred to most domain-specific applications. We conclude with a discussion of large language models, which architecturally unify generation and recognition while operationally preserving the asymmetry.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript surveys the generation-recognition asymmetry in formal language theory, arguing that generation and recognition are extensionally equivalent but operationally asymmetric along six dimensions: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. It contends that the common characterization 'generation is easy, parsing is hard' is misleading because unconstrained generation is trivial while constrained generation can be NP-hard, with the real asymmetry being that parsing is always input-constrained. The paper claims novelty for directionality and temporality, links the temporal dimension to the surprisal framework of Hale (2001) and Levy (2008), reviews bidirectional NLP systems, and discusses how large language models architecturally unify the two while preserving operational asymmetry.

Significance. If the six dimensions prove independent and the novelty claims for directionality and temporality are substantiated, the work could offer a useful organizing framework for asymmetries in formal language theory, compiler design, and NLP, particularly by connecting to surprisal theory and highlighting underutilized bidirectionality. The conceptual synthesis and discussion of LLMs are potential strengths for guiding future research, though the absence of formal derivations or empirical checks limits its immediate impact to literature organization.

major comments (2)
  1. [Abstract] Abstract: The claim that directionality and temporality 'have not previously been identified as dimensions of the generation-recognition asymmetry' is load-bearing for the paper's novelty contribution but lacks an explicit contrast with prior work on incremental parsing, bidirectional models, or real-time processing. The abstract links temporality to Hale (2001) and Levy (2008) yet does not demonstrate that these dimensions are orthogonal to computational complexity, ambiguity, or information availability (e.g., via a counter-example where directionality varies while holding the other four fixed).
  2. [Abstract] The central claim of six independent dimensions requires substantiation that is not evident from the abstract's literature review. For instance, temporality (real-time prediction under uncertainty vs. zero-surprisal generation) appears closely related to information availability and directionality; without a separation argument or table showing distinct predictions, the multidimensional framing risks collapsing into re-descriptions of known trade-offs rather than new axes.
minor comments (1)
  1. The abstract is information-dense; consider enumerating the six dimensions in a bulleted list or table for improved readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these insightful comments on the abstract and the framing of our contributions. We agree that the novelty claims for directionality and temporality, as well as the independence of the six dimensions, require stronger substantiation to avoid appearing as re-descriptions of existing ideas. We will revise the abstract and add supporting material in the body to address these points directly. Our responses to the major comments follow.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The claim that directionality and temporality 'have not previously been identified as dimensions of the generation-recognition asymmetry' is load-bearing for the paper's novelty contribution but lacks an explicit contrast with prior work on incremental parsing, bidirectional models, or real-time processing. The abstract links temporality to Hale (2001) and Levy (2008) yet does not demonstrate that these dimensions are orthogonal to computational complexity, ambiguity, or information availability (e.g., via a counter-example where directionality varies while holding the other four fixed).

    Authors: We accept this criticism. The full manuscript contrasts our framing with incremental parsing literature (e.g., in the NLP review section), but the abstract does not make this explicit. In revision, we will insert a concise contrast: prior work treats directionality as a processing choice within recognition, whereas we position it as a fundamental operational asymmetry between generation (which can choose direction freely) and recognition (which is input-directed). For orthogonality, we will add a parenthetical counter-example in the abstract: directionality can be reversed in a regular language generator without altering its linear-time complexity or ambiguity profile, holding the other dimensions fixed. This revision will be incorporated. revision: yes

  2. Referee: [Abstract] The central claim of six independent dimensions requires substantiation that is not evident from the abstract's literature review. For instance, temporality (real-time prediction under uncertainty vs. zero-surprisal generation) appears closely related to information availability and directionality; without a separation argument or table showing distinct predictions, the multidimensional framing risks collapsing into re-descriptions of known trade-offs rather than new axes.

    Authors: We agree that the abstract does not provide a separation argument and that temporality overlaps superficially with information availability. However, they are distinct: information availability concerns the presence or absence of the input string at decision time, while temporality concerns the necessity of sequential prediction under uncertainty (surprisal > 0) versus zero-surprisal generation. In the manuscript body we already separate them via different implications for cognitive modeling, but we acknowledge the abstract lacks this. We will add a short table (or enumerated distinctions) in the revised abstract or introduction showing unique predictions for each dimension, including a case where temporality varies independently (e.g., offline vs. online parsing with identical information sets). This addresses the risk of collapse into re-description. revision: yes

Circularity Check

0 steps flagged

No circularity: conceptual survey grounded in external literature

full rationale

The manuscript is a survey-style conceptual analysis that enumerates six dimensions of generation-recognition asymmetry and connects the temporality dimension explicitly to the independent surprisal framework of Hale (2001) and Levy (2008). No equations, fitted parameters, self-citations, or internal definitions are used to derive the central claims; the novelty assertions for directionality and temporality are presented as literature-based observations rather than reductions to the paper's own outputs. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper rests on standard domain assumptions from formal language theory that generation and recognition characterize the same language set, plus the background literature on computational complexity and surprisal; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Generation and recognition characterize the same set of strings
    Explicitly stated in the abstract as 'Generation and recognition are extensionally equivalent -- they characterize the same set'

pith-pipeline@v0.9.0 · 5836 in / 1431 out tokens · 92007 ms · 2026-05-21T11:13:25.977895+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages

  1. [1]

    Hale, A probabilistic Earley parser as a psycholinguistic model, in: Proceedings of NAACL, 2001, pp

    J. Hale, A probabilistic Earley parser as a psycholinguistic model, in: Proceedings of NAACL, 2001, pp. 1–8

  2. [2]

    Levy, Expectation-based syntactic comprehension, Cognition 106 (2008) 1126–1177

    R. Levy, Expectation-based syntactic comprehension, Cognition 106 (2008) 1126–1177

  3. [3]

    E. M. Gold, Language identification in the limit, Information and Control 10 (1967) 447– 474

  4. [4]

    Chomsky, Syntactic Structures, Mouton, 1957

    N. Chomsky, Syntactic Structures, Mouton, 1957

  5. [5]

    A. V . Aho, R. Sethi, J. D. Ullman, Compilers: Principles, Techniques, and Tools, Addison- Wesley, 1986

  6. [6]

    R. C. Berwick, Computational complexity and Lexical-Functional grammar, Computa- tional Linguistics 8 (1982) 97–109

  7. [7]

    R. C. Berwick, Strong generative capacity, weak generative capacity, and modern linguistic theories, Computational Linguistics 10 (1984) 189–202

  8. [8]

    Strzalkowski, Reversible Grammar in Natural Language Processing, Springer, 1993

    T. Strzalkowski, Reversible Grammar in Natural Language Processing, Springer, 1993

  9. [9]

    S. M. Shieber, A uniform architecture for parsing and generation, in: Proceedings of COLING, 1988, pp. 614–619

  10. [10]

    Karanikolas, E

    N. Karanikolas, E. Manga, N. E. Samaridi, E. Tousidou, M. Vassilakopoulos, Large lan- guage models versus natural language understanding and generation, in: Proceedings of the 27th Pan-Hellenic Conference on Progress in Computing and Informatics (PCI), 2023, pp. 412–419.doi:10.1145/3635059.3635104

  11. [11]

    Su, C.-W

    S.-Y . Su, C.-W. Huang, Y .-N. Chen, Dual supervised learning for natural language under- standing and generation, in: Proceedings of ACL, 2019, pp. 5472–5477

  12. [12]

    L. Song, Y . Zhang, X. Peng, Z. Wang, D. Gildea, AMR-to-text generation as a traveling salesman problem, in: Proceedings of EMNLP, 2016, pp. 2084–2089.doi:10.18653/ v1/D16-1224

  13. [13]

    G. E. Barton, R. C. Berwick, E. S. Ristad, Computational Complexity and Natural Lan- guage, MIT Press, 1987

  14. [14]

    C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal 27 (1948) 379–423

  15. [15]

    Eco, The influence of Roman Jakobson on the development of semiotics, in: M

    U. Eco, The influence of Roman Jakobson on the development of semiotics, in: M. Kram- pen, et al. (Eds.), Classics of Semiotics, 1987, pp. 109–127

  16. [16]

    Jakobson, Linguistics and poetics, in: T

    R. Jakobson, Linguistics and poetics, in: T. Sebeok (Ed.), Style in Language, 1960, pp. 350–377

  17. [17]

    Miller, Strong Generative Capacity, CSLI Publications, 2000

    P. Miller, Strong Generative Capacity, CSLI Publications, 2000. 29

  18. [18]

    C. W. Morris, Foundations of the theory of signs, in: International Encyclopedia of Unified Science, volume I, 1938

  19. [19]

    Lambek, The mathematics of sentence structure, American Mathematical Monthly 65 (1958) 154–170

    J. Lambek, The mathematics of sentence structure, American Mathematical Monthly 65 (1958) 154–170

  20. [20]

    F. W. Lawvere, Functorial Semantics of Algebraic Theories, Ph.D. thesis, Columbia Uni- versity, 1963

  21. [21]

    D. E. Knuth, On the translation of languages from left to right, Information and Control 8 (1965) 607–639

  22. [22]

    Earley, An efficient context-free parsing algorithm, Communications of the ACM 13 (1970) 94–102

    J. Earley, An efficient context-free parsing algorithm, Communications of the ACM 13 (1970) 94–102

  23. [23]

    Nivre, Algorithms for deterministic incremental dependency parsing, Computational Linguistics 34 (2008) 513–553

    J. Nivre, Algorithms for deterministic incremental dependency parsing, Computational Linguistics 34 (2008) 513–553

  24. [24]

    Wintner, E

    S. Wintner, E. Gabrilovich, N. Francez, Amalia — a unified platform for parsing and generation, in: NLPRS, 1997, pp. 467–472

  25. [25]

    Reiter, R

    E. Reiter, R. Dale, Building Natural Language Generation Systems, Cambridge University Press, 2000

  26. [26]

    Colmerauer, Les systèmes-Q, ou un formalisme pour analyser et synthétiser des phrases sur ordinateur, Technical Report, Université de Montréal, 1970

    A. Colmerauer, Les systèmes-Q, ou un formalisme pour analyser et synthétiser des phrases sur ordinateur, Technical Report, Université de Montréal, 1970

  27. [27]

    R. A. Kowalski, Algorithm=logic+control, Communications of the ACM 22 (1979) 424–436

  28. [28]

    Ranta, Grammatical framework: A type-theoretical grammar formalism, Journal of Functional Programming 14 (2004) 145–189

    A. Ranta, Grammatical framework: A type-theoretical grammar formalism, Journal of Functional Programming 14 (2004) 145–189

  29. [29]

    Ranta, Grammatical framework: An interlingual grammar formalism, in: FSMNLP, 2019

    A. Ranta, Grammatical framework: An interlingual grammar formalism, in: FSMNLP, 2019

  30. [30]

    Ranta, The GF resource grammar library, Linguistic Issues in Language Technology 2 (2009)

    A. Ranta, The GF resource grammar library, Linguistic Issues in Language Technology 2 (2009)

  31. [31]

    D. E. Appelt, Bidirectional grammars and the design of natural language generation sys- tems, in: TINLAP-3, 1987, pp. 185–191

  32. [32]

    Strzalkowski, A general computational method for grammar inversion, in: T

    T. Strzalkowski, A general computational method for grammar inversion, in: T. Strza- lkowski (Ed.), Reversible Grammar in Natural Language Processing, Springer, 1994, pp. 175–199.doi:10.1007/978-1-4615-2722-0_8

  33. [33]

    M. W. Goodman, F. Bond, Using generation for grammar analysis and error detection, in: Proceedings of ACL-IJCNLP, 2009, pp. 109–112

  34. [34]

    Satta, Tree-adjoining grammar parsing and Boolean matrix multiplication, Computa- tional Linguistics 20 (1994) 173–191

    G. Satta, Tree-adjoining grammar parsing and Boolean matrix multiplication, Computa- tional Linguistics 20 (1994) 173–191. 30

  35. [35]

    Adolfi, T

    F. Adolfi, T. Wareham, I. van Rooij, A computational complexity perspective on segmen- tation as a cognitive subcomputation, Topics in Cognitive Science 15 (2023) 723–741. doi:10.1111/tops.12629

  36. [36]

    K. W. Church, R. Patil, Coping with syntactic ambiguity or how to put the block in the box on the table, American Journal of Computational Linguistics 8 (1982) 139–149

  37. [37]

    Billot, B

    S. Billot, B. Lang, The structure of shared forests in ambiguous parsing, in: Proceedings of ACL, 1989, pp. 143–151

  38. [38]

    R. J. Parikh, On context-free languages, Journal of the ACM 13 (1966) 570–581

  39. [39]

    Brabrand, R

    C. Brabrand, R. Giegerich, A. Møller, Analyzing ambiguity of context-free grammars, Science of Computer Programming 75 (2007) 176–191

  40. [40]

    Eco, Opera Aperta, Bompiani, 1962

    U. Eco, Opera Aperta, Bompiani, 1962

  41. [41]

    de la Higuera, Grammatical Inference: Learning Automata and Grammars, Cambridge University Press, 2010

    C. de la Higuera, Grammatical Inference: Learning Automata and Grammars, Cambridge University Press, 2010

  42. [42]

    Lindenmayer, Mathematical models for cellular interactions in development, Journal of Theoretical Biology 18 (1968) 280–315

    A. Lindenmayer, Mathematical models for cellular interactions in development, Journal of Theoretical Biology 18 (1968) 280–315

  43. [43]

    Stiny, J

    G. Stiny, J. Gips, Shape grammars and the generative specification of painting and sculp- ture, in: IFIP Congress, 1972, pp. 1460–1465

  44. [44]

    Kay, Chart generation, in: Proceedings of ACL, 1996, pp

    M. Kay, Chart generation, in: Proceedings of ACL, 1996, pp. 200–204

  45. [45]

    D. J. Rosenkrantz, P. M. Lewis, Deterministic left corner parsing, in: Proceedings of FOCS, 1970, pp. 139–152

  46. [46]

    S. M. Shieber, G. van Noord, F. C. N. Pereira, R. C. Moore, Semantic-head-driven genera- tion, Computational Linguistics 16 (1990) 30–42

  47. [47]

    D. E. Knuth, Semantics of context-free languages, Mathematical Systems Theory 2 (1968) 127–145

  48. [48]

    B. J. Frey, G. E. Hinton, Free energy coding, in: Proceedings of Data Compression Con- ference (DCC ’96), 1996, pp. 73–81.doi:10.1109/DCC.1996.488312

  49. [49]

    Angluin, Learning regular sets from queries and counterexamples, Information and Computation 75 (1987) 87–106

    D. Angluin, Learning regular sets from queries and counterexamples, Information and Computation 75 (1987) 87–106

  50. [50]

    J. J. Horning, A Study of Grammatical Inference, Ph.D. thesis, Stanford University, 1969

  51. [51]

    L. Pitt, M. K. Warmuth, The minimum consistent DFA problem cannot be approximated within any polynomial, Journal of the ACM 40 (1993) 95–142

  52. [52]

    Kearns, L

    M. Kearns, L. G. Valiant, Cryptographic limitations on learning Boolean formulae and finite automata, Journal of the ACM 41 (1994) 67–95

  53. [53]

    Stolcke, An efficient probabilistic context-free parsing algorithm that computes prefix probabilities, Computational Linguistics 21 (1995) 165–201

    A. Stolcke, An efficient probabilistic context-free parsing algorithm that computes prefix probabilities, Computational Linguistics 21 (1995) 165–201. 31

  54. [54]

    Dubnov, K

    S. Dubnov, K. Greer, Deep & shallow: Connections between music information processing and the humanities, Music & Science 7 (2024)

  55. [55]

    J. G. Roederer, The Physics and Psychophysics of Music, 4th ed., Springer, 2008

  56. [56]

    D. E. Appelt, Planning English referring expressions, Artificial Intelligence 26 (1985) 1–33

  57. [57]

    Su, C.-W

    S.-Y . Su, C.-W. Huang, Y .-N. Chen, Towards unsupervised language understanding and generation by joint dual learning, in: Proceedings of ACL, 2020, pp. 671–680

  58. [58]

    Birman, J

    A. Birman, J. D. Ullman, Parsing algorithms with backtrack, Information and Control 23 (1973) 1–34

  59. [59]

    Rissanen, Modeling by shortest data description, Automatica 14 (1978) 465–471

    J. Rissanen, Modeling by shortest data description, Automatica 14 (1978) 465–471

  60. [60]

    Grünwald, The Minimum Description Length Principle, MIT Press, 2007

    P. Grünwald, The Minimum Description Length Principle, MIT Press, 2007

  61. [61]

    Kanani, S

    M. Kanani, S. O’Leary, J. McDermott, Graph-based mutations for music generation, in: Proceedings of the Companion Conference on Genetic and Evolutionary Computation (GECCO), 2023.doi:10.1145/3583133.3596318

  62. [62]

    Tsushima, E

    H. Tsushima, E. Nakamura, K. Itoyama, K. Yoshii, Generative statistical models with self-emergent grammar of chord sequences, Journal of New Music Research 47 (2017) 361–377

  63. [63]

    Bonnet, P

    É. Bonnet, P. Rz˛ a˙ zewski, F. Sikora, Designing RNA secondary structures is hard, Journal of Computational Biology 27 (2020) 302–316. 32