Formally Verified Linear-Time Invertible Lexing

Samuel Chassot; Viktor Kun\v{c}ak

arxiv: 2510.18479 · v3 · pith:VZTJKT26new · submitted 2025-10-21 · 💻 cs.PL · cs.FL

Formally Verified Linear-Time Invertible Lexing

Samuel Chassot , Viktor Kun\v{c}ak This is my paper

Pith reviewed 2026-05-22 13:19 UTC · model grok-4.3

classification 💻 cs.PL cs.FL

keywords verified lexerinvertible lexinglinear timelongest matchformal verificationtoken sequenceszippersmemoization

0 comments

The pith

ZipLex verifies that lexing and printing are exact inverses while running in linear time under longest-match rules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ZipLex as a verified framework for lexical analysis that ensures any tokenized string can be printed back to recover the exact original input. It achieves linear-time performance through memoized operations on a new representation of token sequences that preserves their separability. This combination matters for building reliable language tools where round-tripping between source text and tokens must be lossless and efficient. Earlier approaches either skipped invertibility guarantees or accepted quadratic costs in some inputs. If the claims hold, verified lexers can support practical applications like JSON handling and full language front-ends without hidden performance traps.

Core claim

ZipLex establishes that lexing a string to a token sequence and then printing that sequence always yields the original string, that the lexing step itself completes in time linear in the input length, and that the process follows longest-match semantics. These properties rest on an abstraction for token sequences that enables both efficient updates and proofs of invertibility, combined with verified zippers for navigation and a standalone imperative hash table for memoization.

What carries the argument

The new abstraction of token sequences that captures separability of tokens while supporting efficient manipulation and invertibility proofs.

If this is right

Token sequences can be manipulated and then printed to recover source text exactly in verified compilers or pretty-printers.
JSON processors and language lexers built on this framework avoid quadratic slowdowns on inputs that defeat flex-style methods.
Linear-time behavior holds even when memoization is required to handle overlapping token choices.
The same verified structures can be reused for other string-to-structure transformations that require round-tripping.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation of token sequences from concrete string positions could simplify proofs in other bidirectional language processing tools.
Combining zippers with verified hash tables might transfer to linear-time verified parsing or pretty-printing beyond lexing.
Extending the abstraction to handle context-sensitive tokens would test whether invertibility survives more complex grammar rules.

Load-bearing premise

The token sequence abstraction correctly models separability and invertibility without introducing unverified edge cases during memoization or zipper operations.

What would settle it

A concrete input string for which lexing followed by printing yields a string different from the original, or for which measured lexing time grows quadratically rather than linearly with input size.

Figures

Figures reproduced from arXiv: 2510.18479 by Samuel Chassot, Viktor Kun\v{c}ak.

**Figure 2.** Figure 2: Token semantic value for JSON integer literal along with the injective transformation function. [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: ADT definitions for tokens and rules of the lexer framework. Signature of the [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 5.** Figure 5: Correctness theorem (in Stainless) for the [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

**Figure 6.** Figure 6: Definition of the new Injection construct. In the token transformation case, f is transform and witness is charsOf. For readability, in this section we use the syntactic sugar [𝑎-𝑧] and [0-9] to denote union of all characters in the given interval, and 𝑟 + for 𝑟 · 𝑟 ∗ . While not primitive in our DSL, they can be easily expressed as Scala functions that build the underlying expanded regular expressions. 3.… view at source ↗

**Figure 7.** Figure 7: PrintableTokens token sequence abstraction that maintains sep property on successive elements of the sequence. Also shown are operations append and slice that maintain the invariant. Formally, an R-path condition 𝑝𝑅 for a given relation 𝑅 is defined as follows for a sequence 𝑠: 𝑝𝑅 (𝑠) = ∀𝑖, 0 ≤ 𝑖 < |𝑠| − 1 =⇒ 𝑅(𝑠𝑖 , 𝑠𝑖+1) We call R-Path a sequence that satisfies an R-Path condition. One commonly known R-Pa… view at source ↗

**Figure 8.** Figure 8: Formal definition of Brzozowski’s derivatives where [PITH_FULL_IMAGE:figures/full_fig_p013_8.png] view at source ↗

**Figure 9.** Figure 9: ADT definitions for zipper-based representation of regular expressions and signatures of [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗

**Figure 10.** Figure 10: Brzozowski derivatives on zippers, and matching function for zippers. [PITH_FULL_IMAGE:figures/full_fig_p015_10.png] view at source ↗

**Figure 11.** Figure 11: Matching of single-line comments of size [PITH_FULL_IMAGE:figures/full_fig_p019_11.png] view at source ↗

**Figure 12.** Figure 12: JSON lexer performance with or without memoization (left). Comparison of R-Path [PITH_FULL_IMAGE:figures/full_fig_p020_12.png] view at source ↗

**Figure 13.** Figure 13: Comparison between lexing and computing R-Path [PITH_FULL_IMAGE:figures/full_fig_p021_13.png] view at source ↗

**Figure 14.** Figure 14: JSON lexers performance comparison. x-axis shows the length of input files in number of characters; y-axis shows the time in [PITH_FULL_IMAGE:figures/full_fig_p022_14.png] view at source ↗

read the original abstract

We present ZipLex, a verified framework for invertible linear-time lexical analysis following the longest match (maximal munch) semantics. Unlike past verified lexers that focus only on satisfying the semantics of regular expressions and the longest match property, ZipLex also guarantees that lexing and printing are mutual inverses. Thanks to verified memoization, it also ensures that the lexical analysis of a string is linear in the size of the string. Our design and implementation rely on two sets of ideas: (1) a new abstraction of token sequences that captures the separability of tokens in a sequence while supporting their efficient manipulation, and (2) a combination of verified data structures and optimizations, including Huet's zippers and memoization with a standalone verified imperative hash table. Our hash table offers competitive performance as shown by our evaluation. We implemented and verified ZipLex using the Stainless deductive verifier for Scala. Our evaluation demonstrates that ZipLex supports realistic applications such as JSON processing and lexers of programming languages, and behaves linearly even in cases that make flex-style approaches quadratic. ZipLex is two orders of magnitude faster than Verbatim++, showing that verified invertibility and linear-time algorithms can be developed without prohibitive cost. Compared to Coqlex, ZipLex also offers linear (instead of quadratic) time lexing, and is the first lexer that comes with invertibility proofs for printing token sequences.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ZipLex combines verified mutual invertibility with linear-time lexing via a token-sequence abstraction and a custom verified hash table, which prior work like Coqlex did not achieve.

read the letter

The key advance is a verified lexer that proves lexing and printing are mutual inverses while guaranteeing linear time under longest-match rules. This comes from a new token-sequence abstraction that captures separability, combined with Huet's zippers and a standalone imperative hash table for memoization, all checked in Stainless for Scala. The evaluation shows it stays linear on JSON and real language lexers where flex-style methods turn quadratic, and it runs two orders of magnitude faster than Verbatim++ while adding the invertibility property that Coqlex lacked. That combination of formal guarantees and practical speed is what makes the result stand out. The deductive verification approach gives direct support for the claims rather than relying on post-hoc checks, and the shipped implementation lets others inspect the details. The hash table's competitive performance is a nice concrete win for verified data structures. One minor soft spot is that full proof scripts and any remaining edge cases in the zipper or memoization interactions would benefit from more space in the paper, though the abstract and stress-test indicate the central arguments hold without circularity or fitting. The work targets readers building verified language processors or compiler components who need both correctness and efficiency. Anyone working on formal methods for parsing or invertible transformations will find usable building blocks here. It deserves a serious referee because the novelty in the verified combination is clear, the evidence from verification plus benchmarks is proportionate, and the claims are falsifiable through the provided code.

Referee Report

0 major / 3 minor

Summary. The paper presents ZipLex, a formally verified framework for linear-time invertible lexical analysis under longest-match (maximal munch) semantics. It guarantees that lexing and printing are mutual inverses, achieves linear time via verified memoization using a standalone imperative hash table, and relies on Huet's zippers combined with a new token-sequence abstraction for separability and efficient manipulation. The implementation is verified in Stainless for Scala, with evaluation demonstrating practicality on JSON processing and programming-language lexers, linear behavior where flex-style approaches become quadratic, and performance two orders of magnitude faster than Verbatim++ while offering linear time unlike Coqlex.

Significance. If the central verification results hold, the work is significant for demonstrating that deductive verification can deliver practical, high-performance lexers with strong guarantees (invertibility plus linearity) that prior tools lack. The machine-checked proofs, reproducible code, and parameter-free derivations from the stated semantics are explicit strengths; the evaluation on realistic inputs further supports that verified invertibility need not incur prohibitive cost.

minor comments (3)

[Section 3] The description of the token-sequence abstraction would benefit from an explicit small example showing how separability is preserved under zipper operations (e.g., around the definition of the new data structure).
[Section 5] Table 1 (performance comparison) reports speedups but does not include raw timing data or input sizes for the quadratic cases; adding these would strengthen the linear-time claim.
[Section 4] A few Scala-specific notations in the code listings (e.g., implicit parameters for the hash table) could be clarified for readers less familiar with Stainless.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review, detailed summary of our contributions, and recommendation to accept the paper. We appreciate the recognition that ZipLex provides the first verified lexer combining linear-time performance with provable invertibility between lexing and printing.

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

full rationale

The paper establishes its claims of invertibility and linear-time lexing through direct formal verification in Stainless against explicitly stated semantics, using Huet's zippers, a token-sequence abstraction for separability, and a standalone verified imperative hash table for memoization. No equations, parameters, or derivations reduce the central properties to fitted inputs or self-citations by construction; the proofs are independent of the target results and the evaluation on JSON and language lexers supplies external confirmation. The derivation is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard domain assumptions about longest-match lexing and the soundness of Stainless, plus the correctness of the newly introduced token-sequence abstraction; no free parameters or invented entities are introduced.

axioms (2)

domain assumption Longest-match (maximal munch) semantics for tokenization
Invoked as the target semantics that both lexing and the invertibility property must satisfy.
standard math Soundness of the Stainless deductive verifier for the Scala subset used
The entire verification effort depends on Stainless correctly discharging the proof obligations.

pith-pipeline@v0.9.0 · 5773 in / 1310 out tokens · 61733 ms · 2026-05-22T13:19:46.909175+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We introduce a new predicate family, R-path predicates, to describe predicates where a relation applies to pairs of successive elements... sep(t1, t2) := ¬(∪ ri).prefixMatch(t1 ::: t2(0))
IndisputableMonolith/Foundation/RealityFromDistinction reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

ZipLex guarantees that lexing and printing are mutual inverses and that lexical analysis runs in linear time... using Huet’s zippers and memoization

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages

[1]

Jonathan Immanuel Brachthäuser, Tillmann Rendel, and Klaus Ostermann. 2016. Parsing with first-class derivatives. InProceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications(Amsterdam, Netherlands)(OOPSLA 2016). Association for Computing Machinery, New York, NY, USA, 588–606. doi:10.1...

work page doi:10.1145/2983990.2984026 2016
[2]

Janusz A Brzozowski. 1964. Derivatives of regular expressions.Journal of the ACM (JACM)11, 4 (1964), 481–494

work page 1964
[3]

Mario Bucev, Samuel Chassot, Simon Felix, Filip Schramka, and Viktor Kunčak. 2025. Formally Verifiable Generated ASN.1/ACN Encoders and Decoders: A Case Study. InVerification, Model Checking, and Abstract Interpretation, Krishna Shankaranarayanan, Sriram Sankaranarayanan, and Ashutosh Trivedi (Eds.). Springer Nature Switzerland, Cham, 185–207

work page 2025
[4]

1978.Formalization and automatic derivation of code generators.Technical Report

Roderic Geoffrey Galton Cattell. 1978.Formalization and automatic derivation of code generators.Technical Report. CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE

work page 1978
[5]

Samuel Chassot and Viktor Kunčak. 2024. Verifying a Realistic Mutable Hash Table. InAutomated Reasoning, Christoph Benzmüller, Marijn J.H. Heule, and Renate A. Schmidt (Eds.). Springer Nature Switzerland, Cham, 304–314. doi:10.1007/978-3-031-63498-7_18

work page doi:10.1007/978-3-031-63498-7_18 2024
[6]

Li, and Konstantinos Mamouras

Agnishom Chattopadhyay, Angela W. Li, and Konstantinos Mamouras. 2025. Verified and Efficient Matching of Regular Expressions with Lookaround. InProceedings of the 14th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP ’25). Association for Computing Machinery, New York, NY, USA, 198–213. doi:10.1145/3703595.3705884

work page doi:10.1145/3703595.3705884 2025
[7]

Pierce Darragh and Michael D. Adams. 2020. Parsing with zippers (functional pearl).Proc. ACM Program. Lang.4, ICFP (2020), 108:1–108:28. doi:10.1145/3408990

work page doi:10.1145/3408990 2020
[8]

2021.Efficient Parsing with Derivatives and Zippers

Romain Edelmann. 2021.Efficient Parsing with Derivatives and Zippers. Ph. D. Dissertation. EPFL. doi:10.5075/epfl-thesis-7357

work page doi:10.5075/epfl-thesis-7357 2021
[9]

Romain Edelmann, Jad Hamza, and Viktor Kunčak. 2020. Zippy LL(1) parsing with derivatives. InProceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation(London, UK)(PLDI 2020). Association for Computing Machinery, New York, NY, USA, 1036–1051. doi:10.1145/3385412.3385992

work page doi:10.1145/3385412.3385992 2020
[10]

Derek Egolf, Sam Lasser, and Kathleen Fisher. 2021. Verbatim: A Verified Lexer Generator. In2021 IEEE Security and Privacy Workshops (SPW). IEEE, San Francisco, CA, USA, 92–100. doi:10.1109/SPW53761.2021.00022

work page doi:10.1109/spw53761.2021.00022 2021
[11]

Derek Egolf, Sam Lasser, and Kathleen Fisher. 2022. Verbatim++: verified, optimized, and semantically rich lexing with derivatives. InProceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs. ACM, Philadelphia PA USA, 27–39. doi:10.1145/3497775.3503694

work page doi:10.1145/3497775.3503694 2022
[12]

Simon Guilloud and Clément Pit-Claudel. 2025. Verified and Optimized Implementation of Orthologic Proof Search. InComputer Aided Verification - 37th International Conference, CA V 2025, July 23-25, 2025, Proceedings, Part III (Lecture Notes in Computer Science, Vol. 15933), Ruzica Piskac and Zvonimir Rakamaric (Eds.). Springer, Zagreb, Croatia, 130–152. d...

work page doi:10.1007/978-3-031-98682-6_8 2025
[13]

Gérard Huet. 1997. The zipper.Journal of functional programming7, 5 (1997), 549–554

work page 1997
[14]

In: Clarke, E.M., Voronkov, A

K. Rustan M. Leino. 2010. Dafny: An Automatic Program Verifier for Functional Correctness. InLogic for Programming, Artificial Intelligence, and Reasoning, Edmund M. Clarke and Andrei Voronkov (Eds.). Springer, Berlin, Heidelberg, 348–370. doi:10.1007/978-3-642-17511-4_20

work page doi:10.1007/978-3-642-17511-4_20 2010
[15]

2024.The CompCert C verified compiler: Documentation and user’s manual

Xavier Leroy. 2024.The CompCert C verified compiler: Documentation and user’s manual. report. Inria. https://inria.hal.science/hal-01091802 Pages: 1

work page 2024
[16]

Xavier Leroy, Sandrine Blazy, Daniel Kästner, Bernhard Schommer, Markus Pister, and Christian Ferdinand. 2016. CompCert - A Formally Verified Optimizing Compiler. InERTS 2016: Embedded Real Time Software and Systems, 8th European Congress. SEE, HAL, Toulouse, France, 0, 0. https://inria.hal.science/hal-01238879

work page 2016
[17]

Ravichandhran Madhavan, Sumith Kulal, and Viktor Kuncak. 2017. Contract-based resource verification for higher-order functions with memoization. SIGPLAN Not.52, 1 (Jan. 2017), 330–343. doi:10.1145/3093333.3009874

work page doi:10.1145/3093333.3009874 2017
[18]

Matthew Might, David Darais, and Daniel Spiewak. 2011. Parsing with derivatives: a functional pearl. InProceeding of the 16th ACM SIGPLAN international conference on Functional Programming, ICFP 2011, September 19-21, 2011, Manuel M. T. Chakravarty, Zhenjiang Hu, and Olivier Danvy (Eds.). ACM, Tokyo, Japan, 189–195. doi:10.1145/2034773.2034801

work page doi:10.1145/2034773.2034801 2011
[19]

Tobias Nipkow. 1998. Verified lexical analysis. InTheorem Proving in Higher Order Logics, Jim Grundy and Malcolm Newey (Eds.). Springer, Berlin, Heidelberg, 1–15. doi:10.1007/BFb0055126

work page doi:10.1007/bfb0055126 1998
[20]

Wendlasida Ouedraogo, Gabriel Scherer, and Lutz Strassburger. 2023. Coqlex Artifact Gitlab Repository. https://gitlab.inria.fr/wouedrao/coqlex/- /tree/master/Comparison/JSON?ref_type=heads

work page 2023
[21]

Wendlasida Ouedraogo, Gabriel Scherer, and Lutz Strassburger. 2023. Coqlex: Generating Formally Verified Lexers.The Art, Science, and Engineering of Programming8, 1 (2023), 3–1

work page 2023
[22]

Scott Owens, John Reppy, and Aaron Turon. 2009. Regular-expression derivatives re-examined.Journal of Functional Programming19, 2 (2009), 173–190

work page 2009
[23]

2011.Grammatical Framework: Programming with Multilingual Grammars

Aarne Ranta. 2011.Grammatical Framework: Programming with Multilingual Grammars. CSLI Publications, Stanford. ISBN-10: 1-57586-626-9 (Paper), 1-57586-627-7 (Cloth)

work page 2011
[24]

Tillmann Rendel and Klaus Ostermann. 2010. Invertible syntax descriptions: unifying parsing and pretty printing.SIGPLAN Not.45, 11 (Sept. 2010), 1–12. doi:10.1145/2088456.1863525

work page doi:10.1145/2088456.1863525 2010
[25]

Maximal-munch

Thomas W. Reps. 1998. "Maximal-munch" Tokenization in Linear Time.ACM Trans. Program. Lang. Syst.20, 2 (1998), 259–273. doi:10.1145/276393. 276394 Manuscript submitted to ACM ZipLex: Verified Invertible Lexing with Memoized Derivatives and Zippers 25

work page doi:10.1145/276393 1998
[26]

Noé De Santo, Aurèle Barrière, and Clément Pit-Claudel. 2024. A Coq Mechanization of JavaScript Regular Expression Semantics.Proc. ACM Program. Lang.8, ICFP (2024), 1003–1031. doi:10.1145/3674666

work page doi:10.1145/3674666 2024
[27]

Ian Erik Varatalu, Margus Veanes, and Juhan Ernits. 2025. RE#: High Performance Derivative-Based Regex Matching with Intersection, Complement, and Restricted Lookarounds.Proceedings of the ACM on Programming Languages9, POPL (2025), 1–32. Manuscript submitted to ACM

work page 2025

[1] [1]

Jonathan Immanuel Brachthäuser, Tillmann Rendel, and Klaus Ostermann. 2016. Parsing with first-class derivatives. InProceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications(Amsterdam, Netherlands)(OOPSLA 2016). Association for Computing Machinery, New York, NY, USA, 588–606. doi:10.1...

work page doi:10.1145/2983990.2984026 2016

[2] [2]

Janusz A Brzozowski. 1964. Derivatives of regular expressions.Journal of the ACM (JACM)11, 4 (1964), 481–494

work page 1964

[3] [3]

Mario Bucev, Samuel Chassot, Simon Felix, Filip Schramka, and Viktor Kunčak. 2025. Formally Verifiable Generated ASN.1/ACN Encoders and Decoders: A Case Study. InVerification, Model Checking, and Abstract Interpretation, Krishna Shankaranarayanan, Sriram Sankaranarayanan, and Ashutosh Trivedi (Eds.). Springer Nature Switzerland, Cham, 185–207

work page 2025

[4] [4]

1978.Formalization and automatic derivation of code generators.Technical Report

Roderic Geoffrey Galton Cattell. 1978.Formalization and automatic derivation of code generators.Technical Report. CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE

work page 1978

[5] [5]

Samuel Chassot and Viktor Kunčak. 2024. Verifying a Realistic Mutable Hash Table. InAutomated Reasoning, Christoph Benzmüller, Marijn J.H. Heule, and Renate A. Schmidt (Eds.). Springer Nature Switzerland, Cham, 304–314. doi:10.1007/978-3-031-63498-7_18

work page doi:10.1007/978-3-031-63498-7_18 2024

[6] [6]

Li, and Konstantinos Mamouras

Agnishom Chattopadhyay, Angela W. Li, and Konstantinos Mamouras. 2025. Verified and Efficient Matching of Regular Expressions with Lookaround. InProceedings of the 14th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP ’25). Association for Computing Machinery, New York, NY, USA, 198–213. doi:10.1145/3703595.3705884

work page doi:10.1145/3703595.3705884 2025

[7] [7]

Pierce Darragh and Michael D. Adams. 2020. Parsing with zippers (functional pearl).Proc. ACM Program. Lang.4, ICFP (2020), 108:1–108:28. doi:10.1145/3408990

work page doi:10.1145/3408990 2020

[8] [8]

2021.Efficient Parsing with Derivatives and Zippers

Romain Edelmann. 2021.Efficient Parsing with Derivatives and Zippers. Ph. D. Dissertation. EPFL. doi:10.5075/epfl-thesis-7357

work page doi:10.5075/epfl-thesis-7357 2021

[9] [9]

Romain Edelmann, Jad Hamza, and Viktor Kunčak. 2020. Zippy LL(1) parsing with derivatives. InProceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation(London, UK)(PLDI 2020). Association for Computing Machinery, New York, NY, USA, 1036–1051. doi:10.1145/3385412.3385992

work page doi:10.1145/3385412.3385992 2020

[10] [10]

Derek Egolf, Sam Lasser, and Kathleen Fisher. 2021. Verbatim: A Verified Lexer Generator. In2021 IEEE Security and Privacy Workshops (SPW). IEEE, San Francisco, CA, USA, 92–100. doi:10.1109/SPW53761.2021.00022

work page doi:10.1109/spw53761.2021.00022 2021

[11] [11]

Derek Egolf, Sam Lasser, and Kathleen Fisher. 2022. Verbatim++: verified, optimized, and semantically rich lexing with derivatives. InProceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs. ACM, Philadelphia PA USA, 27–39. doi:10.1145/3497775.3503694

work page doi:10.1145/3497775.3503694 2022

[12] [12]

Simon Guilloud and Clément Pit-Claudel. 2025. Verified and Optimized Implementation of Orthologic Proof Search. InComputer Aided Verification - 37th International Conference, CA V 2025, July 23-25, 2025, Proceedings, Part III (Lecture Notes in Computer Science, Vol. 15933), Ruzica Piskac and Zvonimir Rakamaric (Eds.). Springer, Zagreb, Croatia, 130–152. d...

work page doi:10.1007/978-3-031-98682-6_8 2025

[13] [13]

Gérard Huet. 1997. The zipper.Journal of functional programming7, 5 (1997), 549–554

work page 1997

[14] [14]

In: Clarke, E.M., Voronkov, A

K. Rustan M. Leino. 2010. Dafny: An Automatic Program Verifier for Functional Correctness. InLogic for Programming, Artificial Intelligence, and Reasoning, Edmund M. Clarke and Andrei Voronkov (Eds.). Springer, Berlin, Heidelberg, 348–370. doi:10.1007/978-3-642-17511-4_20

work page doi:10.1007/978-3-642-17511-4_20 2010

[15] [15]

2024.The CompCert C verified compiler: Documentation and user’s manual

Xavier Leroy. 2024.The CompCert C verified compiler: Documentation and user’s manual. report. Inria. https://inria.hal.science/hal-01091802 Pages: 1

work page 2024

[16] [16]

Xavier Leroy, Sandrine Blazy, Daniel Kästner, Bernhard Schommer, Markus Pister, and Christian Ferdinand. 2016. CompCert - A Formally Verified Optimizing Compiler. InERTS 2016: Embedded Real Time Software and Systems, 8th European Congress. SEE, HAL, Toulouse, France, 0, 0. https://inria.hal.science/hal-01238879

work page 2016

[17] [17]

Ravichandhran Madhavan, Sumith Kulal, and Viktor Kuncak. 2017. Contract-based resource verification for higher-order functions with memoization. SIGPLAN Not.52, 1 (Jan. 2017), 330–343. doi:10.1145/3093333.3009874

work page doi:10.1145/3093333.3009874 2017

[18] [18]

Matthew Might, David Darais, and Daniel Spiewak. 2011. Parsing with derivatives: a functional pearl. InProceeding of the 16th ACM SIGPLAN international conference on Functional Programming, ICFP 2011, September 19-21, 2011, Manuel M. T. Chakravarty, Zhenjiang Hu, and Olivier Danvy (Eds.). ACM, Tokyo, Japan, 189–195. doi:10.1145/2034773.2034801

work page doi:10.1145/2034773.2034801 2011

[19] [19]

Tobias Nipkow. 1998. Verified lexical analysis. InTheorem Proving in Higher Order Logics, Jim Grundy and Malcolm Newey (Eds.). Springer, Berlin, Heidelberg, 1–15. doi:10.1007/BFb0055126

work page doi:10.1007/bfb0055126 1998

[20] [20]

Wendlasida Ouedraogo, Gabriel Scherer, and Lutz Strassburger. 2023. Coqlex Artifact Gitlab Repository. https://gitlab.inria.fr/wouedrao/coqlex/- /tree/master/Comparison/JSON?ref_type=heads

work page 2023

[21] [21]

Wendlasida Ouedraogo, Gabriel Scherer, and Lutz Strassburger. 2023. Coqlex: Generating Formally Verified Lexers.The Art, Science, and Engineering of Programming8, 1 (2023), 3–1

work page 2023

[22] [22]

Scott Owens, John Reppy, and Aaron Turon. 2009. Regular-expression derivatives re-examined.Journal of Functional Programming19, 2 (2009), 173–190

work page 2009

[23] [23]

2011.Grammatical Framework: Programming with Multilingual Grammars

Aarne Ranta. 2011.Grammatical Framework: Programming with Multilingual Grammars. CSLI Publications, Stanford. ISBN-10: 1-57586-626-9 (Paper), 1-57586-627-7 (Cloth)

work page 2011

[24] [24]

Tillmann Rendel and Klaus Ostermann. 2010. Invertible syntax descriptions: unifying parsing and pretty printing.SIGPLAN Not.45, 11 (Sept. 2010), 1–12. doi:10.1145/2088456.1863525

work page doi:10.1145/2088456.1863525 2010

[25] [25]

Maximal-munch

Thomas W. Reps. 1998. "Maximal-munch" Tokenization in Linear Time.ACM Trans. Program. Lang. Syst.20, 2 (1998), 259–273. doi:10.1145/276393. 276394 Manuscript submitted to ACM ZipLex: Verified Invertible Lexing with Memoized Derivatives and Zippers 25

work page doi:10.1145/276393 1998

[26] [26]

Noé De Santo, Aurèle Barrière, and Clément Pit-Claudel. 2024. A Coq Mechanization of JavaScript Regular Expression Semantics.Proc. ACM Program. Lang.8, ICFP (2024), 1003–1031. doi:10.1145/3674666

work page doi:10.1145/3674666 2024

[27] [27]

Ian Erik Varatalu, Margus Veanes, and Juhan Ernits. 2025. RE#: High Performance Derivative-Based Regex Matching with Intersection, Complement, and Restricted Lookarounds.Proceedings of the ACM on Programming Languages9, POPL (2025), 1–32. Manuscript submitted to ACM

work page 2025