DateSAT: A Framework for Solving Date and Period Constraints

Leyi Cui; Rohan Padhye; Shrey Tiwari

arxiv: 2605.25180 · v1 · pith:RRGBY7GDnew · submitted 2026-05-24 · 💻 cs.LO · cs.PL

DateSAT: A Framework for Solving Date and Period Constraints

Leyi Cui , Shrey Tiwari , Rohan Padhye This is my paper

Pith reviewed 2026-06-29 23:38 UTC · model grok-4.3

classification 💻 cs.LO cs.PL

keywords date constraintscalendar periodssatisfiabilitySMT reductionprogram analysisconstraint solvingdate arithmetic

0 comments

The pith

Date and period constraints reduce to integer SMT problems for automated solving.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a language and semantics for constraints on dates and calendar periods such as months and years. It then supplies five reduction strategies that convert these constraints into satisfiability problems over integers. A reader would care because date logic appears often in software and documents yet remains error-prone and unsupported by standard analysis tools. If the reductions hold, existing solvers can decide satisfiability of date constraints without manual encoding of calendar rules.

Core claim

The paper claims that satisfiability constraints involving dates and periods can be decided by formalizing their semantics and applying five reduction strategies that convert them into integer-based SMT formulas, which can then be solved using backends like Z3. This is demonstrated through a curated dataset of 450 constraints and performance evaluation.

What carries the argument

Five reduction strategies that map date and period operations to integer constraints while maintaining equivalence in satisfiability.

If this is right

Program analysis tools gain the ability to reason symbolically about operations on dates and periods.
Verification of software containing calendar logic becomes possible without ad-hoc encodings.
Automated checking of date-based rules in legal or contractual documents becomes feasible.
Empirical tests on 450 synthesized and mined constraints show the reductions can be solved by current SMT backends.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar reduction techniques could apply to related domains such as recurring schedules or fiscal periods.
The approach might combine with domain-specific date solvers to improve performance on large instances.
Integration into broader verification frameworks would allow date constraints to appear alongside other program properties.

Load-bearing premise

The five reduction strategies correctly preserve satisfiability and unsatisfiability with respect to the formalized semantics of date and period arithmetic.

What would settle it

A concrete date constraint where direct evaluation against the date semantics yields satisfiability different from the result after applying any of the five reductions to an SMT solver.

Figures

Figures reproduced from arXiv: 2605.25180 by Leyi Cui, Rohan Padhye, Shrey Tiwari.

**Figure 2.** Figure 2: Excerpts of legal statutes from the U.S. Internal Revenue Code (IRC) [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

**Figure 3.** Figure 3: Semantics for date and period arithmetic in [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 1.** Figure 1: If satisfiable, then the two Python functions are not equivalent. 1 first_buy : Date 2 acq_date : Date 3 elec_ddl : Date 4 elec_date : Date 5 6 acq_date >= first_buy 7 acq_date < first_buy + 8 Period(0, 12, 0) 9 elec_ddl.day == 15 10 elec_ddl.year == (acq_date + 11 Period(0, 8, 0)).year 12 elec_ddl.month == (acq_date + 13 Period(0, 8, 0)).month 14 elec_date <= elec_ddl 15 16 elec_date == first_buy 17 + Per… view at source ↗

**Figure 5.** Figure 5: Execution time speedups for various encoding strategies (ref. Sections 4.2– [PITH_FULL_IMAGE:figures/full_fig_p019_5.png] view at source ↗

**Figure 6.** Figure 6: DateSAT constraint language grammar. The base language is an infix syntax for expressing quantifier-free integer linear arithmetic and boolean logic, whereas the highlighted lines add specific terms for expressing dates, periods, date comparisons, date/period arithmetic, and extracting components of dates [PITH_FULL_IMAGE:figures/full_fig_p025_6.png] view at source ↗

**Figure 7.** Figure 7: Distribution of variables and atoms per constraint across [PITH_FULL_IMAGE:figures/full_fig_p026_7.png] view at source ↗

**Figure 8.** Figure 8: Example of an LLM-generated synthetic constraint with an explicit se [PITH_FULL_IMAGE:figures/full_fig_p027_8.png] view at source ↗

read the original abstract

Dates and calendar periods (i.e., days, months, years) appear frequently in tasks involving analysis of software, data, and documents. Prior research has shown that computer logic involving dates and calendrical calculations is error-prone due to tricky rules (e.g., irregularly sized months), ambiguities (e.g., scheduling one month from "Jan 31st"), and edge cases (e.g., leap years). However, existing program analysis and verification tools do not provide native support for dates, making it hard to reason about operations involving calendrical arithmetic symbolically. This paper presents DateSAT, the first framework for expressing and solving satisfiability constraints involving dates and calendar periods. The paper first formalizes an input language and the semantics of date and period arithmetic. The paper then presents five separate strategies for solving DateSAT constraints based on reductions to SMT formulas involving integers, which we have implemented using Z3 as a backend. We curate a dataset of 450 DateSAT constraints synthesized using LLM prompting, grammar-based sampling, and mining legal documents, and then present an empirical evaluation of DateSAT solver performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

DateSAT gives a first cut at native date constraints in SMT via five integer reductions, but the soundness of those mappings rests on unproven claims.

read the letter

The core takeaway is that this paper defines a language for date and period constraints, formalizes their semantics, and reduces them to Z3 integer formulas using five strategies, then tests the result on 450 instances. That is the actual new piece: prior SMT work handled dates only through ad-hoc encodings.

The authors correctly flag that calendar arithmetic is error-prone in real software and that existing tools lack built-in support. The implementation in Z3 and the mixed dataset (LLM-generated, grammar-sampled, and mined from documents) show they tried to make the thing usable rather than purely theoretical.

The main weakness is exactly what the stress-test flags. There is no proof or exhaustive case analysis that the reductions preserve satisfiability in both directions, especially around month lengths, leap years, and ambiguous additions like "Jan 31 + 1 month." Empirical runs on the 450-instance set do not substitute for that argument; they only show the solver does not crash on the chosen examples. If the reductions are unsound on some edge case, the whole framework collapses.

This is aimed at researchers extending SMT solvers for program analysis or verification tasks that touch calendars. A reader who needs a working prototype to experiment with will get value from the code and the dataset. A reader who needs a verified foundation will not.

It is worth sending to referees. The gap it targets is real, the implementation exists, and the soundness question is fixable with more work rather than fatal. A serious review would focus on whether the reductions can be made rigorous.

Referee Report

2 major / 1 minor

Summary. The paper introduces DateSAT, the first framework for expressing and solving satisfiability constraints involving dates and calendar periods. It formalizes an input language and the semantics of date and period arithmetic, presents five reduction strategies to SMT formulas over integers implemented using Z3, and evaluates the approach on a dataset of 450 DateSAT constraints synthesized using LLM prompting, grammar-based sampling, and mining legal documents.

Significance. If the reductions are sound, DateSAT would address a practical gap in program analysis and verification tools by enabling symbolic reasoning about calendrical constraints, which are prone to errors due to irregular month lengths, leap years, and ambiguities. The implementation with Z3 and the curated dataset provide a basis for empirical assessment of performance.

major comments (2)

[Reductions section] Reductions section (following the semantics formalization): No formal proof, machine-checked verification, or exhaustive case analysis is provided showing that the five reduction strategies preserve satisfiability and unsatisfiability with respect to the formalized semantics. This is load-bearing for the central claim that DateSAT correctly solves the constraints, especially given potential failures on edge cases such as month-length variations, leap years, and ambiguous additions like "Jan 31 + 1 month".
[Empirical evaluation section] Empirical evaluation section: Performance results on the 450-instance dataset cannot substitute for a soundness argument, as they provide no evidence that every model of the SMT formula corresponds to a model of the original DateSAT constraint (and vice versa).

minor comments (1)

[Abstract] Abstract and introduction: The claim of being "the first framework" would be strengthened by explicit discussion of related work on date arithmetic in SMT solvers or constraint programming.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback highlighting the importance of soundness for the reduction strategies. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core contributions.

read point-by-point responses

Referee: [Reductions section] Reductions section (following the semantics formalization): No formal proof, machine-checked verification, or exhaustive case analysis is provided showing that the five reduction strategies preserve satisfiability and unsatisfiability with respect to the formalized semantics. This is load-bearing for the central claim that DateSAT correctly solves the constraints, especially given potential failures on edge cases such as month-length variations, leap years, and ambiguous additions like "Jan 31 + 1 month".

Authors: We agree that the current manuscript lacks an explicit formal proof of soundness for the five reduction strategies. The reductions were constructed to be semantics-preserving by design, but we recognize that informal presentation is insufficient for the central claim. We will add a dedicated subsection after the semantics formalization that provides rigorous proofs for each strategy, including explicit handling of edge cases such as leap years, variable month lengths, and ambiguous period additions. These proofs will establish both satisfiability and unsatisfiability preservation. revision: yes
Referee: [Empirical evaluation section] Empirical evaluation section: Performance results on the 450-instance dataset cannot substitute for a soundness argument, as they provide no evidence that every model of the SMT formula corresponds to a model of the original DateSAT constraint (and vice versa).

Authors: We concur that the empirical results on the 450-instance dataset serve only to evaluate performance and scalability and cannot substitute for a soundness argument. We will revise the evaluation section to explicitly state its scope and purpose, removing any phrasing that could be read as implying empirical validation of correctness. The soundness proofs added in response to the first comment will directly address the correspondence between models of the SMT formulas and the original DateSAT constraints. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper formalizes an input language and semantics for dates and periods, then presents five reduction strategies to integer SMT (implemented in Z3) along with empirical results on a 450-instance dataset. No equations, fitted parameters, self-citations, or ansatzes are quoted that would reduce any claimed result to its own inputs by construction. The reductions are presented as a new engineering contribution whose correctness is asserted with respect to the defined semantics; this does not constitute a self-definitional or fitted-input circularity under the specified patterns. The derivation chain is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated. The central claim rests on the unshown soundness of the five reductions and the representativeness of the 450-constraint dataset.

pith-pipeline@v0.9.1-grok · 5724 in / 970 out tokens · 21437 ms · 2026-06-29T23:38:44.008266+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

39 extracted references · 6 canonical work pages · 1 internal anchor

[1]

com/news/claude-sonnet-4-5(2025), retrieved Jan 26, 2026

Anthropic: Introducing Claude Sonnet 4.5.https://www.anthropic. com/news/claude-sonnet-4-5(2025), retrieved Jan 26, 2026

2025
[2]

Azure, M.: Summary of Windows Azure service disruption on Feb 29th, 2012.https://azure.microsoft.com/en-us/blog/summary- of-windows-azure-service-disruption-on-feb-29th-2012/ (2012), retrieved Apr 26, 2026

2012
[3]

Baptiste, P., Le Pape, C., Nuijten, W.: Constraint-based scheduling: ap- plying constraint programming to scheduling problems, vol. 39. Springer Science & Business Media (2001)

2001
[4]

In: Proceedings of the 8th international workshop on satisfiability modulo theories (Edinburgh, UK)

Barrett, C., Stump, A., Tinelli, C., et al.: The SMT-LIB standard: Ver- sion 2.0. In: Proceedings of the 8th international workshop on satisfiability modulo theories (Edinburgh, UK). vol. 13, p. 14 (2010)

2010
[5]

In: Handbook of model checking, pp

Barrett, C., Tinelli, C.: Satisfiability modulo theories. In: Handbook of model checking, pp. 305–343. Springer (2018)

2018
[6]

In: 2017 Formal Methods in Computer Aided Design (FM- CAD)

Berzish, M., Ganesh, V., Zheng, Y.: Z3str3: A string solver with theory- aware heuristics. In: 2017 Formal Methods in Computer Aided Design (FM- CAD). pp. 55–59. IEEE (2017)

2017
[7]

Blair-Stanek, A., Holzenberger, N., Van Durme, B.: Can GPT-3 perform statutory reasoning? arXiv preprint arXiv:2302.06100 (2023)

work page arXiv 2023
[8]

Fordham Intell

Cheian, D.: I see dead patents: How bugs in the patent system keep expired patents alive. Fordham Intell. Prop. Media & Ent. LJ33, 1 (2022)

2022
[9]

In: 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024)

Chistikov, D.: An introduction to the theory of linear integer arithmetic. In: 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024). vol. 323, p. 1. Schloss Dagstuhl—Leibniz-Zentrum für Informatik (2024)

2024
[10]

Creative Commons: CC-BY 4.0 license (2013),https:// creativecommons.org/licenses/by/4.0/

2013
[11]

In: Proceedings of theTheoryandPracticeofSoftware,14thInternationalConferenceonTools and Algorithms for the Construction and Analysis of Systems

De Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Proceedings of theTheoryandPracticeofSoftware,14thInternationalConferenceonTools and Algorithms for the Construction and Analysis of Systems. p. 337–340. TACAS’08/ETAPS’08, Springer-Verlag, Berlin, Heidelberg (2008)

2008
[12]

DeepMind, G.: Gemini 3: Our most intelligent AI model that brings any idea to life.https://deepmind.google/models/gemini/(2025), re- trieved Jan 26, 2026

2025
[13]

In: International Sympo- sium on Formal Techniques in Real-Time and Fault-Tolerant Systems

Dutertre, B., Sorea, M.: Modeling and verification of a fault-tolerant real- time startup protocol using calendar automata. In: International Sympo- sium on Formal Techniques in Real-Time and Fault-Tolerant Systems. pp. 199–214. Springer (2004)

2004
[14]

arXiv preprint arXiv:2406.09170 (2024) DateSAT: A Framework for Solving Date and Period Constraints 23

Fatemi, B., Kazemi, M., Tsitsulin, A., Malkan, K., Yim, J., Palowitch, J., Seo, S., Halcrow, J., Perozzi, B.: Test of time: A benchmark for evaluating LLMs on temporal reasoning. arXiv preprint arXiv:2406.09170 (2024) DateSAT: A Framework for Solving Date and Period Constraints 23

work page arXiv 2024
[15]

PRIMETIME : Limits of LLMs in Temporal Primitives

Gaere, E., Wangenheim, F.: DATETIME: A new benchmark to measure LLM translation and reasoning capabilities. arXiv preprint arXiv:2504.16155 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[16]

Goebel, R., Kano, Y., Kim, M.Y., Rabelo, J., Satoh, K., Yoshioka, M.: Summary of the competition on legal information, extraction/entailment (COLIEE)2023.In:ProceedingsoftheNineteenthInternationalConference on Artificial Intelligence and Law. pp. 472–480 (2023)

2023
[17]

In: European Symposium on Programming

Goutagny, P., Fromherz, A., Monat, R.: CUTECat: Concolic execution for computational law. In: European Symposium on Programming. pp. 31–61. Springer (2025)

2025
[18]

Advances in Neural Information Processing Systems36(2024)

Guha,N.,Nyarko,J.,Ho,D.,Ré,C.,Chilton,A.,Chohlas-Wood,A.,Peters, A., Waldon, B., Rockmore, D., Zambrano, D., et al.: Legalbench: A collab- oratively built benchmark for measuring legal reasoning in large language models. Advances in Neural Information Processing Systems36(2024)

2024
[19]

In: Proceedings of the eighteenth international symposium on Software testing and analysis

Kiezun, A., Ganesh, V., Guo, P.J., Hooimeijer, P., Ernst, M.D.: Hampi: a solver for string constraints. In: Proceedings of the eighteenth international symposium on Software testing and analysis. pp. 105–116 (2009)

2009
[20]

Proceedings of the ACM on Programming Languages 7(OOPSLA1), 286–315 (2023)

Lattuada, A., Hance, T., Cho, C., Brun, M., Subasinghe, I., Zhou, Y., How- ell, J., Parno, B., Hawblitzel, C.: Verus: Verifying Rust programs using linear ghost types. Proceedings of the ACM on Programming Languages 7(OOPSLA1), 286–315 (2023)

2023
[21]

In: International Conference on Computer Aided Verification

Liang, T., Reynolds, A., Tinelli, C., Barrett, C., Deters, M.: A dpll (t) the- ory solver for a theory of strings and regular expressions. In: International Conference on Computer Aided Verification. pp. 646–662. Springer (2014)

2014
[22]

Journal of Open Source Software 4(43), 1891 (11 2019).https://doi.org/10.21105/joss.01891, http://dx.doi.org/10.21105/joss.01891

MacIver, D., Hatfield-Dodds, Z., Contributors, M.: Hypothesis: A new approach to property-based testing. Journal of Open Source Software 4(43), 1891 (11 2019).https://doi.org/10.21105/joss.01891, http://dx.doi.org/10.21105/joss.01891

work page doi:10.21105/joss.01891 2019
[23]

In: European Symposium on Programming

Monat, R., Fromherz, A., Merigoux, D.: Formalizing date arithmetic and statically detecting ambiguities for the law. In: European Symposium on Programming. pp. 421–450. Springer (2024)

2024
[24]

Mora, F., Berzish, M., Kulczynski, M., Nowotka, D., Ganesh, V.: Z3str4: A multi-armedstringsolver.In:InternationalSymposiumonFormalMethods. pp. 389–406. Springer (2021)

2021
[25]

Philosophical Transactions of the Royal Society A382(2270), 20230159 (2024)

Nay, J.J., Karamardian, D., Lawsky, S.B., Tao, W., Bhat, M., Jain, R., Lee, A.T., Choi, J.H., Kasai, J.: Large language models as tax attorneys: a case study in legal capabilities emergence. Philosophical Transactions of the Royal Society A382(2270), 20230159 (2024)

2024
[26]

Code of Laws of the United States of America26(121) (2018)

Office, G.P.: Internal Revenue Code. Code of Laws of the United States of America26(121) (2018)

2018
[27]

OpenAI: Introducing GPT-5.2.https://openai.com/index/ introducing-gpt-5-2/(2025), retrieved Jan 26, 2026

2025
[28]

In: Proceedings of the 2024 ACM SIGPLAN International Sym- posium on New Ideas, New Paradigms, and Reflections on Programming and Software

Padhye, R.: Software engineering methods for AI-driven deductive legal reasoning. In: Proceedings of the 2024 ACM SIGPLAN International Sym- posium on New Ideas, New Paradigms, and Reflections on Programming and Software. pp. 85–95 (2024) 24 L. Cui et al

2024
[29]

Pierce, R.: Birthday puzzle.https://www.mathsisfun.com/ puzzles/birthday.html(2012), retrieved Jan 10, 2026

2012
[30]

Presburger, M.: Über die vollständigkeit eines gewissen systems der arith- metik ganzer zahlen, in welchem die addition als einzige operation hervor- tritt.In:ComptesRendusduICongrèsdesMathématiciensdesPaysSlaves, Warszawa. pp. 92–101. Skład Główny (1929)

1929
[31]

https://www.reuters.com/world/asia-pacific/leap-year- glitch-shuts-some-new-zealand-fuel-pumps-2024-02-29/ (Feb 2024)

Reuters: ’leap year glitch’ shuts some new zealand fuel pumps. https://www.reuters.com/world/asia-pacific/leap-year- glitch-shuts-some-new-zealand-fuel-pumps-2024-02-29/ (Feb 2024)

2024
[32]

In: 2010 IEEE Symposium on Security and Privacy

Saxena, P., Akhawe, D., Hanna, S., Mao, F., McCamant, S., Song, D.: A symbolic execution framework for javascript. In: 2010 IEEE Symposium on Security and Privacy. pp. 513–528. IEEE (2010)

2010
[33]

In: 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR)

Tiwari, S., Chen, S., Joukov, A., Vandervelde, P., Li, A., Padhye, R.: It’s about time: An empirical study of date and time bugs in open-source python software. In: 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR). pp. 39–51 (2025).https://doi.org/10. 1109/MSR66628.2025.00020

work page arXiv 2025
[34]

In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security

Trinh, M.T., Chu, D.H., Jaffar, J.: S3: A symbolic string solver for vul- nerability detection in web applications. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. pp. 1232– 1243 (2014)

2014
[35]

Computers & Operations Research142, 105731 (2022)

Xiong, H., Shi, S., Ren, D., Hu, J.: A survey of job shop scheduling problem: The types and models. Computers & Operations Research142, 105731 (2022)

2022
[36]

In: Proceedings of the eighteenth international conference on artificial in- telligence and law

Yoshioka, M., Aoki, Y., Suzuki, Y.: BERT-based ensemble methods with data augmentation for legal textual entailment in COLIEE statute law task. In: Proceedings of the eighteenth international conference on artificial in- telligence and law. pp. 278–284 (2021)

2021
[37]

Zamudio Amaya, J.A., Smytzek, M., Zeller, A.: Fandango: Evolv- ing language-based testing. Proc. ACM Softw. Eng.2(ISSTA) (Jun 2025).https://doi.org/10.1145/3728915,https://doi.org/ 10.1145/3728915

work page doi:10.1145/3728915 2025
[38]

Zheng, Y., Ganesh, V., Subramanian, S., Tripp, O., Berzish, M., Dolby, J., Zhang, X.: Z3str2: an efficient solver for strings, regular expressions, and lengthconstraints.FormalMethodsinSystemDesign50(2),249–288(2017)

2017
[39]

->" implication 5equality: disjunction | disjunction

Zheng, Y., Zhang, X., Ganesh, V.: Z3-str: A z3-based string solver for web application analysis. In: Proceedings of the 2013 9th Joint Meeting on Foun- dations of Software Engineering. pp. 114–124 (2013) DateSAT: A Framework for Solving Date and Period Constraints 25 Appendix Organization:Appendix A presents the full grammar of theDateSATcon- straint lang...

2013

[1] [1]

com/news/claude-sonnet-4-5(2025), retrieved Jan 26, 2026

Anthropic: Introducing Claude Sonnet 4.5.https://www.anthropic. com/news/claude-sonnet-4-5(2025), retrieved Jan 26, 2026

2025

[2] [2]

Azure, M.: Summary of Windows Azure service disruption on Feb 29th, 2012.https://azure.microsoft.com/en-us/blog/summary- of-windows-azure-service-disruption-on-feb-29th-2012/ (2012), retrieved Apr 26, 2026

2012

[3] [3]

Baptiste, P., Le Pape, C., Nuijten, W.: Constraint-based scheduling: ap- plying constraint programming to scheduling problems, vol. 39. Springer Science & Business Media (2001)

2001

[4] [4]

In: Proceedings of the 8th international workshop on satisfiability modulo theories (Edinburgh, UK)

Barrett, C., Stump, A., Tinelli, C., et al.: The SMT-LIB standard: Ver- sion 2.0. In: Proceedings of the 8th international workshop on satisfiability modulo theories (Edinburgh, UK). vol. 13, p. 14 (2010)

2010

[5] [5]

In: Handbook of model checking, pp

Barrett, C., Tinelli, C.: Satisfiability modulo theories. In: Handbook of model checking, pp. 305–343. Springer (2018)

2018

[6] [6]

In: 2017 Formal Methods in Computer Aided Design (FM- CAD)

Berzish, M., Ganesh, V., Zheng, Y.: Z3str3: A string solver with theory- aware heuristics. In: 2017 Formal Methods in Computer Aided Design (FM- CAD). pp. 55–59. IEEE (2017)

2017

[7] [7]

Blair-Stanek, A., Holzenberger, N., Van Durme, B.: Can GPT-3 perform statutory reasoning? arXiv preprint arXiv:2302.06100 (2023)

work page arXiv 2023

[8] [8]

Fordham Intell

Cheian, D.: I see dead patents: How bugs in the patent system keep expired patents alive. Fordham Intell. Prop. Media & Ent. LJ33, 1 (2022)

2022

[9] [9]

In: 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024)

Chistikov, D.: An introduction to the theory of linear integer arithmetic. In: 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024). vol. 323, p. 1. Schloss Dagstuhl—Leibniz-Zentrum für Informatik (2024)

2024

[10] [10]

Creative Commons: CC-BY 4.0 license (2013),https:// creativecommons.org/licenses/by/4.0/

2013

[11] [11]

In: Proceedings of theTheoryandPracticeofSoftware,14thInternationalConferenceonTools and Algorithms for the Construction and Analysis of Systems

De Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Proceedings of theTheoryandPracticeofSoftware,14thInternationalConferenceonTools and Algorithms for the Construction and Analysis of Systems. p. 337–340. TACAS’08/ETAPS’08, Springer-Verlag, Berlin, Heidelberg (2008)

2008

[12] [12]

DeepMind, G.: Gemini 3: Our most intelligent AI model that brings any idea to life.https://deepmind.google/models/gemini/(2025), re- trieved Jan 26, 2026

2025

[13] [13]

In: International Sympo- sium on Formal Techniques in Real-Time and Fault-Tolerant Systems

Dutertre, B., Sorea, M.: Modeling and verification of a fault-tolerant real- time startup protocol using calendar automata. In: International Sympo- sium on Formal Techniques in Real-Time and Fault-Tolerant Systems. pp. 199–214. Springer (2004)

2004

[14] [14]

arXiv preprint arXiv:2406.09170 (2024) DateSAT: A Framework for Solving Date and Period Constraints 23

Fatemi, B., Kazemi, M., Tsitsulin, A., Malkan, K., Yim, J., Palowitch, J., Seo, S., Halcrow, J., Perozzi, B.: Test of time: A benchmark for evaluating LLMs on temporal reasoning. arXiv preprint arXiv:2406.09170 (2024) DateSAT: A Framework for Solving Date and Period Constraints 23

work page arXiv 2024

[15] [15]

PRIMETIME : Limits of LLMs in Temporal Primitives

Gaere, E., Wangenheim, F.: DATETIME: A new benchmark to measure LLM translation and reasoning capabilities. arXiv preprint arXiv:2504.16155 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[16] [16]

Goebel, R., Kano, Y., Kim, M.Y., Rabelo, J., Satoh, K., Yoshioka, M.: Summary of the competition on legal information, extraction/entailment (COLIEE)2023.In:ProceedingsoftheNineteenthInternationalConference on Artificial Intelligence and Law. pp. 472–480 (2023)

2023

[17] [17]

In: European Symposium on Programming

Goutagny, P., Fromherz, A., Monat, R.: CUTECat: Concolic execution for computational law. In: European Symposium on Programming. pp. 31–61. Springer (2025)

2025

[18] [18]

Advances in Neural Information Processing Systems36(2024)

Guha,N.,Nyarko,J.,Ho,D.,Ré,C.,Chilton,A.,Chohlas-Wood,A.,Peters, A., Waldon, B., Rockmore, D., Zambrano, D., et al.: Legalbench: A collab- oratively built benchmark for measuring legal reasoning in large language models. Advances in Neural Information Processing Systems36(2024)

2024

[19] [19]

In: Proceedings of the eighteenth international symposium on Software testing and analysis

Kiezun, A., Ganesh, V., Guo, P.J., Hooimeijer, P., Ernst, M.D.: Hampi: a solver for string constraints. In: Proceedings of the eighteenth international symposium on Software testing and analysis. pp. 105–116 (2009)

2009

[20] [20]

Proceedings of the ACM on Programming Languages 7(OOPSLA1), 286–315 (2023)

Lattuada, A., Hance, T., Cho, C., Brun, M., Subasinghe, I., Zhou, Y., How- ell, J., Parno, B., Hawblitzel, C.: Verus: Verifying Rust programs using linear ghost types. Proceedings of the ACM on Programming Languages 7(OOPSLA1), 286–315 (2023)

2023

[21] [21]

In: International Conference on Computer Aided Verification

Liang, T., Reynolds, A., Tinelli, C., Barrett, C., Deters, M.: A dpll (t) the- ory solver for a theory of strings and regular expressions. In: International Conference on Computer Aided Verification. pp. 646–662. Springer (2014)

2014

[22] [22]

Journal of Open Source Software 4(43), 1891 (11 2019).https://doi.org/10.21105/joss.01891, http://dx.doi.org/10.21105/joss.01891

MacIver, D., Hatfield-Dodds, Z., Contributors, M.: Hypothesis: A new approach to property-based testing. Journal of Open Source Software 4(43), 1891 (11 2019).https://doi.org/10.21105/joss.01891, http://dx.doi.org/10.21105/joss.01891

work page doi:10.21105/joss.01891 2019

[23] [23]

In: European Symposium on Programming

Monat, R., Fromherz, A., Merigoux, D.: Formalizing date arithmetic and statically detecting ambiguities for the law. In: European Symposium on Programming. pp. 421–450. Springer (2024)

2024

[24] [24]

Mora, F., Berzish, M., Kulczynski, M., Nowotka, D., Ganesh, V.: Z3str4: A multi-armedstringsolver.In:InternationalSymposiumonFormalMethods. pp. 389–406. Springer (2021)

2021

[25] [25]

Philosophical Transactions of the Royal Society A382(2270), 20230159 (2024)

Nay, J.J., Karamardian, D., Lawsky, S.B., Tao, W., Bhat, M., Jain, R., Lee, A.T., Choi, J.H., Kasai, J.: Large language models as tax attorneys: a case study in legal capabilities emergence. Philosophical Transactions of the Royal Society A382(2270), 20230159 (2024)

2024

[26] [26]

Code of Laws of the United States of America26(121) (2018)

Office, G.P.: Internal Revenue Code. Code of Laws of the United States of America26(121) (2018)

2018

[27] [27]

OpenAI: Introducing GPT-5.2.https://openai.com/index/ introducing-gpt-5-2/(2025), retrieved Jan 26, 2026

2025

[28] [28]

In: Proceedings of the 2024 ACM SIGPLAN International Sym- posium on New Ideas, New Paradigms, and Reflections on Programming and Software

Padhye, R.: Software engineering methods for AI-driven deductive legal reasoning. In: Proceedings of the 2024 ACM SIGPLAN International Sym- posium on New Ideas, New Paradigms, and Reflections on Programming and Software. pp. 85–95 (2024) 24 L. Cui et al

2024

[29] [29]

Pierce, R.: Birthday puzzle.https://www.mathsisfun.com/ puzzles/birthday.html(2012), retrieved Jan 10, 2026

2012

[30] [30]

Presburger, M.: Über die vollständigkeit eines gewissen systems der arith- metik ganzer zahlen, in welchem die addition als einzige operation hervor- tritt.In:ComptesRendusduICongrèsdesMathématiciensdesPaysSlaves, Warszawa. pp. 92–101. Skład Główny (1929)

1929

[31] [31]

https://www.reuters.com/world/asia-pacific/leap-year- glitch-shuts-some-new-zealand-fuel-pumps-2024-02-29/ (Feb 2024)

Reuters: ’leap year glitch’ shuts some new zealand fuel pumps. https://www.reuters.com/world/asia-pacific/leap-year- glitch-shuts-some-new-zealand-fuel-pumps-2024-02-29/ (Feb 2024)

2024

[32] [32]

In: 2010 IEEE Symposium on Security and Privacy

Saxena, P., Akhawe, D., Hanna, S., Mao, F., McCamant, S., Song, D.: A symbolic execution framework for javascript. In: 2010 IEEE Symposium on Security and Privacy. pp. 513–528. IEEE (2010)

2010

[33] [33]

In: 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR)

Tiwari, S., Chen, S., Joukov, A., Vandervelde, P., Li, A., Padhye, R.: It’s about time: An empirical study of date and time bugs in open-source python software. In: 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR). pp. 39–51 (2025).https://doi.org/10. 1109/MSR66628.2025.00020

work page arXiv 2025

[34] [34]

In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security

Trinh, M.T., Chu, D.H., Jaffar, J.: S3: A symbolic string solver for vul- nerability detection in web applications. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. pp. 1232– 1243 (2014)

2014

[35] [35]

Computers & Operations Research142, 105731 (2022)

Xiong, H., Shi, S., Ren, D., Hu, J.: A survey of job shop scheduling problem: The types and models. Computers & Operations Research142, 105731 (2022)

2022

[36] [36]

In: Proceedings of the eighteenth international conference on artificial in- telligence and law

Yoshioka, M., Aoki, Y., Suzuki, Y.: BERT-based ensemble methods with data augmentation for legal textual entailment in COLIEE statute law task. In: Proceedings of the eighteenth international conference on artificial in- telligence and law. pp. 278–284 (2021)

2021

[37] [37]

Zamudio Amaya, J.A., Smytzek, M., Zeller, A.: Fandango: Evolv- ing language-based testing. Proc. ACM Softw. Eng.2(ISSTA) (Jun 2025).https://doi.org/10.1145/3728915,https://doi.org/ 10.1145/3728915

work page doi:10.1145/3728915 2025

[38] [38]

Zheng, Y., Ganesh, V., Subramanian, S., Tripp, O., Berzish, M., Dolby, J., Zhang, X.: Z3str2: an efficient solver for strings, regular expressions, and lengthconstraints.FormalMethodsinSystemDesign50(2),249–288(2017)

2017

[39] [39]

->" implication 5equality: disjunction | disjunction

Zheng, Y., Zhang, X., Ganesh, V.: Z3-str: A z3-based string solver for web application analysis. In: Proceedings of the 2013 9th Joint Meeting on Foun- dations of Software Engineering. pp. 114–124 (2013) DateSAT: A Framework for Solving Date and Period Constraints 25 Appendix Organization:Appendix A presents the full grammar of theDateSATcon- straint lang...

2013