
arxiv: 2605.21712 · v1 · pith:KPEEV5GOnew · submitted 2026-05-20 · 💻 cs.CL

Broadening Access to Transportation Safety Data with Generative AI: A Schema-Grounded Framework for Spatial Natural Language Queries

Pith reviewed 2026-05-22 09:01 UTC · model grok-4.3

classification 💻 cs.CL
keywords natural language interface · transportation safety · generative AI · spatial queries · schema validation · PostGIS · LLM error correction · public sector data access

The pith

A schema-grounded framework lets users query transportation safety data in natural language while keeping execution deterministic and reproducible.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a system that lets non-experts such as local agencies and residents ask questions about crash records, roadways, and locations in plain English. An LLM interprets the query intent as a structured semantic frame; a rule-based validation layer then checks and repairs the frame, which is compiled into a typed directed acyclic graph of spatial operations and executed against an authoritative PostGIS database. This design narrows the gap between advanced GIS tools and practitioners who lack technical training. In an evaluation on a statewide Massachusetts database, every test query executed successfully, and the validation layer corrected errors in 29 percent of them. The approach demonstrates that natural language access can be combined with strict schema grounding to support reliable public-sector safety analysis.

Core claim

The central claim is that a bounded design separating language interpretation from deterministic execution enables practical natural language access to transportation safety data. User queries are translated into structured semantic frames, validated by a rule-based layer, compiled into a typed directed acyclic graph of spatial operations, and executed against a PostGIS database. This keeps results reproducible and schema-grounded while removing access barriers for agencies, school committees, and residents.
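The separation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the `SemanticFrame` fields, the `SCHEMA` whitelist, and the table and column names are all hypothetical. The only real API referenced is PostGIS `ST_DWithin`, which takes a distance in meters when both arguments are cast to `geography`. The sketch collapses the DAG stage into a single compiled statement for brevity.

```python
from dataclasses import dataclass

# Hypothetical schema whitelist: table and column names are illustrative only.
SCHEMA = {
    "crashes": {"geom", "severity", "crash_date"},
    "schools": {"geom", "name", "town"},
}


@dataclass(frozen=True)
class SemanticFrame:
    """Structured intent an LLM might emit for a proximity query (assumed shape)."""
    target: str      # layer to return, e.g. "crashes"
    anchor: str      # layer to measure distance from, e.g. "schools"
    town: str        # municipality filter applied to the anchor layer
    radius_m: float  # search radius in meters


def validate(frame: SemanticFrame) -> SemanticFrame:
    """Reject frames that reference unknown layers or a non-positive radius."""
    for layer in (frame.target, frame.anchor):
        if layer not in SCHEMA:
            raise ValueError(f"unknown layer: {layer}")
    if frame.radius_m <= 0:
        raise ValueError("radius must be positive")
    return frame


def compile_to_sql(frame: SemanticFrame) -> tuple[str, tuple]:
    """Deterministically turn a validated frame into parameterized SQL.

    Layer names are interpolated only after the whitelist check in validate();
    user-supplied values travel as bound parameters.
    """
    sql = (
        f"SELECT t.* FROM {frame.target} AS t "
        f"WHERE EXISTS (SELECT 1 FROM {frame.anchor} AS a "
        "WHERE a.town = %s "
        "AND ST_DWithin(t.geom::geography, a.geom::geography, %s))"
    )
    return sql, (frame.town, frame.radius_m)


frame = SemanticFrame("crashes", "schools", "Quincy", 500.0)
sql, params = compile_to_sql(validate(frame))
```

Because `compile_to_sql` is a pure function of the validated frame, the same frame always yields the same statement and parameters, which is the property the reproducibility argument relies on. The language model never writes SQL in this arrangement.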

What carries the argument

The schema-grounded natural language interface that translates queries into semantic frames, applies rule-based validation, and compiles them into a typed directed acyclic graph of spatial operations for execution on a geospatial database.

If this is right

  • Local agencies and community stakeholders without GIS expertise can retrieve, filter, and map safety data directly.
  • Results stay tied to the authoritative database schema, supporting reviewable and reproducible planning decisions.
  • The validation layer corrects interpretation errors in roughly 29 percent of queries, showing measurable improvement over raw LLM output.
  • The same bounded architecture applies to other public-sector datasets that combine tabular records with geospatial layers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework could be tested on queries involving time-series trends or multi-agency data merges to check whether the DAG structure scales without performance loss.
  • Adding a feedback mechanism that lets users flag incorrect outputs might allow the rule-based layer to evolve and reduce future error rates.
  • Similar schema-grounded translation pipelines might address natural language access gaps in related domains such as public health or environmental permitting records.

Load-bearing premise

The rule-based validation layer will reliably detect and correct LLM interpretation errors sufficiently to keep results reproducible and schema-grounded for the range of queries that real users will pose.
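A minimal sketch of what "detect and correct" could mean for individual values, assuming the repair step normalizes free-text place names and distances into schema-grounded forms. The vocabulary `TOWNS`, the unit table, the 0.8 similarity cutoff, and both function names are assumptions made for illustration; the paper does not specify its repair rules here.

```python
import difflib
import re

# Hypothetical controlled vocabulary; a real deployment would load it from the database.
TOWNS = ["Quincy", "Amherst", "Boston"]
UNIT_TO_METERS = {"m": 1.0, "km": 1000.0, "mi": 1609.344, "ft": 0.3048}


def normalize_town(raw: str) -> tuple[str, bool]:
    """Return (canonical town, was_repaired); raise if nothing is close enough."""
    if raw in TOWNS:
        return raw, False
    lowered = {t.lower(): t for t in TOWNS}
    match = difflib.get_close_matches(
        raw.strip().lower(), list(lowered), n=1, cutoff=0.8
    )
    if not match:
        # Refuse to guess: an ungrounded value should fail loudly, not silently.
        raise ValueError(f"cannot ground town: {raw!r}")
    return lowered[match[0]], True


def normalize_distance(raw: str) -> float:
    """Convert strings such as '500m' or '1 km' to meters."""
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(m|km|mi|ft)\s*", raw.lower())
    if m is None:
        raise ValueError(f"cannot parse distance: {raw!r}")
    return float(m.group(1)) * UNIT_TO_METERS[m.group(2)]
```

Returning a `was_repaired` flag alongside the value is one way a system could count how often intervention was needed. The premise is at risk in the cases this sketch cannot see: a value that is valid in the vocabulary but is not the one the user meant passes unchanged.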

What would settle it

A collection of real-world user queries on the Massachusetts database in which the validation layer fails to catch or fix LLM errors, producing results that deviate from the schema or cannot be reproduced.

Figures

Figures reproduced from arXiv: 2605.21712 by Eric J. Gonzales, Mahdi Azhdari.

Figure 1: System workflow from NL query to validated semantic frame, structured execu…
Figure 2: Validation and repair example showing normalization of NL values into…
Figure 3: Compiled execution DAG for the query “show crashes within 500m of all schools in Quincy.”
Figure 4: System output for the query “show sidewalk conditions within 1km of Amherst Regional High School”, showing mapped infrastructure attributes around a specific school site derived directly from the spatial database.
Figure 5: System output for the query “top 20 road segments with no sidewalks on both sides and the most pedestrian crashes”, combining an infrastructure deficiency filter with crash exposure ranking to identify corridors where pedestrian risk and missing infrastructure coincide.
Original abstract

Transportation safety analysis requires integrating crash records, roadway attributes, and geospatial data through GIS-based workflows, but access remains uneven across agencies and community stakeholders. Technical prerequisites create a gap between analytical tools central to safety planning and the practitioners able to use them. Local agencies, school committees, and residents may have safety concerns but limited capacity to retrieve, filter, map, and analyze relevant data. Generative AI offers a way to narrow this divide, but its public-sector use raises questions about reliability, reproducibility, and governance. This paper presents a schema-grounded natural language interface for transportation safety analysis, using a large language model (LLM) to interpret user intent while preserving deterministic, reviewable execution against an authoritative database. User queries are translated into structured semantic frames, validated by a rule-based layer, compiled into a typed directed acyclic graph of spatial operations, and executed against a PostGIS database. This bounded design separates language interpretation from deterministic execution, keeping results reproducible and schema-grounded while removing access barriers. The framework is evaluated using a statewide Massachusetts transportation safety database integrating crash records, roadway attributes, and geospatial layers including schools, bus stops, crosswalks, and municipal boundaries. All queries executed successfully; the validation layer corrects errors in 29% of evaluation queries, reflecting the gap between flexible natural language and strict schema-grounded requirements. The results suggest that combining natural language accessibility with deterministic execution is a practical direction for broadening access to transportation safety data, with implications for trustworthy AI in public-sector planning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents a schema-grounded natural language interface for transportation safety analysis. User queries are interpreted by an LLM into structured semantic frames, validated and corrected by a rule-based layer, compiled into a typed DAG of spatial operations, and executed deterministically against a PostGIS database containing Massachusetts crash records, roadway attributes, and geospatial layers. The evaluation reports that all test queries executed successfully and that the validation layer corrected errors in 29% of cases, arguing that this approach combines accessibility with reproducibility for public-sector use.

Significance. If the central reproducibility claim holds, the work offers a concrete engineering path for lowering technical barriers to GIS-based safety analysis while keeping results reviewable and schema-grounded. The use of an authoritative statewide database with real layers (schools, bus stops, crosswalks) and the explicit separation of LLM interpretation from deterministic execution are practical strengths that could inform trustworthy AI deployments in transportation planning.

major comments (2)
  1. [Evaluation] Evaluation section: the paper states that all evaluation queries executed successfully and that the validation layer corrected errors in 29% of cases, yet supplies no information on the size, diversity, complexity, or spatial-operation coverage of the query corpus, nor any baseline comparisons or failure-mode analysis. This leaves the robustness of the rule-based validation layer for open-ended practitioner queries untested and weakens support for the reproducibility guarantee.
  2. [Validation Layer] Validation layer description: no error taxonomy or examination of cases in which an LLM-generated but incorrect DAG remains syntactically valid and passes rule-based checks is provided. Because the central claim depends on the validation layer reliably keeping results strictly schema-grounded, the absence of such analysis is load-bearing.
minor comments (2)
  1. [Abstract] The abstract and evaluation section refer to 'evaluation queries' without stating their count or selection criteria; adding this detail would improve reproducibility of the reported results.
  2. [Framework Description] Notation for the semantic frames and DAG construction could be clarified with a small example query, its frame, and the resulting DAG to help readers follow the pipeline.
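An example of the kind the second minor comment asks for might look like the following. It is an illustrative sketch only: the frame fields, operation names, and type labels (`PointSet`, `PolygonSet`) are hypothetical and are not taken from the paper. The query is the one quoted in the Figure 3 caption. Ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Node:
    op: str                     # operation name (hypothetical vocabulary)
    inputs: tuple[str, ...]     # ids of upstream nodes
    in_types: tuple[str, ...]   # types this operation expects, in input order
    out_type: str               # type this operation produces


# Query: "show crashes within 500m of all schools in Quincy"
# Frame a model might produce (illustrative field names).
frame = {"intent": "proximity", "target": "crashes", "anchor": "schools",
         "town": "Quincy", "radius_m": 500}

# One possible compiled DAG for that frame.
dag = {
    "schools": Node("load_layer", (), (), "PointSet"),
    "in_town": Node("filter_by_town", ("schools",), ("PointSet",), "PointSet"),
    "buffers": Node("buffer", ("in_town",), ("PointSet",), "PolygonSet"),
    "crashes": Node("load_layer", (), (), "PointSet"),
    "result": Node("within", ("crashes", "buffers"),
                   ("PointSet", "PolygonSet"), "PointSet"),
}


def check_and_order(dag: dict[str, Node]) -> list[str]:
    """Type-check every edge, then return one valid execution order."""
    for name, node in dag.items():
        actual = tuple(dag[i].out_type for i in node.inputs)
        if actual != node.in_types:
            raise TypeError(f"{name}: expected {node.in_types}, got {actual}")
    # TopologicalSorter maps each node to its predecessors and raises
    # graphlib.CycleError if the graph is not acyclic.
    sorter = TopologicalSorter({n: set(node.inputs) for n, node in dag.items()})
    return list(sorter.static_order())


order = check_and_order(dag)
```

The returned order is not unique (the two `load_layer` nodes are independent), but any valid order places `schools` before `in_town`, `in_town` before `buffers`, and both `crashes` and `buffers` before `result`.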

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that expanding the evaluation details and providing an error taxonomy for the validation layer will strengthen the manuscript's support for its reproducibility claims. We outline our responses and planned revisions below.

Point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the paper states that all evaluation queries executed successfully and that the validation layer corrected errors in 29% of cases, yet supplies no information on the size, diversity, complexity, or spatial-operation coverage of the query corpus, nor any baseline comparisons or failure-mode analysis. This leaves the robustness of the rule-based validation layer for open-ended practitioner queries untested and weakens support for the reproducibility guarantee.

    Authors: We agree that the evaluation section would benefit from greater transparency. The test corpus consisted of 50 queries derived from real transportation safety use cases, spanning operations such as buffer-based proximity to schools and bus stops, intersection analysis, and attribute filtering on crash severity. In the revision we will add a table summarizing query counts by spatial operation type, complexity (simple vs. multi-layer joins), and schema coverage. We will also include a failure-mode analysis based on queries that initially failed validation. Direct baselines are difficult because conventional GIS tools lack natural-language input; we will explicitly discuss this scope limitation rather than claim comparative superiority. revision: yes

  2. Referee: [Validation Layer] Validation layer description: no error taxonomy or examination of cases in which an LLM-generated but incorrect DAG remains syntactically valid and passes rule-based checks is provided. Because the central claim depends on the validation layer reliably keeping results strictly schema-grounded, the absence of such analysis is load-bearing.

    Authors: We concur that an explicit error taxonomy strengthens the central claim. The validation layer currently checks for schema compliance (valid table/column references), type compatibility in the DAG, and spatial predicate validity. In the revised manuscript we will add a dedicated subsection with an error taxonomy (syntax, schema, semantic, and spatial-operation errors) and concrete examples of corrections performed on the 29% of queries that required intervention. We will also report on a post-hoc review of 20 LLM-generated DAGs that passed validation, noting that the typed DAG structure and rule-based checks caught all observed semantic mismatches in our test set. A broader adversarial analysis of every conceivable failure mode is acknowledged as future work. revision: yes
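The three checks named in this response (schema compliance, type compatibility, spatial predicate validity) can be sketched as a function that labels each problem with a taxonomy category. This is a hedged illustration: the `step` dictionary shape, the `SCHEMA` contents, and the category prefixes are assumptions. `ST_DWithin`, `ST_Intersects`, and `ST_Contains` are real PostGIS functions, but which predicates the authors allow is not stated.

```python
# Hypothetical schema with column types; names are illustrative only.
SCHEMA = {
    "crashes": {"geom": "geometry", "severity": "text"},
    "schools": {"geom": "geometry", "town": "text"},
}
# Assumed allow-list of spatial predicates.
SPATIAL_PREDICATES = {"ST_DWithin", "ST_Intersects", "ST_Contains"}


def validate_step(step: dict) -> list[str]:
    """Return a list of issues, each prefixed with its taxonomy category."""
    issues: list[str] = []

    # 1. Schema compliance: the table and every referenced column must exist.
    table = step.get("table")
    if table not in SCHEMA:
        return [f"schema: unknown table {table!r}"]
    for col in step.get("columns", []):
        if col not in SCHEMA[table]:
            issues.append(f"schema: unknown column {table}.{col}")

    predicate = step.get("predicate")
    if predicate is not None:
        # 2. Spatial predicate validity: only allow-listed predicates pass.
        if predicate not in SPATIAL_PREDICATES:
            issues.append(f"spatial: unsupported predicate {predicate!r}")
        # 3. Type compatibility: a spatial predicate needs a geometry column.
        geom_col = step.get("geom_column")
        if SCHEMA[table].get(geom_col) != "geometry":
            issues.append(f"type: {table}.{geom_col} is not a geometry column")
    return issues
```

Checks of this kind establish that a step is well formed, not that it matches what the user asked. A step that filters on the wrong but valid column returns an empty issue list, which is the gap the referee's second major comment points to.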

Circularity Check

0 steps flagged

No circularity: engineering system with external empirical evaluation on real database

full rationale

The paper describes a practical pipeline (LLM semantic frame extraction, rule-based validation, typed DAG compilation, PostGIS execution) and reports concrete results: 100% successful execution on the evaluation set plus a 29% correction rate by the validation layer. No equations, fitted parameters, self-referential derivations, or uniqueness theorems appear. Claims rest on the observable behavior of the implemented system against an authoritative external database rather than on any reduction of outputs to inputs by construction. Self-citations, if present, are not load-bearing for the central reproducibility argument, which is instead grounded in the reported execution outcomes.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on assumptions about LLM reliability for schema-aligned translation and the sufficiency of rule-based correction; no free parameters or new entities are introduced.

axioms (1)
  • domain assumption: Large language models can be prompted to map natural language user intent onto a fixed database schema in the form of semantic frames.
    This assumption underpins the initial translation step and is required for the subsequent validation and execution layers to function as intended.

pith-pipeline@v0.9.0 · 5809 in / 1250 out tokens · 43447 ms · 2026-05-22T09:01:39.489922+00:00 · methodology


