arxiv: 2509.03335 · v3 · submitted 2025-09-03 · 💻 cs.LG

EvolveSignal: A Large Language Model Powered Coding Agent for Discovering Traffic Signal Control Strategies

Pith reviewed 2026-05-18 19:09 UTC · model grok-4.3

classification 💻 cs.LG
keywords traffic signal control · fixed-time strategies · LLM coding agent · program synthesis · evolutionary search · traffic simulation · Webster method

The pith

An LLM-powered coding agent discovers traffic signal strategies that reduce average delay by 20.1% over Webster's method.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that large language models can serve as coding agents to automatically explore and refine code representations of fixed-time traffic signal logic. Rather than inventing new formulas, the method generates variations on existing heuristic structures and uses simulator feedback plus evolutionary selection to keep the best ones. A reader would care because fixed-time signals are common and inexpensive yet often suboptimal when traffic patterns change, and manual redesign is slow and error-prone. If the approach works, it could let engineers obtain stronger yet still readable control rules without repeated hand-tuning. The reported tests on one intersection show both the performance lift and the specific code changes that produced it.

Core claim

EvolveSignal casts traffic signal strategy discovery as program synthesis. Each candidate is written as a Python function with fixed input and output signatures. An LLM proposes code edits, the resulting programs are scored in a traffic simulator, and evolutionary search retains the stronger variants. In the intersection experiments the evolved strategies lower average vehicle delay by 20.1 percent and average stops by 47.1 percent relative to Webster's method. Follow-on analyses show that the process surfaces concrete, human-readable adjustments such as tightened cycle-length limits, explicit right-turn demand terms, and rebalanced green splits.
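
To make "a Python function with fixed input and output signatures" concrete, the sketch below writes the classical Webster baseline in that shape. The function name `webster_plan`, the dictionary layout, and the default lost time and cycle bounds are illustrative assumptions, not the paper's interface (its Figure 8 shows the actual initial program). The formula itself is Webster's standard optimal cycle length, C = (1.5L + 5) / (1 − Y), with effective green split in proportion to each phase's critical flow ratio.

```python
from typing import Dict


def webster_plan(
    flow_ratios: Dict[str, float],      # y_i = demand / saturation flow, per phase
    lost_time_per_phase: float = 4.0,   # seconds; assumed value
    min_cycle: float = 30.0,            # seconds; assumed bound
    max_cycle: float = 180.0,           # seconds; assumed bound
) -> Dict[str, float]:
    """Return effective green time in seconds per phase (Webster's method)."""
    y_sum = sum(flow_ratios.values())                      # Y
    if y_sum <= 0.0:
        raise ValueError("at least one phase needs positive demand")
    total_lost = lost_time_per_phase * len(flow_ratios)    # L
    if y_sum >= 1.0:
        # The formula is undefined at or above saturation; use the longest cycle.
        cycle = max_cycle
    else:
        cycle = (1.5 * total_lost + 5.0) / (1.0 - y_sum)
        cycle = min(max(cycle, min_cycle), max_cycle)
    effective_green = cycle - total_lost
    # Split the usable green in proportion to each phase's flow ratio.
    return {phase: effective_green * y / y_sum for phase, y in flow_ratios.items()}


# Hypothetical four-phase demand, chosen only for illustration.
plan = webster_plan(
    {"NS_through": 0.40, "NS_left": 0.10, "EW_through": 0.25, "EW_left": 0.10}
)
```

Because the signature stays fixed, an evolved variant can change anything inside the body (the bounds, the demand terms, the split rule) and still be scored by the same evaluator.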

What carries the argument

LLM-guided program synthesis that treats control logic as mutable Python functions and drives improvement through repeated external simulator scoring and evolutionary selection.
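
The propose, score, select shape of that loop can be sketched in a few lines. Everything here is a stand-in: `propose_edit` replaces the LLM call with a random tweak of one constant, `simulate_delay` replaces the traffic simulator with a toy analytic cost, and candidates are parameter dictionaries rather than full Python source. None of these names or numbers come from the paper, whose system edits real programs and scores them in simulation.

```python
import random
from typing import Callable, Dict, List, Tuple

Candidate = Dict[str, float]  # stand-in for a program: just its tunable constants


def simulate_delay(c: Candidate) -> float:
    """Toy stand-in for one simulator run; lower is better (hypothetical cost)."""
    return (c["max_cycle"] - 120.0) ** 2 / 100.0 + (c["green_scale"] - 0.9) ** 2 * 50.0


def propose_edit(parent: Candidate, rng: random.Random) -> Candidate:
    """Stand-in for an LLM-proposed code edit: perturb one constant by up to 10%."""
    child = dict(parent)
    key = rng.choice(sorted(child))
    child[key] *= rng.uniform(0.9, 1.1)
    return child


def evolve(
    seed: Candidate,
    score: Callable[[Candidate], float],
    iterations: int = 200,
    population_size: int = 8,
    rng_seed: int = 0,
) -> Tuple[float, Candidate]:
    """Keep the best `population_size` candidates seen so far; return the best."""
    rng = random.Random(rng_seed)
    population: List[Tuple[float, Candidate]] = [(score(seed), seed)]
    for _ in range(iterations):
        _, parent = rng.choice(population)           # pick a surviving parent
        child = propose_edit(parent, rng)            # "LLM" proposes a variant
        population.append((score(child), child))     # external evaluation
        population.sort(key=lambda pair: pair[0])    # lower score first
        del population[population_size:]             # selection: drop the weakest
    return population[0]


seed_candidate = {"max_cycle": 180.0, "green_scale": 1.0}
best_score, best = evolve(seed_candidate, simulate_delay)
```

Truncation selection is used here only because it is the shortest thing to write; it is an assumption, not a description of the selection scheme the authors use.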

If this is right

  • The evolved strategies remain interpretable, letting traffic engineers inspect and adopt the discovered modifications.
  • Automated code evolution could reduce the need for repeated manual re-timing when demand changes.
  • The same framework can surface which specific changes, such as cycle bounds or green rescaling, drive the gains.
  • Performance advantages appear under the heterogeneous and congested conditions tested in simulation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same code-evolution loop could be applied to discover heuristics for ramp metering or network-level coordination.
  • Direct comparison against live sensor data would test how well simulator rankings predict field outcomes.
  • Adding real-time traffic measurements into the evaluation loop might allow the agent to refine strategies on the fly.

Load-bearing premise

The traffic simulator used to score candidate strategies produces results that generalize to real intersections and changing demand patterns.

What would settle it

Real-world field measurements at the same intersection showing whether the evolved strategies still reduce delay and stops relative to Webster's method under live traffic.

Figures

Figures reproduced from arXiv: 2509.03335 by Hao Wang, Jian Xu, Leizhen Wang, Nan Zheng, Peibo Duan, Yue Wang, Zhenliang Ma.

Figure 1: The problem definition in Python
    … performance while providing natural language reasoning and explanations [3], [4]. However, practical deployment remains challenging due to high costs, privacy concerns, and the complex infrastructure required to scale RL and LLM models. This study proposes a novel approach to better harness the potential of AI for traffic signal control, ensuring both scalability and real-…

Figure 2: The EvolveSignal framework
    … where Score(·) is a task-specific evaluation metric that quantifies traffic efficiency, such as average throughput, delay reduction, or stop minimization. This formulation enables the automatic discovery of interpretable, effective, and adaptable fixed-time signal control strategies, suitable for deployment in infrastructure-constrained environments. B. Overview of the Propose…

Figure 3: Simulated isolated intersection in SUMO
    TABLE I: Traffic demand scenarios and Critical Ratio Sum (CRS). All values are in veh/h, formatted as through/left/right.

    Scenario     S1            S2            S3
    North (N)    1550/210/30   2600/230/30   2500/80/60
    South (S)    1450/180/50   2450/250/50   2400/90/70
    East (E)     1450/200/40   600/80/40     400/330/150
    West (W)     1400/180/40   650/90/40     450/340/120
    CRS (a)      0.868         0.865         0.872

    (a) CRS, based on Webster, refle…

Figure 5: Example system message prompt template

Figure 6: Example user message prompt template

Figure 7: Representative LLM response

Figure 8: Implementation of the initial program (initial_program.py)

Figure 9: Implementation of the discovered program (

Figure 10: Example LLM response introducing the CLB modification

Figure 11: Example LLM response introducing the RTI and PAR modifications

Figure 12: Example LLM response introducing the SLF modification

Figure 13: Example LLM response introducing the MGF modification

Figure 14: Examples of discarded modifications observed during evolution

Figure 15: Evolution path of the best discovered program in a 600-iteration run (alternative experiment)

Original abstract

In traffic engineering, fixed-time traffic signal control remains widely used for its low cost, stability, and interpretability. However, its design relies on hand-crafted formulas (e.g., Webster) and manual re-timing by engineers to adapt to demand changes, which is labor-intensive and often yields suboptimal results under heterogeneous or congested conditions. This paper introduces EvolveSignal, an LLM-powered coding agent for automatically discovering interpretable heuristic strategies for fixed-time traffic signal control. Rather than deriving entirely new analytical formulations, the proposed framework focuses on exploring code-level variations of existing control logic and identifying effective combinations of heuristic modifications. We formulate the problem as program synthesis, where candidate strategies are represented as Python functions with fixed input-output structures and iteratively optimized through external evaluations (e.g., a traffic simulator) and evolutionary search. Experiments on a signalized intersection demonstrate that the discovered strategies outperform a classical baseline (Webster's method), reducing average delay by 20.1% and average stops by 47.1%. Beyond performance, ablation and incremental analyses reveal that EvolveSignal can identify meaningful modifications, such as adjusting cycle length bounds, incorporating right-turn demand, and rescaling green allocations, that provide useful insights for traffic engineers. This work highlights the potential of LLM-driven program synthesis for supporting interpretable and automated heuristic design in traffic signal control.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces EvolveSignal, an LLM-powered coding agent that formulates traffic signal control as program synthesis: candidate strategies are represented as Python functions with a fixed I/O structure and iteratively refined via evolutionary search guided by external evaluations in a traffic simulator. On a single signalized intersection, the discovered strategies are reported to outperform Webster's method, reducing average delay by 20.1% and average stops by 47.1%. Ablation analyses are said to identify interpretable modifications such as adjusting cycle-length bounds, incorporating right-turn demand, and rescaling green allocations.

Significance. If the empirical gains prove robust, the work offers a practical route to automated, interpretable heuristic design in traffic engineering, moving beyond hand-crafted formulas while retaining the transparency valued by practitioners. The emphasis on code-level variations of existing logic rather than wholly new analytic forms is a constructive framing that could yield transferable insights for engineers.

major comments (2)
  1. [Experiments] Experimental section: the headline performance claims (20.1% delay reduction, 47.1% stop reduction) are stated without any indication of the number of independent simulator runs, standard deviations, confidence intervals, or statistical tests. This information is load-bearing for the central empirical assertion that the evolved strategies outperform the baseline.
  2. [Method and Experiments] Simulator and evaluation setup: the paper relies on iterative optimization through a single external traffic simulator yet supplies no cross-validation against field data, alternative simulators, stochastic demand profiles, or sensitivity analysis to simulator parameters. Because the performance numbers derive entirely from these external evaluations, the lack of fidelity checks directly affects whether the reported gains reflect robust control logic or simulator-specific artifacts.
minor comments (2)
  1. [Abstract and Section 4] The abstract and method description refer to 'ablation and incremental analyses' without specifying which components were ablated or how increments were measured; a brief enumeration in the main text would improve clarity.
  2. [Method] Notation for the evolutionary operators (mutation, crossover, selection) is introduced informally; a short pseudocode block or explicit parameter table would aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our empirical results and evaluation methodology. We address each major comment below and indicate the corresponding revisions.

  1. Referee: [Experiments] Experimental section: the headline performance claims (20.1% delay reduction, 47.1% stop reduction) are stated without any indication of the number of independent simulator runs, standard deviations, confidence intervals, or statistical tests. This information is load-bearing for the central empirical assertion that the evolved strategies outperform the baseline.

    Authors: We agree that the absence of these statistical details weakens the central claims. In the revised manuscript we will report the number of independent simulator runs (conducted with distinct random seeds), include standard deviations and 95% confidence intervals for all reported metrics, and add the results of paired statistical tests (e.g., t-tests) comparing the evolved strategies against Webster’s method. These additions will appear in the experimental section and in updated result tables. revision: yes

  2. Referee: [Method and Experiments] Simulator and evaluation setup: the paper relies on iterative optimization through a single external traffic simulator yet supplies no cross-validation against field data, alternative simulators, stochastic demand profiles, or sensitivity analysis to simulator parameters. Because the performance numbers derive entirely from these external evaluations, the lack of fidelity checks directly affects whether the reported gains reflect robust control logic or simulator-specific artifacts.

    Authors: We acknowledge that reliance on a single simulator constitutes a limitation for claims of robustness. In the revision we will add a sensitivity analysis with respect to key simulator parameters and will include additional experiments that employ stochastic demand profiles. Cross-validation against field data or alternative simulators, however, lies outside the present scope, which centers on demonstrating the LLM-driven program-synthesis approach; we will explicitly discuss this limitation and the need for future real-world validation in an expanded discussion section. revision: partial
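
The statistics promised in the first response are simple to compute once per-seed results exist. The sketch below uses only the standard library. The delay values are made-up placeholders, not results from the paper, and the 95% interval uses a normal approximation (z = 1.96) rather than a t quantile, which would be somewhat wider for so few runs.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_ci95(xs: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using a normal-approximation 95% interval."""
    m = statistics.mean(xs)
    half_width = 1.96 * statistics.stdev(xs) / math.sqrt(len(xs))
    return m, m - half_width, m + half_width


def paired_t_statistic(baseline: Sequence[float], candidate: Sequence[float]) -> float:
    """t statistic for paired differences (baseline - candidate), with n - 1 degrees of freedom.

    Raises ZeroDivisionError if all differences are identical.
    """
    if len(baseline) != len(candidate):
        raise ValueError("paired samples must have equal length")
    diffs = [b - c for b, c in zip(baseline, candidate)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


# Placeholder per-seed average delays in seconds (NOT taken from the paper).
webster_delay = [52.1, 49.8, 51.3, 50.6, 53.0]
evolved_delay = [41.0, 40.2, 42.5, 39.9, 41.7]

webster_summary = mean_ci95(webster_delay)
evolved_summary = mean_ci95(evolved_delay)
t_stat = paired_t_statistic(webster_delay, evolved_delay)
relative_reduction = 1.0 - statistics.mean(evolved_delay) / statistics.mean(webster_delay)
```

Pairing by seed matters here: both strategies see the same random arrivals in each run, so the test works on per-seed differences instead of treating the two samples as independent.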

Circularity Check

0 steps flagged

No significant circularity; central performance claims rest on external simulator evaluations

full rationale

The paper formulates strategy discovery as program synthesis via LLM-generated Python functions, iteratively refined by evolutionary search and external evaluations in a traffic simulator. The reported gains (20.1% delay reduction, 47.1% stop reduction versus Webster's method) are measured directly from those independent simulator runs on a signalized intersection, rather than being defined by or fitted to the discovery process itself. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the derivation chain; the empirical results remain falsifiable against the external benchmark and do not reduce to quantities constructed from the method's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the external traffic simulator providing a faithful proxy for real intersections and on the evolutionary search locating strategies that are not merely overfit to the single test case.

axioms (1)
  • domain assumption: The traffic simulator accurately models real-world intersection behavior under the tested demand patterns.
    All reported performance numbers rest on external evaluations inside this simulator.

pith-pipeline@v0.9.0 · 5786 in / 1236 out tokens · 35489 ms · 2026-05-18T19:09:57.488512+00:00 · methodology


Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 4 internal anchors

  1. [1] H. Wei, G. Zheng, V. Gayah, and Z. Li, “A survey on traffic signal control methods,” arXiv preprint arXiv:1904.08117, 2019.
  2. [2] L. Wang, Z. Ma, C. Dong, and H. Wang, “Human-centric multimodal deep (hmd) traffic signal control,” IET Intelligent Transport Systems, vol. 17, no. 4, pp. 744–753, 2023.
  3. [3] M. Movahedi and J. Choi, “The crossroads of llm and traffic control: A study on large language models in adaptive traffic signal control,” IEEE Transactions on Intelligent Transportation Systems, 2024.
  4. [4] S. Lai, Z. Xu, W. Zhang, H. Liu, and H. Xiong, “Llmlight: Large language models as traffic signal control agents,” in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, 2025, pp. 2335–2346.
  5. [5] W. Zhou, Y. Wang, M. Liu, T. Liu, P. Zhang, and Z. Ma, “Traffic signal phase and timing estimation using trajectory data from radar vision integrated camera,” IEEE Transactions on Intelligent Transportation Systems, vol. 25, no. 11, pp. 18 279–18 291, Nov 2024.
  6. [6] Y. Wang, M. Zhou, G. Huang, R. Zhuo, C. Yi, and Z. Ma, “Chat2spat: A large language model based tool for automating traffic signal control plan management,” arXiv preprint arXiv:2507.05283, 2025.
  7. [7] F. V. Webster, “Traffic signal settings,” Tech. Rep., 1958.
  8. [8] M. Fellendorf, “Vissim: A microscopic simulation tool to evaluate actuated signal control including bus priority,” in 64th Institute of transportation engineers annual meeting, vol. 32. Springer Berlin/Heidelberg, Germany, 1994, pp. 1–9.
  9. [9] Y. Wang, X. Yang, H. Liang, and Y. Liu, “A review of the self-adaptive traffic signal control system based on future traffic environment,” Journal of Advanced Transportation, vol. 2018, no. 1, p. 1096123, 2018.
  10. [10] N. H. Gartner, S. F. Assman, F. Lasaga, and D. L. Hou, “A multi-band approach to arterial traffic signal optimization,” Transportation Research Part B: Methodological, vol. 25, no. 1, pp. 55–74, 1991.
  11. [11] H. Wang and X. Peng, “Coordinated control model for oversaturated arterial intersections,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 12, pp. 24 157–24 175, 2022.
  12. [12] P. Hunt, D. Robertson, R. Bretherton, and M. C. Royle, “The scoot on-line traffic signal optimisation technique,” Traffic Engineering & Control, vol. 23, no. 4, 1982.
  13. [13] T. Chu, J. Wang, L. Codecà, and Z. Li, “Multi-agent deep reinforcement learning for large-scale traffic signal control,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 3, pp. 1086–1095, 2019.
  14. [14] L. Koch, T. Brinkmann, M. Wegener, K. Badalian, and J. Andert, “Adaptive traffic light control with deep reinforcement learning: An evaluation of traffic flow and energy consumption,” IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 12, pp. 15 066–15 076, 2023.
  15. [15] X. Peng, S. Chen, H. Gao, H. Wang, and H. M. Zhang, “Combat urban congestion via collaboration: Heterogeneous gnn-based marl for coordinated platooning and traffic signal control,” IEEE Transactions on Intelligent Transportation Systems, 2025.
  16. [16] Y. Gong, M. Abdel-Aty, J. Yuan, and Q. Cai, “Multi-objective reinforcement learning approach for improving safety at intersections with adaptive traffic signal control,” Accident Analysis & Prevention, vol. 144, p. 105655, 2020.
  17. [17] Z. Ma, L. Wang, Z. Qin, and Y. Ling, “Large language models for urban transportation: Basics, methods, and applications,” in Mobility Patterns, Big Data and Transportation Analytics, 2nd ed. Elsevier, 2025.
  18. [18] L. Wang, P. Duan, Z. He, C. Lyu, X. Chen, N. Zheng, L. Yao, and Z. Ma, “Ai-driven day-to-day route choice,” arXiv preprint arXiv:2412.03338, 2024.
  19. [19] Z. Qin, L. Wang, F. C. Pereira, and Z. Ma, “A foundational individual mobility prediction model based on open-source large language models,” arXiv preprint arXiv:2503.16553, 2025.
  20. [20] Z. Qin, P. Zhang, L. Wang, and Z. Ma, “Lingotrip: Spatiotemporal context prompt driven large language model for individual trip prediction,” Journal of Public Transportation, vol. 27, p. 100117, 2025.
  21. [21] A. Novikov, N. Vũ, M. Eisenberger, E. Dupont, P.-S. Huang, A. Z. Wagner, S. Shirobokov, B. Kozlovskii, F. J. Ruiz, A. Mehrabian et al., “AlphaEvolve: A coding agent for scientific and algorithmic discovery,” arXiv preprint arXiv:2506.13131, 2025.
  22. [22] A. Sharma, “Openevolve: an open-source evolutionary coding agent,”
  23. [23] [Online]. Available: https://github.com/codelion/openevolve
  24. [24] P. A. Lopez, M. Behrisch, L. Bieker-Walz, J. Erdmann, Y.-P. Flötteröd, R. Hilbrich, L. Lücken, J. Rummel, P. Wagner, and E. Wießner, “Microscopic traffic simulation using sumo,” in 2018 21st international conference on intelligent transportation systems (ITSC). IEEE, 2018, pp. 2575–2582.
  25. [25] J.-B. Mouret and J. Clune, “Illuminating search spaces by mapping elites,” arXiv preprint arXiv:1504.04909, 2015.
  26. [26] A. Liu, B. Feng, B. Xue, B. Wang, B. Wu, C. Lu, C. Zhao, C. Deng, C. Zhang, C. Ruan et al., “DeepSeek-V3 technical report,” arXiv preprint arXiv:2412.19437, 2024.
  27. [27] D. Guo, D. Yang, H. Zhang, J. Song, R. Zhang, R. Xu, Q. Zhu, S. Ma, P. Wang, X. Bi et al., “DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning,” arXiv preprint arXiv:2501.12948, 2025.

APPENDIX A. Detailed Description of Framework Modules
This appendix provides extended details of the four modules in the proposed EvolveSi...