Traffic-R1: Reinforced LLMs bring human-like reasoning to traffic signal control systems

Xingchen Zou, Yuhao Yang, Zheng Chen, Xixuan Hao, Yiqi Chen, Chao Huang, Yuxuan Liang · 2025 · arXiv 2508.02344

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

CuraLight: Debate-Guided Data Curation for LLM-Centered Traffic Signal Control

cs.AI · 2026-04-07 · unverdicted · novelty 7.0

CuraLight uses RL-generated trajectories and multi-LLM debate to curate training data for an LLM traffic-signal controller, yielding 5-7% gains in travel time, queue length, and waiting time over baselines in SUMO simulations of real networks.

SignalClaw: LLM-Guided Evolutionary Synthesis of Interpretable Traffic Signal Control Skills

cs.AI · 2026-04-07 · unverdicted · novelty 7.0

SignalClaw uses LLM-guided evolution to synthesize interpretable, composable traffic signal control skills that match top baselines on routine SUMO scenarios and outperform them on emergency and transit events, while remaining editable by engineers.

ReRec: Reasoning-Augmented LLM-based Recommendation Assistant via Reinforcement Fine-tuning

cs.IR · 2026-04-09 · unverdicted · novelty 6.0

ReRec uses reinforcement fine-tuning with dual-graph reward shaping, reasoning-aware advantage estimation, and online curriculum scheduling to improve LLM reasoning and performance in recommendation tasks.

citing papers explorer

CuraLight: Debate-Guided Data Curation for LLM-Centered Traffic Signal Control · cs.AI · 2026-04-07 · unverdicted · none · ref 20
SignalClaw: LLM-Guided Evolutionary Synthesis of Interpretable Traffic Signal Control Skills · cs.AI · 2026-04-07 · unverdicted · none · ref 14
ReRec: Reasoning-Augmented LLM-based Recommendation Assistant via Reinforcement Fine-tuning · cs.IR · 2026-04-09 · unverdicted · none · ref 65
