A simulation environment and reinforcement learning method for waste reduction. arXiv preprint arXiv:2205.15455, 2022
1 Pith paper cites this work.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
Smart Transportation Without Neurons -- Fair Metro Network Expansion with Tabular Reinforcement Learning
Tabular RL on a Non-Markovian Rewards Decision Process formulation matches deep RL performance on real metro expansion in Xi'an and Amsterdam while cutting episodes by 18x and carbon emissions by 12x on average.