Attentive Crowd Flow Machines
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
Vision-LLMs for Spatiotemporal Traffic Forecasting
ST-Vision-LLM reframes spatiotemporal traffic forecasting as vision-language fusion, using visual encoders on traffic grids and efficient numerical tokenization to achieve 15.6% better long-term accuracy and 30% gains in few-shot cross-domain settings.