To each route its own ETA: A generative modeling framework for ETA prediction

Charul; Pravesh Biyani

arXiv: 1906.09925 · v1 · pith:KDS247HLnew · submitted 2019-06-24 · 💻 cs.LG · eess.SP · stat.ML

Pith reviewed 2026-05-25 17:12 UTC · model grok-4.3

keywords: ETA prediction · generative model · bus transit · deep learning · real-time updates · public transportation · probability distribution · route-specific modeling

The pith

A generative model trained on one bus route's historical data learns trip time distributions and updates ETAs using real-time trip information.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that a deep generative model, built solely from the historical records of a single bus route, can capture the full probability distribution of arrival times and then condition updates on partial trip data as the journey unfolds. This matters for cities where large cab-style GPS collections do not exist and where bus data is often sparse, noisy, or incomplete. The approach is presented as self-contained, requiring no external sources, so any transit agency could deploy it per route. A sympathetic reader would therefore expect improved ETA reliability in data-scarce public-transit settings without new infrastructure.

Core claim

We train a deep-learning-based generative model that learns the probability distribution of ETA data across trips and, conditional on the current trip information, updates the ETA information on the go. Our plug-and-play model not only captures the non-linearity of the task well but can also be used by any transit agency without any other external data source. Experiments run over three routes, with data collected in the city of Delhi, illustrate the promise of our approach.

What carries the argument

A deep generative model that learns the probability distribution of ETA values for one route and conditions successive updates on observed trip progress.
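
One way to write this down, following the Figure 1 caption: the route's travel times form a $T \times K$ matrix $X$ whose entries are generated row by row, each conditioned on the entries before it. The first line below is that chain-rule factorisation. The second line is only a sketch of how an updated ETA could be read off such a model; the symbols $x_{t,j}$, $k$, and $\widehat{\mathrm{ETA}}_t^{(k)}$ are notation assumed here, not taken from the paper.

```latex
% Chain-rule factorisation over the T x K travel-time matrix X, read row by row
p(X) = \prod_{i=1}^{TK} p\left(x_i \mid x_1, \dots, x_{i-1}\right)

% Assumed notation: x_{t,j} is the travel time of segment j on trip t,
% and k segments of trip t have been observed so far.
\widehat{\mathrm{ETA}}_t^{(k)} = \sum_{j=1}^{k} x_{t,j}
  + \mathbb{E}\left[\sum_{j=k+1}^{K} x_{t,j} \,\middle|\, x_{t,1}, \dots, x_{t,k}, \text{earlier trips}\right]
```

Under this reading, the observed part of the trip is exact and only the remaining segments carry model uncertainty, so the estimate should tighten as $k$ grows.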

If this is right

The model directly captures non-linear patterns in travel times without hand-crafted features.
Any transit agency can apply the same pipeline using only its own route logs.
Real-time conditioning allows ETA revisions at any point during a trip.
The framework tolerates the typical imperfections found in operational bus data.
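
The real-time update in the list above can be illustrated with a small sketch. It is not the paper's model: a per-segment historical median stands in for the generative model's prediction, and the function names, the `None` encoding for missing values, and the toy numbers are all assumptions made for illustration.

```python
from statistics import median
from typing import List, Optional, Sequence

# One trip = per-segment travel times in seconds; None marks a missing reading.
Trip = Sequence[Optional[float]]


def segment_medians(history: Sequence[Trip]) -> List[float]:
    """Per-segment median over historical trips, skipping missing values.

    Stand-in predictor (assumption): the paper uses a deep generative model here.
    """
    n_segments = len(history[0])
    medians = []
    for j in range(n_segments):
        observed = [trip[j] for trip in history if trip[j] is not None]
        if not observed:
            raise ValueError(f"segment {j} has no observations")
        medians.append(median(observed))
    return medians


def updated_eta(observed_prefix: Sequence[float], predicted: Sequence[float]) -> float:
    """Elapsed time on the observed segments plus predicted time for the rest."""
    k = len(observed_prefix)
    return sum(observed_prefix) + sum(predicted[k:])


# Toy history: three trips, four segments, one missing value, one outlier (400).
history = [
    [60, 120, None, 90],
    [70, 110, 100, 95],
    [65, 400, 105, 85],
]
predicted = segment_medians(history)
eta_at_start = updated_eta([], predicted)             # nothing observed yet
eta_after_two = updated_eta([80, 150], predicted)     # two segments observed
```

The median is used only because it is insensitive to the single outlier in the toy data; a learned conditional distribution would replace `predicted[k:]` with a prediction that depends on the observed prefix.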

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Agencies could maintain separate models per route rather than building one city-wide system.
The same generative structure might be tested on other sequential prediction tasks that have limited per-entity history.
If the learned distributions prove stable, agencies could simulate schedule changes by sampling from the model.

Load-bearing premise

Historical data collected on a single bus route is sufficient to train a generative model that generalizes to future trips even when the data contains outliers, anomalies, and missing values.

What would settle it

On held-out trips from the same Delhi routes, if the model's updated ETA predictions show larger average error than a simple historical average or linear regression baseline, the claim would not hold.
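
The test described above can be sketched as a mean-absolute-error comparison against a historical-average baseline. The function names and the numbers are hypothetical and purely illustrative; they are not results from the paper.

```python
from typing import List, Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual trip times."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def historical_average_baseline(train_totals: Sequence[float], n_test: int) -> List[float]:
    """Predict the training-set mean trip time for every held-out trip."""
    mean_total = sum(train_totals) / len(train_totals)
    return [mean_total] * n_test


def claim_survives(model_preds: Sequence[float],
                   baseline_preds: Sequence[float],
                   actual: Sequence[float]) -> bool:
    """True unless the model's error is larger than the baseline's."""
    return mean_absolute_error(model_preds, actual) <= mean_absolute_error(baseline_preds, actual)


# Illustrative numbers only (seconds per trip).
train_totals = [1800, 1900, 2000]
held_out_actual = [1850, 2100]
baseline_preds = historical_average_baseline(train_totals, len(held_out_actual))
model_preds = [1880, 2040]  # hypothetical model output
verdict = claim_survives(model_preds, baseline_preds, held_out_actual)
```

A fuller check would repeat this at several points along each trip, since the claim concerns updated ETAs rather than only the estimate at departure.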

Figures

Figures reproduced from arXiv: 1906.09925 by Charul, Pravesh Biyani.

**Figure 1.** Matrix $X$ of points where the probability distribution of one point depends on the observed values of the previous points. The generation proceeds row by row and pixel by pixel. Similarly, we can determine the probability of pixel $x_i$ conditioned on $x_{i-1}, \dots, x_1$. Likewise, the travel times of a bus route can be seen as an image of size $T \times K$, with rows as trips and columns denoting the travel times between con…

**Figure 2.** Example dataset with 4 elements and inferencing the…

**Figure 5.** Inferencing the Travel Time. IV. RESULTS. We now discuss the performance of the proposed maskCNN algorithm for the ETA estimation task for a bus route network. We compare our technique with the state-of-the-art approaches like time series prediction, deep learning, as well as the matrix completion approaches below: 1) ARIMA (Autoregressive Integrated Moving Average) [7]. 2) LSTM (Long Short Term Memory) [19…

**Figure 4.** Different masks used in mask-CNN. F. ETA prediction using the trained model. Once the model is trained using the historic travel time data, we are now ready to provide ETA estimation for every trip in the route…

**Figure 7.** Routes used for data collection. B. Training Parameters. We employ a variety of masks based on the dependencies we want to capture in the dataset. We use three different kinds of mask in our evaluation (mask A1 and B1 for mask 1, mask A2 and B2 for mask 2, mask A3 and B3 for mask 3). The masks 1, 2 and 3 for filter dimension 5 are shown in…

**Figure 8.** Comparison of Masked CNN for a bus route
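
The captions for Figures 4 and 7 mention A and B masks at filter dimension 5 without the text here showing them. As a hedged illustration, the sketch below builds the standard type A and type B masks from the PixelRNN/PixelCNN paper cited as [2]; the paper's own masks 1, 2 and 3 may differ, and the function name `causal_mask` is an assumption.

```python
from typing import List


def causal_mask(size: int, mask_type: str) -> List[List[int]]:
    """Build a size x size PixelCNN-style mask (1 = weight kept, 0 = zeroed).

    Type A hides the centre position and everything after it in raster order.
    Type B is the same but keeps the centre position.
    Assumption: this is the standard construction from [2], not necessarily
    the exact masks used in the paper under review.
    """
    if size % 2 == 0:
        raise ValueError("filter size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    centre = size // 2
    mask = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            above = r < centre                      # rows above: earlier trips
            left_in_row = r == centre and c < centre  # earlier segments, same trip
            is_centre = r == centre and c == centre
            if above or left_in_row or (is_centre and mask_type == "B"):
                mask[r][c] = 1
    return mask


mask_a = causal_mask(5, "A")
mask_b = causal_mask(5, "B")
for row in mask_a:
    print(row)
```

With rows read as trips and columns as segments, such a mask lets a filter see earlier trips and the earlier segments of the current trip, which matches the row-by-row generation order described in the Figure 1 caption.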

Original abstract

Accurate expected time of arrival (ETA) information is crucial in maintaining the quality of service of public transit. Recent advances in artificial intelligence (AI) have led to more effective models for ETA estimation that rely heavily on large GPS datasets. More importantly, these are mainly cab-based datasets, which may not be fit for bus-based public transport. Consequently, the latest methods may not be applicable for ETA estimation in cities that lack large training datasets. On the other hand, the ETA estimation problem in many cities needs to be solved in the absence of big datasets, using data that also contains outliers, anomalies and may be incomplete. This work presents a simple but robust model for ETA estimation for a bus route that relies only on the historical data of the particular route. We propose a system that generates ETA information for a trip and updates it as the trip progresses based on the real-time information. We train a deep-learning-based generative model that learns the probability distribution of ETA data across trips and, conditional on the current trip information, updates the ETA information on the go. Our plug-and-play model not only captures the non-linearity of the task well but can also be used by any transit agency without any other external data source. Experiments run over three routes, with data collected in the city of Delhi, illustrate the promise of our approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches a generative model for single-route bus ETA from historical data alone, but the abstract supplies no architecture, metrics, or baselines.

The main takeaway is that this work trains a deep generative model on historical data from one bus route to produce and update ETAs in real time without pulling in large external datasets. That setup directly targets cities where cab-style GPS corpora do not exist and where bus data can be sparse or noisy. The Delhi experiments on three routes show the authors tried to keep the claim grounded in actual transit operations rather than synthetic benchmarks. The generative framing is a reasonable match for the task because it can in principle output distributions and condition on partial trip information as the bus moves. Those are the parts that feel useful on first read.

The soft spots are more substantial. The abstract never names the generative architecture, the loss, how missing values or outliers are handled, or any quantitative comparison to simple baselines such as route-average travel times or basic regression. Without those pieces it is hard to judge whether the model actually improves on what a transit agency already does or whether the learned distribution simply reproduces the training data. The central assumption that single-route history is enough to generalize under anomalies therefore sits on thin evidence. If the full paper contains the missing implementation details and controlled results, the contribution becomes clearer; on the abstract alone the work reads as a preliminary proposal rather than a finished method.

This is mainly for people already working on applied time-series models in transportation or for agencies that need lightweight, route-specific tools. A reader looking for a ready-to-deploy system will not get enough from the current version. I would still send it to referees so the authors can supply the architecture, training procedure, and numbers that are currently absent.

Referee Report

3 major / 0 minor

Summary. The manuscript proposes a deep generative model trained solely on historical ETA data from individual bus routes to learn the probability distribution over trip ETAs; conditional on real-time trip information, the model updates ETA predictions on the go. It claims this route-specific approach captures non-linearity, accommodates anomalies and missing values, requires no external data sources, and is demonstrated via experiments on three Delhi routes.

Significance. If the generative model can be shown to produce usable conditional distributions from limited single-route corpora despite noise, the work would offer a practical, low-data alternative to cab-centric ETA methods for public-transit agencies.

major comments (3)

[Abstract] The central claim that a deep generative model 'learns the probability distribution of ETA data across trips' and 'updates the ETA information on the go' is unsupported because the abstract (and, per the provided description, the manuscript) supplies no architecture, loss function, training objective, or mechanism for conditioning or handling incomplete observations.
[Abstract] The assertion that the model 'captures the non-linearity of the task well' and handles 'outliers, anomalies and may be incomplete' data is load-bearing yet unaccompanied by any quantitative metrics, baseline comparisons, ablation on anomaly injection, or cross-validation results on the three Delhi routes.
[Abstract] The weakest assumption (that historical data collected on a single bus route suffices to train a generative model that generalizes to future trips) is presented without any description of data preprocessing, outlier modeling, or robustness experiments, directly undermining the claim of applicability in data-scarce settings.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed feedback on the abstract. We agree that the abstract should more explicitly reference the model's technical elements and experimental support from the manuscript. We will revise the abstract to address these concerns while preserving its brevity. Point-by-point responses follow.

Referee: [Abstract] The central claim that a deep generative model 'learns the probability distribution of ETA data across trips' and 'updates the ETA information on the go' is unsupported because the abstract (and, per the provided description, the manuscript) supplies no architecture, loss function, training objective, or mechanism for conditioning or handling incomplete observations.

Authors: The manuscript body details the deep generative model architecture, training objective, conditioning mechanism on real-time progress, and handling of incomplete observations. To make these elements evident from the abstract alone, we will revise the abstract to concisely summarize the generative modeling approach and conditioning process.
revision: yes

Referee: [Abstract] The assertion that the model 'captures the non-linearity of the task well' and handles 'outliers, anomalies and may be incomplete' data is load-bearing yet unaccompanied by any quantitative metrics, baseline comparisons, ablation on anomaly injection, or cross-validation results on the three Delhi routes.

Authors: The experiments section reports quantitative metrics, baseline comparisons, and results across the three Delhi routes, including robustness aspects. We will revise the abstract to include key performance indicators and note the experimental validation on real data.
revision: yes

Referee: [Abstract] The weakest assumption (that historical data collected on a single bus route suffices to train a generative model that generalizes to future trips) is presented without any description of data preprocessing, outlier modeling, or robustness experiments, directly undermining the claim of applicability in data-scarce settings.

Authors: The manuscript describes data collection from individual routes, preprocessing steps, and robustness considerations in the data and experiments sections. We will update the abstract to briefly reference the route-specific historical data and preprocessing approach.
revision: yes

Circularity Check

0 steps flagged

No significant circularity; standard generative model training on route data

The paper's core claim is that a deep generative model can be trained on historical single-route bus data to learn ETA distributions and perform conditional updates. This is a conventional ML setup: fit parameters to observed trip data, then generate predictions for new or ongoing trips. No equations, self-citations, or uniqueness theorems are quoted that would make any prediction equivalent to the training inputs by construction. The approach is presented as plug-and-play and empirically tested on three Delhi routes, remaining externally falsifiable. No load-bearing self-referential steps appear in the provided abstract or description.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on the assumption that a deep generative model trained solely on limited route-specific historical data can produce usable conditional distributions despite data quality issues.

free parameters (1)

deep generative model parameters
Neural network weights and latent variables are fitted to the historical ETA observations for each route.

axioms (2)

domain assumption: Historical trips on a given route are statistically representative of future trips on the same route.
Invoked when the model is trained only on past route data and expected to generalize.
domain assumption: Generative models can learn useful distributions from incomplete and anomalous time-series data without external covariates.
Stated as a strength of the approach in the abstract.

pith-pipeline@v0.9.0 · 5769 in / 1426 out tokens · 41011 ms · 2026-05-25T17:12:37.081540+00:00 · methodology

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 3 internal anchors

[1] C. Brakewood and K. Watkins, “A literature review of the passenger benefits of real-time transit information,” Transport Reviews, pp. 1–30, 2018.
[2] A. v. d. Oord, N. Kalchbrenner, and K. Kavukcuoglu, “Pixel recurrent neural networks,” arXiv preprint arXiv:1601.06759, 2016.
[3] R. Sevlian and R. Rajagopal, “Travel time estimation using floating car data,” arXiv preprint arXiv:1012.4249, 2010.
[4] C. De Fabritiis, R. Ragona, and G. Valenti, “Traffic estimation and prediction based on real time floating car data,” in Intelligent Transportation Systems, 2008. ITSC 2008. 11th International IEEE Conference on. IEEE, 2008, pp. 197–203.
[5] M. T. Asif, J. Dauwels, C. Y. Goh, A. Oran, E. Fathi, M. Xu, M. M. Dhanya, N. Mitrovic, and P. Jaillet, “Spatiotemporal patterns in large-scale traffic speed prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 2, pp. 794–804, 2014.
[6] M. Rahmani, E. Jenelius, and H. N. Koutsopoulos, “Route travel time estimation using low-frequency floating car data,” in Intelligent Transportation Systems-(ITSC), 2013 16th International IEEE Conference on. IEEE, 2013, pp. 2292–2297.
[7] D. Billings and J.-S. Yang, “Application of the arima models to urban roadway travel time prediction-a case study,” in Systems, Man and Cybernetics, 2006. SMC’06. IEEE International Conference on, vol. 3. IEEE, 2006, pp. 2529–2534.
[8] B. S. Westgate, D. B. Woodard, D. S. Matteson, S. G. Henderson et al., “Travel time estimation for ambulances using bayesian data augmentation,” The Annals of Applied Statistics, vol. 7, no. 2, pp. 1139–1161, 2013.
[9] A. Hofleitner, R. Herring, P. Abbeel, and A. Bayen, “Learning the dynamics of arterial traffic from probe data using a dynamic bayesian network,” IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 4, pp. 1679–1693, 2012.
[10] B. Pan, U. Demiryurek, and C. Shahabi, “Utilizing real-world transportation data for accurate traffic prediction,” in Data Mining (ICDM), 2012 IEEE 12th International Conference on. IEEE, 2012, pp. 595–604.
[11] Y. Lv, Y. Duan, W. Kang, Z. Li, F.-Y. Wang et al., “Traffic flow prediction with big data: A deep learning approach,” IEEE Trans. Intelligent Transportation Systems, vol. 16, no. 2, pp. 865–873, 2015.
[12] J. Rice and E. Van Zwet, “A simple and effective method for predicting travel times on freeways,” in Intelligent Transportation Systems, 2001. Proceedings. 2001 IEEE. IEEE, 2001, pp. 227–232.
[13] J. Myung, D.-K. Kim, S.-Y. Kho, and C.-H. Park, “Travel time prediction using k nearest neighbor method with combined data from vehicle detector system and automatic toll collection system,” Transportation Research Record, vol. 2256, no. 1, pp. 51–59, 2011.
[14] S. I.-J. Chien and C. M. Kuchipudi, “Dynamic travel time prediction with real-time and historic data,” Journal of transportation engineering, vol. 129, no. 6, pp. 608–616, 2003.
[15] C.-H. Wu, C.-C. Wei, D.-C. Su, M.-H. Chang, and J.-M. Ho, “Travel time prediction with support vector regression,” in Intelligent Transportation Systems, 2003. Proceedings. 2003 IEEE, vol. 2. IEEE, 2003, pp. 1438–1442.
[16] Y. Zhang and A. Haghani, “A gradient boosting method to improve travel time prediction,” Transportation Research Part C: Emerging Technologies, vol. 58, pp. 308–324, 2015.
[17] X. Zeng and Y. Zhang, “Development of recurrent neural network considering temporal-spatial input dynamics for freeway travel time modeling,” Computer-Aided Civil and Infrastructure Engineering, vol. 28, no. 5, pp. 359–371, 2013.
[18] J. Wang, Q. Gu, J. Wu, G. Liu, and Z. Xiong, “Traffic speed prediction and congestion source exploration: A deep learning method,” in Data Mining (ICDM), 2016 IEEE 16th International Conference on. IEEE, 2016, pp. 499–508.
[19] Y. Duan, Y. Lv, and F.-Y. Wang, “Travel time prediction with lstm neural network,” in Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016, pp. 1053–1058.
[20] B. Yang, C. Guo, and C. S. Jensen, “Travel cost inference from sparse, spatio temporally correlated time series using markov models,” Proceedings of the VLDB Endowment, vol. 6, no. 9, pp. 769–780, 2013.
[21] J. Y. Zheng Wang, Kun Fu. (2018) Learning to estimate the travel time. [Online]. Available: http://www.kdd.org/kdd2018/accepted-papers/view/learning-to-estimate-the-travel-time
[22] W.-C. Lee, W. Si, L.-J. Chen, and M. C. Chen, “Http: a new framework for bus travel time prediction based on historical trajectories,” in Proceedings of the 20th International Conference on Advances in Geographic Information Systems. ACM, 2012, pp. 279–288.
[23] H. Wang, Y.-H. Kuo, D. Kifer, and Z. Li, “A simple baseline for travel time estimation using large-scale trip data,” in Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 2016, p. 61.
[24] C. Doersch, “Tutorial on variational autoencoders,” arXiv preprint arXiv:1606.05908, 2016.
[25] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in neural information processing systems, 2014, pp. 2672–2680.
[26] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.
[27] Charul, U. Bhatt, P. Biyani, and K. Rajawat, “Online variational bayesian subspace filtering,” in Proc. of the IEEE ICASSP, May 2019.
[28] S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv preprint arXiv:1609.04747, 2016.
[29] D. Wang, J. Zhang, W. Cao, J. Li, and Y. Zheng, “When will you arrive? estimating travel time based on deep neural networks,” AAAI, 2018.
