arxiv: 1907.02786 · v1 · pith:MHEL6LMEnew · submitted 2019-07-03 · 💻 cs.LG · cs.SI

Sequence to Sequence with Attention for Influenza Prevalence Prediction using Google Trends

Pith reviewed 2026-05-25 10:07 UTC · model grok-4.3

classification 💻 cs.LG cs.SI
keywords influenza prediction · google trends · sequence to sequence · attention mechanism · time series forecasting · epidemiology · machine learning

The pith

Seq2Seq models with attention using Google Trends data predict influenza prevalence with 0.996 correlation over multiple weeks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper applies sequence-to-sequence models with an attention mechanism to forecast the number of influenza cases using Google Trends search data. It establishes that the attention component markedly improves accuracy for predictions spanning multiple weeks ahead. Google Trends inputs compensate for underreported cases in official statistics, enabling better performance than earlier methods on long-range forecasts. The resulting model reaches a Pearson correlation of 0.996 and root-mean-square error of 0.67.

Core claim

The paper establishes that incorporating an attention mechanism into a sequence-to-sequence model trained on Google Trends data allows accurate prediction of influenza-infected people over multiple weeks, achieving a Pearson correlation of 0.996 and RMSE of 0.67, outperforming prior approaches.
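For readers who want to check the two headline metrics, the following is a minimal sketch of how Pearson correlation and RMSE are computed. The function names and the sample values are illustrative assumptions and are not taken from the paper.

```python
import math


def pearson(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Hypothetical weekly ILI percentages, NOT data from the paper.
actual = [1.0, 2.0, 4.0, 3.0]
predicted = [1.5, 2.5, 4.5, 3.5]  # same shape, shifted up by 0.5

r = pearson(actual, predicted)
err = rmse(actual, predicted)
print(r, err)
```

In this toy case the prediction is offset by a constant 0.5, so the correlation is perfect while the RMSE is 0.5. Correlation alone does not reveal a systematic bias, which is one reason the two metrics are usually read together.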

What carries the argument

The sequence-to-sequence model equipped with an attention mechanism that processes Google Trends time series to forecast influenza prevalence.

If this is right

  • Attention allows the model to focus on relevant past search trends for distant forecasts.
  • Google Trends inputs reduce the impact of dark figures in official influenza statistics.
  • The approach yields state-of-the-art results with 0.996 correlation and 0.67 RMSE.
  • Prediction accuracy holds for periods beyond one month but remains limited at epidemic peaks.
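The first point above, attention focusing on relevant past inputs, can be sketched as follows. This is a simplified dot-product attention step written under stated assumptions: the scoring function, the vector sizes, and all names are hypothetical, and the paper's own attention formulation may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return attention weights and the context vector.

    Each encoder state summarizes one past week of input. The weight of a
    week grows with its dot product against the current decoder state.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical 2-dimensional hidden states for three past weeks.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

Here the first and third weeks align with the decoder state and receive equal, larger weights than the second week. The context vector, a weighted sum of the encoder states, is what the decoder would consume when producing the forecast for a given week ahead.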

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same attention-based structure could be tested on search data for other seasonal respiratory illnesses.
  • Integrating additional real-time signals might address the remaining weakness in peak timing.
  • The method points toward attention models as a general tool for epidemiological time-series tasks with sparse official counts.

Load-bearing premise

Google Trends data can compensate for unreported influenza cases and thereby improve prediction accuracy over multiple weeks.

What would settle it

A test on held-out future influenza seasons where the model's Pearson correlation drops below 0.9 would falsify the claim of state-of-the-art long-range accuracy.
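A minimal sketch of such a check, assuming records are labeled by season start year, is given below. The record layout, the function names, and the sample values are hypothetical; only the 0.9 threshold comes from the text above.

```python
def holdout_split(records, first_test_season):
    """Split weekly records chronologically by influenza season.

    Seasons before `first_test_season` go to training; that season and
    later ones are held out, so no future week leaks into training.
    """
    train = [r for r in records if r["season"] < first_test_season]
    test = [r for r in records if r["season"] >= first_test_season]
    return train, test


def claim_survives(held_out_correlation, threshold=0.9):
    """Apply the 0.9 Pearson threshold to a held-out correlation."""
    return held_out_correlation >= threshold


# Hypothetical records: season start year and unweighted ILI percentage.
records = [
    {"season": 2015, "ili": 1.2},
    {"season": 2016, "ili": 2.8},
    {"season": 2017, "ili": 3.9},
    {"season": 2018, "ili": 2.1},
]

train, test = holdout_split(records, first_test_season=2018)
print(len(train), len(test))
print(claim_survives(0.93), claim_survives(0.85))
```

Splitting by whole seasons rather than by random weeks matters for time series: a random split would let the model see weeks on both sides of each test point.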

Figures

Figures reproduced from arXiv: 1907.02786 by Akihiko Ishikawa, Kenjiro Kondo, Masashi Kimura.

Figure 1: LSTM architecture
Figure 2: Time series prediction by LSTM
Figure 5: Unweighted ILI time series split for training
Figure 6: Pearson correlation between actual data and predicted data. The x-axis represents the predicted week relative to the input ILI. The blue, orange, green, and red bars represent the baseline, simple LSTM, Seq2Seq, and Seq2Seq with attention, respectively.
Figure 7: RMSE between actual data and predicted data. The x-axis represents how far the predicted week is from the input ILI. The blue, orange, green, and red bars represent the baseline, simple LSTM, Seq2Seq, and Seq2Seq with attention, respectively. (a) California (b) Georgia (c) Illinois (e) New York
Figure 8: Predicted time series of unweighted ILI. The orange graph represents actual ILI; the other graphs represent predictions 1 to 4 weeks ahead.
read the original abstract

Early prediction of the prevalence of influenza reduces its impact. Various studies have been conducted to predict the number of influenza-infected people. However, these studies are not highly accurate especially in the distant future such as over one month. To deal with this problem, we investigate the sequence to sequence (Seq2Seq) with attention model using Google Trends data to assess and predict the number of influenza-infected people over the course of multiple weeks. Google Trends data help to compensate the dark figures including the statistics and improve the prediction accuracy. We demonstrate that the attention mechanism is highly effective to improve prediction accuracy and achieves state-of-the art results, with a Pearson correlation and root-mean-square error of 0.996 and 0.67, respectively. However, the prediction accuracy of the peak of influenza epidemic is not sufficient, and further investigation is needed to overcome this problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The paper proposes a sequence-to-sequence (Seq2Seq) model augmented with an attention mechanism that incorporates Google Trends data to forecast influenza prevalence over multiple weeks. It reports achieving a Pearson correlation of 0.996 and RMSE of 0.67, which it presents as state-of-the-art, while noting that peak-prediction accuracy remains insufficient.

Significance. If the metrics reflect genuine out-of-sample skill rather than in-sample fit, the work would demonstrate a practical way to leverage readily available search data for multi-week influenza forecasting, potentially improving early-warning systems in digital epidemiology. The explicit acknowledgment of the peak-prediction shortfall is a strength, as it identifies a concrete direction for follow-up.

major comments (3)
  1. [Abstract] Abstract: The reported Pearson correlation of 0.996 and RMSE of 0.67 are given without any mention of dataset size, train/test split, cross-validation protocol, or error bars. This absence makes it impossible to evaluate whether the numbers represent out-of-sample predictive skill or in-sample fit on historical sequences.
  2. [Abstract] Abstract: No baseline comparisons (e.g., ARIMA, plain LSTM, or non-attention Seq2Seq) are supplied, so the claim that the attention mechanism is “highly effective” and yields state-of-the-art results cannot be substantiated from the given evidence.
  3. [Abstract] Abstract: The statement that Google Trends data “help to compensate the dark figures … and improve the prediction accuracy” is presented as a conclusion without any quantitative ablation or comparison showing the incremental contribution of the Trends features over ILI data alone.
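To make the baseline request in comment 2 concrete, here is a sketch of a naive persistence baseline evaluated per forecast horizon. Persistence (predicting that the value h weeks ahead equals the latest observed value) is an assumption chosen for illustration; it is not necessarily the baseline used in the paper, and the series below is hypothetical.

```python
import math


def persistence_pairs(series, horizon):
    """Pair each persistence forecast with the value seen `horizon` weeks later.

    The forecast for week t + horizon is simply the observation at week t.
    """
    predicted = series[:-horizon]
    actual = series[horizon:]
    return actual, predicted


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Hypothetical weekly ILI percentages with a single peak.
series = [1.0, 2.0, 4.0, 7.0, 4.0, 2.0]

rmse_by_horizon = {}
for horizon in (1, 2):
    actual, predicted = persistence_pairs(series, horizon)
    rmse_by_horizon[horizon] = rmse(actual, predicted)

print(rmse_by_horizon)
```

A learned model is informative only to the extent that it beats this kind of trivial forecast at each horizon. On this toy series the persistence error grows from one to two weeks ahead, which is the pattern a multi-week model would need to improve on.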

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. The comments correctly identify areas where additional methodological context, comparisons, and evidence are needed to support the claims. We will revise the manuscript to address each point.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The reported Pearson correlation of 0.996 and RMSE of 0.67 are given without any mention of dataset size, train/test split, cross-validation protocol, or error bars. This absence makes it impossible to evaluate whether the numbers represent out-of-sample predictive skill or in-sample fit on historical sequences.

    Authors: We agree that the abstract lacks sufficient context on the evaluation setup. The full manuscript describes the data and protocol in the Methods section, but we will revise the abstract to briefly specify the dataset size, train/test split, cross-validation approach, and confirm that the metrics are out-of-sample. Error bars from cross-validation will be added where feasible.
    revision: yes

  2. Referee: [Abstract] Abstract: No baseline comparisons (e.g., ARIMA, plain LSTM, or non-attention Seq2Seq) are supplied, so the claim that the attention mechanism is “highly effective” and yields state-of-the-art results cannot be substantiated from the given evidence.

    Authors: The absence of explicit baselines in the abstract weakens the claim. We will add comparisons against ARIMA, plain LSTM, and non-attention Seq2Seq in the Experiments section of the revised manuscript and update the abstract to reference these results supporting the attention mechanism's contribution.
    revision: yes

  3. Referee: [Abstract] Abstract: The statement that Google Trends data “help to compensate the dark figures … and improve the prediction accuracy” is presented as a conclusion without any quantitative ablation or comparison showing the incremental contribution of the Trends features over ILI data alone.

    Authors: We acknowledge that the abstract presents this as a conclusion without supporting ablation evidence. In the revision, we will include a quantitative ablation study comparing performance with and without Google Trends features and revise the abstract to cite this evidence.
    revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper reports an empirical Seq2Seq+attention model trained on Google Trends and influenza statistics to produce multi-week forecasts. Reported metrics (Pearson 0.996, RMSE 0.67) are reported as standard held-out performance numbers from a fitted neural network; no first-principles derivation, uniqueness theorem, or ansatz is invoked. No equations, self-citations, or parameter-fitting steps are shown that reduce the central claim to its own inputs by construction. The claim is therefore checked against external test data rather than derived from its own inputs, and it receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities beyond the implicit assumption that search trends proxy unreported cases.

axioms (1)
  • domain assumption: Google Trends volumes serve as a reliable proxy for unreported influenza cases.
    Stated directly in the abstract as compensating for dark figures.

pith-pipeline@v0.9.0 · 5677 in / 1006 out tokens · 43795 ms · 2026-05-25T10:07:00.049130+00:00 · methodology

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages · 1 internal anchor

  1. [1]

    used the Google Flu Trends with linear regression models. They found that Google Flu Trends data have a lower RMSE as a predictor variable and the lowest value is achieved when all other variables are included in the model in the forecasting experiments for the first five weeks of 2013 (with RMSE = 57.61). Google Flu Trends data are useful to predict infl...

  2. [2]

    reported the development of a simple method that can be used for real-time epidemic forecasting with a discrete time stochastic model. He stated that real-time forecasting of epidemics has not been widely studied. In this study, a discrete time stochastic model accounting for demographic stochasticity and conditional measurement was developed. This model ...

  3. [3]

    RESULTS AND DISCUSSION In this section, we illustrate the experimental results of the proposed models and discuss them. 4.1 Experimental Conditions We used the unweighted percentage of the people infected with influenza-like illnesses (unweighted ILI) disclosed by the CDC as the number of people infected by influenza. We collected the unweighted ILI of si...

  4. [4]

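    Alessa, A. and Faezipour, M. 2018, A review of influenza detection and prediction through social networking sites, Theoretical Biology & Medical Modelling, 15(1), 2. https://doi.org/10.1186/s12976-017-0074-5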

  5. [5]

    Carneiro, H. A. and Mylonakis, E. 2009, Google Trends: A Web-Based Tool for Real-Time Surveillance of Disease Outbreaks, Clinical Infectious Diseases, Volume 49, Issue 10, 15 November, pp. 1557–1564. https://doi.org/10.1086/630200

  6. [6]

    Lazer, D., Kennedy, R., King, G., and Vespignani, A. 2014, The parable of Google flu: traps in big data analysis, Science, 343(6176), pp. 1203–1205. https://doi.org/10.1126/science.1248506

  7. [7]

    Araz, O. M., Bentley, D., and Muelleman, R. L. 2014, Using Google Flu Trends data in forecasting influenza-like-illness related ED visits in Omaha, Nebraska, The American Journal of Emergency Medicine, Volume 32, Issue 9, pp. 1016–1023. https://doi.org/10.1016/j.ajem.2014.05.052

  8. [8]

    McIver, D. J., and Brownstein, J. S. 2014, Wikipedia Usage Estimates Prevalence of Influenza-Like Illness in the United States in Near Real-Time, PLoS Comput Biol 10(4), e1003581

  9. [9]

    Nishiura, H. 2011, Real-time forecasting of an epidemic using a discrete time stochastic model: a case study of pandemic influenza (H1N1-2009), BioMedical Engineering OnLine 2011, 10:15. https://doi.org/10.1186/1475-925X-10-15

  10. [10]

    Dugas, A. F., Jalalpour, M., Gel, Y., Levin, S., Torcaso, F., Igusa, T. and Rothman, R. E. 2013, Influenza Forecasting with Google Flu Trends, PLoS ONE 8(2): e56176. https://doi.org/10.1371/journal.pone.0056176

  11. [11]

    Liu, L., Han, M., Zhou, Y. and Wang, Y. 2018, LSTM Recurrent Neural Networks for Influenza Trends Prediction, Bioinformatics Research and Applications. ISBRA

  12. [12]

    Lecture Notes in Computer Science, vol 10847. Springer, Cham. https://doi.org/10.1007/978-3-319-94968-0_25

  13. [13]

    Hochreiter, S. and Schmidhuber, J. 1997, Long short-term memory, Neural Computation 9(8)

  14. [14]

    Elman, J. L. 1990, Finding Structure in Time, Cognitive Science, 14(2), pp. 179–211. doi:10.1016/0364-0213(90)90002-E

  15. [15]

    Sutskever, I., Vinyals, O. and Le, Q.V. 2014, Sequence to Sequence Learning with Neural Networks, Advances in Neural Information Processing Systems 27 (NIPS 2014)

  16. [16]

    Bahdanau, D., Cho, K. and Bengio, Y., Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473v7 [cs.CL] 19 May

  17. [17]

    Williams, R. J. and Zipser, D. 1989, A learning algorithm for continually running fully recurrent neural networks. Neural computation, 1(2), 270–280

  18. [18]

    Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. and Salakhutdinov, R. 2014, Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Journal of Machine Learning Research, vol. 15, pp. 1929–1958. http://jmlr.org/papers/v15/srivastava14a.html

  19. [19]

    Lara-Ramírez, E. E., Rodriguez-Perez, M. A., Perez-Rodriguez, M. A. and Adeleke, A. 2013, Time Series Analysis of Onchocerciasis Data from Mexico: A Trend towards Elimination, PLoS Negl Trop Dis, vol. 7, p. e2033