pith. machine review for the scientific record.

arxiv: 1704.02971 · v4 · submitted 2017-04-07 · 💻 cs.LG · stat.ML

Recognition: unknown

A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction

Authors on Pith no claims yet
classification 💻 cs.LG stat.ML
keywords series · time · attention · driving · dual-stage · relevant · attention-based · been
0 comments
read the original abstract

The nonlinear autoregressive exogenous (NARX) model, which predicts the current value of a time series based upon its previous values as well as the current and past values of multiple driving (exogenous) series, has been studied for decades. Despite the fact that various NARX models have been developed, few of them can capture the long-term temporal dependencies appropriately and select the relevant driving series to make predictions. In this paper, we propose a dual-stage attention-based recurrent neural network (DA-RNN) to address these two issues. In the first stage, we introduce an input attention mechanism to adaptively extract relevant driving series (a.k.a., input features) at each time step by referring to the previous encoder hidden state. In the second stage, we use a temporal attention mechanism to select relevant encoder hidden states across all time steps. With this dual-stage attention scheme, our model can not only make predictions effectively, but can also be easily interpreted. Thorough empirical studies based upon the SML 2010 dataset and the NASDAQ 100 Stock dataset demonstrate that the DA-RNN can outperform state-of-the-art methods for time series prediction.
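The two attention stages described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the LSTM encoder and decoder are replaced with zero/random stand-in states, and all parameter names (`W_e`, `U_e`, `v_e`, `W_d`, `U_d`, `v_d`) and dimensions are illustrative. The sketch shows only the scoring-and-softmax pattern: stage 1 weights the n driving series at one time step; stage 2 weights the T encoder hidden states to form a context vector.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

rng = np.random.default_rng(0)
n, T, m, p = 5, 10, 8, 8  # driving series, window length, encoder/decoder sizes

X = rng.standard_normal((n, T))  # row k: the k-th driving series over the window

# --- Stage 1: input attention (encoder side) ---
# Score each driving series against the previous encoder hidden and cell
# states, then softmax across the n series.
W_e = rng.standard_normal((T, 2 * m))
U_e = rng.standard_normal((T, T))
v_e = rng.standard_normal(T)
h_prev, s_prev = np.zeros(m), np.zeros(m)      # stand-in encoder LSTM states
hs = np.concatenate([h_prev, s_prev])
e = np.array([v_e @ np.tanh(W_e @ hs + U_e @ X[k]) for k in range(n)])
alpha = softmax(e)                             # one weight per driving series
t = 0
x_tilde_t = alpha * X[:, t]                    # reweighted input at time step t

# --- Stage 2: temporal attention (decoder side) ---
# Score every encoder hidden state against the previous decoder states,
# softmax across time steps, and form a context vector.
H = rng.standard_normal((T, m))                # stand-in encoder hidden states
W_d = rng.standard_normal((m, 2 * p))
U_d = rng.standard_normal((m, m))
v_d = rng.standard_normal(m)
d_prev, sd_prev = np.zeros(p), np.zeros(p)     # stand-in decoder LSTM states
ds = np.concatenate([d_prev, sd_prev])
l = np.array([v_d @ np.tanh(W_d @ ds + U_d @ H[i]) for i in range(T)])
beta = softmax(l)                              # one weight per time step
c = beta @ H                                   # context vector over the window
```

In the full model these weights are recomputed at every step as the LSTM states evolve, which is what makes the learned `alpha` and `beta` distributions interpretable as feature and temporal relevance.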

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. AdaMamba: Adaptive Frequency-Gated Mamba for Long-Term Time Series Forecasting

    cs.AI 2026-04 unverdicted novelty 7.0

    AdaMamba adds input-dependent frequency bases and a unified time-frequency forgetting gate to Mamba, yielding higher forecasting accuracy than prior methods on standard long-term time series benchmarks.

  2. Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia's National Electricity Market

    cs.LG 2026-04 conditional novelty 3.0

    GBRT reaches R² 0.88 on price forecasting and 0.96 on demand but every model exceeds 90% MAPE for prices, underscoring the difficulty of the task.