A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction
The nonlinear autoregressive exogenous (NARX) model, which predicts the current value of a time series based upon its previous values as well as the current and past values of multiple driving (exogenous) series, has been studied for decades. Despite the fact that various NARX models have been developed, few of them can capture long-term temporal dependencies appropriately and select the relevant driving series to make predictions. In this paper, we propose a dual-stage attention-based recurrent neural network (DA-RNN) to address these two issues. In the first stage, we introduce an input attention mechanism to adaptively extract relevant driving series (i.e., input features) at each time step by referring to the previous encoder hidden state. In the second stage, we use a temporal attention mechanism to select relevant encoder hidden states across all time steps. With this dual-stage attention scheme, our model can not only make predictions effectively, but can also be easily interpreted. Thorough empirical studies based upon the SML 2010 dataset and the NASDAQ 100 Stock dataset demonstrate that the DA-RNN can outperform state-of-the-art methods for time series prediction.
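The two attention stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state the scoring function, so additive (tanh) scoring is assumed here. The query uses only the previous hidden state, the weights are random rather than trained, and the encoder hidden states are random stand-ins for the output of a recurrent encoder. All function names, parameter names, and shapes are hypothetical.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def input_attention(X, h_prev, W, U, v):
    """Stage 1 (assumed additive scoring): weight each driving series.

    X: (T, n) window of n driving series over T time steps
    h_prev: (m,) previous encoder hidden state
    W: (T, m), U: (T, T), v: (T,) are hypothetical learnable parameters
    Returns alpha: (n,), one weight per driving series.
    """
    query = W @ h_prev                            # (T,)
    keys = U @ X                                  # (T, n), one column per series
    scores = v @ np.tanh(query[:, None] + keys)   # (n,)
    return softmax(scores)


def temporal_attention(H, d_prev, W, U, v):
    """Stage 2 (assumed additive scoring): weight encoder states over time.

    H: (T, m) encoder hidden states for all T time steps
    d_prev: (p,) previous decoder hidden state
    W: (m, p), U: (m, m), v: (m,) are hypothetical learnable parameters
    Returns beta: (T,) and the context vector: (m,).
    """
    query = W @ d_prev                    # (m,)
    keys = H @ U.T                        # (T, m)
    scores = np.tanh(query + keys) @ v    # (T,)
    beta = softmax(scores)
    context = beta @ H                    # weighted sum of encoder states
    return beta, context


rng = np.random.default_rng(0)
T, n, m, p = 10, 5, 8, 6                  # illustrative sizes only

X = rng.normal(size=(T, n))
alpha = input_attention(
    X, np.zeros(m),
    rng.normal(size=(T, m)), rng.normal(size=(T, T)), rng.normal(size=T),
)
x_tilde = alpha * X[0]                    # re-weighted input at the first step

H = rng.normal(size=(T, m))               # stand-in for real encoder outputs
beta, context = temporal_attention(
    H, np.zeros(p),
    rng.normal(size=(m, p)), rng.normal(size=(m, m)), rng.normal(size=m),
)
```

In this sketch `alpha` indicates how much each driving series contributes at a given step and `beta` indicates which time steps the decoder attends to. Inspecting weights of this kind is presumably what the abstract means by the model being easy to interpret.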
Forward citations
Cited by 8 Pith papers
- AdaMamba: Adaptive Frequency-Gated Mamba for Long-Term Time Series Forecasting
  AdaMamba adds input-dependent frequency bases and a unified time-frequency forgetting gate to Mamba, yielding higher forecasting accuracy than prior methods on standard long-term time series benchmarks.
- Deep Time Series Models: A Comprehensive Survey and Benchmark
  This survey and benchmark of deep time series models using the released TSLib library finds that models with specific structures perform well only on distinct analysis tasks.
- DyWPE: Signal-Aware Dynamic Wavelet Positional Encoding for Time Series Transformers
  DyWPE generates positional embeddings for time series transformers from the input signal via Discrete Wavelet Transform and outperforms standard positional encodings on ten datasets, especially longer sequences and bi...
- CASE-NET: Deep Spatio-Temporal Representation Learning via Causal Attention and Channel Recalibration for Multivariate Time Series Classification
  CASE-NET combines a causal temporal encoder with adaptive channel recalibration and reports new state-of-the-art accuracy on four of six evaluated multivariate time series tasks.
- Hermes: A Multi-Scale Spatial-Temporal Hypergraph Network for Stock Time Series Forecasting
  Hermes is a multi-scale spatial-temporal hypergraph network that improves stock forecasting accuracy by capturing inter-industry lead-lag dependencies and fusing information across scales.
- Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia's National Electricity Market
  GBRT reaches R² 0.88 on price forecasting and 0.96 on demand but every model exceeds 90% MAPE for prices, underscoring the difficulty of the task.
- Deep Learning for Electricity Price Forecasting: A Review of Day-Ahead, Intraday, and Balancing Electricity Markets
  A structured review organizes deep learning models for electricity price forecasting via a backbone-head-loss taxonomy and identifies gaps in intraday and balancing market applications.
- Positional Encoding in Transformer-Based Time Series Models: A Survey
  A survey of positional encoding methods in transformer-based time series models that evaluates fixed, learnable, relative, and hybrid approaches on classification tasks and links effectiveness to data characteristics.