DA-LSTM: A Long Short-Term Memory with Depth Adaptive to Non-uniform Information Flow in Sequential Data

Ka-Ho Chow; S.-H. Gary Chan; Yifeng Zhang

arXiv: 1903.02082 · v1 · pith:XVFAOARQnew · submitted 2019-01-18 · cs.NE · cs.LG · stat.ML

keywords: information, lstm, da-lstm, long, memory, short-term, computation, convergence

Much sequential data exhibits a highly non-uniform information distribution, which traditional Long Short-Term Memory (LSTM) cannot model correctly. To address this, recent works have extended LSTM by adding more activations between adjacent inputs. However, these approaches often use a fixed depth, set by the step with the most information content. This one-size-fits-all, worst-case approach is unsatisfactory: at steps that carry little information, shallower structures can converge faster and consume fewer computational resources. In this paper, we develop a Depth-Adaptive Long Short-Term Memory (DA-LSTM) architecture, which dynamically adjusts its structure according to the information distribution without prior knowledge. Experimental results on real-world datasets show that DA-LSTM uses far fewer computational resources and substantially reduces convergence time, by $41.78\%$ and $46.01\%$ compared with Stacked LSTM and Deep Transition LSTM, respectively.
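The worst-case argument in the abstract can be illustrated with simple counting. The sketch below is not the DA-LSTM mechanism and says nothing about how the architecture decides depth. It only assumes a hypothetical list of per-step depth requirements (the values in `required_depth` are invented for illustration) and compares the number of transition activations when every step runs at the maximum depth against the number when each step runs only as deep as it needs.

```python
# Hypothetical per-step depth requirements (illustrative values, not from the paper).
# A larger number means the step carries more information and needs more activations.
required_depth = [1, 1, 4, 1, 2, 1, 1, 3]


def fixed_depth_cost(depths):
    """Activations used when every step runs at the worst-case (maximum) depth."""
    return len(depths) * max(depths)


def adaptive_depth_cost(depths):
    """Activations used when each step runs only as deep as it requires."""
    return sum(depths)


fixed = fixed_depth_cost(required_depth)
adaptive = adaptive_depth_cost(required_depth)
saving = 1 - adaptive / fixed  # fraction of activations avoided

print(f"fixed={fixed} adaptive={adaptive} saving={saving:.1%}")
```

The saving here counts activations only and depends entirely on the invented depth profile, so it should not be compared with the convergence-time reductions reported in the abstract.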
