Training Language Models Using Target-Propagation

Arthur Szlam; Marc'Aurelio Ranzato; Nicolas Vasilache; Ruoyu Sun; Sam Wiseman; Soumith Chintala; Sumit Chopra

arxiv: 1702.04770 · v1 · pith:6WU6HKV5new · submitted 2017-02-15 · 💻 cs.CL · cs.LG · cs.NE

Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache

keywords: bptt, tprop, training, address, analysis, approach, approaches, back-propagation


While Truncated Back-Propagation through Time (BPTT) is the most popular approach to training Recurrent Neural Networks (RNNs), it suffers from being inherently sequential (making parallelization difficult) and from truncating gradient flow between distant time-steps. We investigate whether Target Propagation (TPROP) style approaches can address these shortcomings. Unfortunately, extensive experiments suggest that TPROP generally underperforms BPTT. We end with an analysis of this phenomenon and suggestions for future work.
