Dynamic Evaluation of Neural Sequence Models

Ben Krause; Emmanuel Kahembwe; Iain Murray; Steve Renals

arXiv: 1709.07432 · v2 · pith:UIGP5L44new · submitted 2017-09-21 · cs.NE · cs.CL

keywords dynamic · evaluation · models · bits · char · datasets · neural · respectively

We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, causing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. Dynamic evaluation improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
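
The adaptation loop the abstract describes can be illustrated with a toy example. This is a minimal sketch, not the paper's implementation. It assumes a bigram softmax model stored as a logit matrix `W`, plain SGD as the update rule, and illustrative values for the segment length and learning rate; none of these choices come from the abstract. Each segment is scored with the current parameters first, and only afterwards do the parameters take a gradient step on that segment, so the model never sees a token before predicting it.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable log-softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def segment_loss_and_grad(W, prev, nxt):
    """Summed negative log-likelihood (nats) of nxt given prev, and its gradient w.r.t. W."""
    logp = log_softmax(W[prev])                # shape (n, V)
    idx = np.arange(len(prev))
    loss = -logp[idx, nxt].sum()
    dlogits = np.exp(logp)                     # softmax probabilities
    dlogits[idx, nxt] -= 1.0                   # softmax minus one-hot target
    grad = np.zeros_like(W)
    np.add.at(grad, prev, dlogits)             # accumulate rows for repeated contexts
    return loss, grad

def evaluate(W, tokens, seg_len=5, lr=0.0):
    """Return bits per token. lr=0.0 is static evaluation; lr>0 adapts after each segment."""
    W = W.copy()                               # leave the caller's parameters untouched
    prev, nxt = tokens[:-1], tokens[1:]
    total = 0.0
    for s in range(0, len(prev), seg_len):
        p, t = prev[s:s + seg_len], nxt[s:s + seg_len]
        loss, grad = segment_loss_and_grad(W, p, t)
        total += loss                          # score the segment first
        W -= lr * grad                         # then adapt to it (plain SGD, an assumption)
    return total / len(prev) / np.log(2)

# Hypothetical test stream with a strongly re-occurring pattern.
tokens = np.array([0, 1, 2, 3] * 50)
W0 = np.zeros((4, 4))                          # untrained model: uniform predictions

static_bits = evaluate(W0, tokens, lr=0.0)
dynamic_bits = evaluate(W0, tokens, lr=0.5)
print(f"static:  {static_bits:.3f} bits/token")
print(f"dynamic: {dynamic_bits:.3f} bits/token")
```

On this repeating stream the static model stays at the uniform cost of 2 bits per token, while the adapted model's average cost falls below that as it picks up the pattern. The same score-then-update ordering is what makes the reported perplexities and bits/char figures a fair evaluation, since every prediction is made before the corresponding target is used for learning.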

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Learning Inference Concurrency in DynamicGate MLP Structural and Mathematical Justification
cs.LG · 2026-04 · unverdicted · novelty 4.0

DynamicGate MLP enables concurrent learning and inference by separating gating from representation parameters, so that even asynchronous updates produce outputs equivalent to a valid fixed model snapshot.

Multiplicative Models for Recurrent Language Modeling
cs.LG · 2019-06 · unverdicted · novelty 3.0

New multiplicative RNN models are tested on char-level LM tasks to demonstrate the relevance of shared parametrization for the intermediate state.