arxiv: 1709.07432 · v2 · pith:UIGP5L44new · submitted 2017-09-21 · 💻 cs.NE · cs.CL

Dynamic Evaluation of Neural Sequence Models

classification 💻 cs.NE cs.CL
keywords dynamic, evaluation, models, bits, char, datasets, neural, respectively

We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, causing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. Dynamic evaluation improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
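The adaptation loop described in the abstract can be illustrated with a toy sketch. It is not the paper's models or update rule: it assumes a unigram softmax model with uniform initial logits standing in for a pre-trained network, plain SGD, and hypothetical names (`evaluate`, `segment_len`, `lr`). Each segment is scored with parameters that have not yet seen it, and only then used for a gradient step, so the reported cross-entropy stays a fair test-time measurement.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def evaluate(tokens, vocab_size, segment_len=5, lr=0.0):
    """Return the average cross-entropy in bits per token.

    With lr == 0 this is static evaluation. With lr > 0 the logits take
    one SGD step after each segment has been scored (dynamic evaluation).
    """
    # Assumption: a uniform unigram model plays the role of the trained model.
    logits = [0.0] * vocab_size
    total_bits = 0.0
    for start in range(0, len(tokens), segment_len):
        segment = tokens[start:start + segment_len]
        probs = softmax(logits)
        # Score first, using parameters that have not seen this segment.
        total_bits += sum(-math.log2(probs[t]) for t in segment)
        # Then adapt: the gradient of the summed negative log-likelihood
        # with respect to the logits is n * probs - counts.
        grad = [len(segment) * p for p in probs]
        for t in segment:
            grad[t] -= 1.0
        logits = [w - lr * g for w, g in zip(logits, grad)]
    return total_bits / len(tokens)


# A re-occurring pattern over a 4-symbol vocabulary.
tokens = [0, 1, 0, 0, 1] * 20
static_bits = evaluate(tokens, vocab_size=4, lr=0.0)
dynamic_bits = evaluate(tokens, vocab_size=4, lr=0.1)
print(f"static:  {static_bits:.3f} bits/token")
print(f"dynamic: {dynamic_bits:.3f} bits/token")
```

In this toy setting the static model stays at 2 bits per token (uniform over four symbols), while the adapted model assigns growing probability to the two symbols that keep recurring, so its average cross-entropy is lower. Perplexity and bits per symbol measure the same quantity on different scales: perplexity equals 2 raised to the cross-entropy in bits.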

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning Inference Concurrency in DynamicGate MLP: Structural and Mathematical Justification

    cs.LG 2026-04 unverdicted novelty 4.0

    DynamicGate MLP enables concurrent learning and inference by separating gating from representation parameters, so that even asynchronous updates produce outputs equivalent to a valid fixed model snapshot.

  2. Multiplicative Models for Recurrent Language Modeling

    cs.LG 2019-06 unverdicted novelty 3.0

    New multiplicative RNN models are tested on char-level LM tasks to demonstrate the relevance of shared parametrization for the intermediate state.