Dynamic Evaluation of Neural Sequence Models

PMLR · 2017 · cs.NE · arXiv 1709.07432

2 Pith papers cite this work. Polarity classification is still indexing.


abstract

We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, causing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. Dynamic evaluation improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
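The adaptation mechanism described in the abstract can be illustrated with a toy sketch: each token is scored with the current parameters, and only afterwards used for one gradient step. Everything below is an assumption for illustration and is not taken from the paper: the bigram softmax model, the zero-initialised "pretrained" weights, plain SGD with per-token updates, and the names `evaluate` and `lr`.

```python
import math

def softmax(logits):
    # numerically stable softmax over a list of floats
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def evaluate(tokens, vocab_size, lr=0.0):
    """Average negative log-likelihood (nats/token) of a bigram softmax model.

    lr == 0 gives static evaluation. With lr > 0, each token is scored with
    the current parameters first and only then used for one SGD step, so the
    model never trains on a token before predicting it.
    """
    # logits[prev][nxt]; zeros stand in for pretrained weights (assumption)
    logits = [[0.0] * vocab_size for _ in range(vocab_size)]
    total_nll = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        probs = softmax(logits[prev])
        total_nll += -math.log(probs[nxt])
        if lr > 0.0:
            # cross-entropy gradient w.r.t. the logits is (probs - one_hot)
            for k in range(vocab_size):
                grad = probs[k] - (1.0 if k == nxt else 0.0)
                logits[prev][k] -= lr * grad
    return total_nll / (len(tokens) - 1)

# a strongly repetitive sequence, the case where adaptation should help
tokens = [0, 1, 2, 3] * 25
static_nll = evaluate(tokens, vocab_size=4)
dynamic_nll = evaluate(tokens, vocab_size=4, lr=1.0)
print(f"static:  {static_nll:.3f} nats/token")
print(f"dynamic: {dynamic_nll:.3f} nats/token")
```

On this repeating sequence the adapted run assigns higher probability to the recurring pattern than the static run, which stays at the uniform baseline of log 4 nats per token. A sketch for a real recurrent model would update on longer segments and backpropagate through them; that detail is omitted here.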

representative citing papers

Learning Inference Concurrency in DynamicGate MLP Structural and Mathematical Justification

cs.LG · 2026-04-15 · unverdicted · novelty 4.0

DynamicGate MLP enables concurrent learning and inference by separating gating from representation parameters, so that even asynchronous updates produce outputs equivalent to a valid fixed model snapshot.

Multiplicative Models for Recurrent Language Modeling

cs.LG · 2019-06-30 · unverdicted · novelty 3.0

New multiplicative RNN models are tested on char-level LM tasks to demonstrate the relevance of shared parametrization for the intermediate state.

citing papers explorer

Showing 2 of 2 citing papers.

Learning Inference Concurrency in DynamicGate MLP Structural and Mathematical Justification
cs.LG · 2026-04-15 · unverdicted · none · ref 30

Multiplicative Models for Recurrent Language Modeling
cs.LG · 2019-06-30 · unverdicted · none · ref 14 · internal anchor
