DynamicGate MLP enables concurrent learning and inference by separating gating from representation parameters, so that even asynchronous updates produce outputs equivalent to a valid fixed model snapshot.
Dynamic Evaluation of Neural Sequence Models
2 Pith papers cite this work.
abstract
We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, causing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. Dynamic evaluation improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
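The adaptation loop described in the abstract can be illustrated with a minimal sketch. It assumes a toy unigram softmax model in place of a neural sequence model and a plain SGD step in place of whatever update rule the paper uses; the names `evaluate`, `segment_nll_and_grad`, `seg_len`, and `lr` are hypothetical and chosen only for illustration. Each segment is scored with the current parameters first, and only then used for a gradient step, so the model never sees a token before predicting it.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def segment_nll_and_grad(logits, segment):
    # Summed negative log-likelihood of the segment under a unigram
    # softmax model, and its gradient with respect to the logits.
    probs = softmax(logits)
    nll = -sum(math.log(probs[t]) for t in segment)
    grad = [len(segment) * p for p in probs]
    for t in segment:
        grad[t] -= 1.0
    return nll, grad

def evaluate(logits, tokens, seg_len=5, lr=0.0):
    # Average per-token NLL. lr == 0 gives static evaluation;
    # lr > 0 adapts the parameters to recent history (assumed: plain SGD).
    logits = list(logits)  # work on a copy so the caller's parameters are untouched
    total = 0.0
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        nll, grad = segment_nll_and_grad(logits, segment)
        total += nll  # score the segment before adapting to it
        logits = [w - lr * g for w, g in zip(logits, grad)]
    return total / len(tokens)

# A sequence with re-occurring patterns: a run of token 2, then a run of token 0.
tokens = [2] * 20 + [0] * 20
static_nll = evaluate([0.0, 0.0, 0.0], tokens, lr=0.0)
dynamic_nll = evaluate([0.0, 0.0, 0.0], tokens, lr=0.1)
print(f"static: {static_nll:.4f} nats/token, dynamic: {dynamic_nll:.4f} nats/token")
```

On this toy sequence the adapted model assigns higher probability to the repeated tokens, so its average loss is lower than the static model's, even though it pays a penalty right after the pattern switches.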
fields
cs.LG 2
verdicts
UNVERDICTED 2
representative citing papers
New multiplicative RNN models are tested on char-level LM tasks to demonstrate the relevance of shared parametrization for the intermediate state.
citing papers explorer
- Multiplicative Models for Recurrent Language Modeling