Long short-term memory
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
citing papers explorer
- Augmenting Self-attention with Persistent Memory
  Augmenting self-attention with persistent memory vectors allows feed-forward layers to be removed from Transformers without degrading performance on character-level and word-level language modeling benchmarks.
- Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?
  Introduces the ZSSLR problem and the ASL-Text dataset; uses text embeddings with 3D-CNN and bi-LSTM video features for zero-shot sign recognition.
- Simulus: Combining Improvements in Sample-Efficient World Model Agents
  Simulus integrates flexible tokenization, intrinsic motivation, prioritized world model replay, and regression-as-classification to achieve state-of-the-art sample efficiency for planning-free world model agents on visual Atari 100K, DMC Proprioception 500K, and symbolic Craftax-1M benchmarks.