A Neural Knowledge Language Model

Heeyoul Choi; Sungjin Ahn; Tanel Pärnamaa; Yoshua Bengio

arxiv: 1608.00318 · v2 · pith:4KT7U5KInew · submitted 2016-08-01 · 💻 cs.CL · cs.LG

keywords knowledge language model words fact generate neural nklm

Current language models have a significant limitation in the ability to encode and decode factual knowledge. This is mainly because they acquire such knowledge from statistical co-occurrences although most of the knowledge words are rarely observed. In this paper, we propose a Neural Knowledge Language Model (NKLM) which combines symbolic knowledge provided by the knowledge graph with the RNN language model. By predicting whether the word to generate has an underlying fact or not, the model can generate such knowledge-related words by copying from the description of the predicted fact. In experiments, we show that the NKLM significantly improves the performance while generating a much smaller number of unknown words.
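The copy-or-generate decision described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Fact` class, the `generate_word` function, the 0.5 threshold, and the example probabilities are all hypothetical, and the probabilities that a trained model would produce are simply passed in as inputs here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class Fact:
    """Hypothetical container: a knowledge-graph triple plus its description words."""
    subject: str
    relation: str
    description: List[str]  # words describing the fact's object entity


def generate_word(
    has_fact_prob: float,
    vocab_probs: Dict[str, float],
    fact: Optional[Fact],
    copy_position_probs: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, chosen only for illustration
) -> str:
    """Pick the next word by copying from a fact or by using the vocabulary."""
    if fact is not None and has_fact_prob >= threshold:
        # Knowledge word: copy the most likely position in the fact description.
        position = max(
            range(len(fact.description)),
            key=lambda i: copy_position_probs[i],
        )
        return fact.description[position]
    # Ordinary word: take the most likely entry of the fixed vocabulary.
    return max(vocab_probs, key=vocab_probs.get)


# Illustrative inputs (made up, not taken from the paper's data).
fact = Fact("Barack Obama", "spouse", ["Michelle", "Obama"])
vocab = {"the": 0.6, "<unk>": 0.3, "married": 0.1}

print(generate_word(0.9, vocab, fact, [0.8, 0.2]))  # prints "Michelle"
print(generate_word(0.1, vocab, fact, [0.8, 0.2]))  # prints "the"
```

Because a knowledge word is copied from the fact description instead of being drawn from the fixed vocabulary, a rare name does not have to collapse into an unknown-word token, which is the effect the abstract reports.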

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Improving language models by retrieving from trillions of tokens
cs.CL 2021-12 unverdicted novelty 7.0

RETRO matches GPT-3 and Jurassic-1 performance on the Pile benchmark using 25 times fewer parameters by conditioning on retrieved chunks from a 2-trillion-token database.

Pointer Sentinel Mixture Models
cs.CL 2016-09 conditional novelty 7.0

Pointer sentinel-LSTM mixes context copying with softmax prediction to reach 70.9 perplexity on Penn Treebank using fewer parameters than standard LSTMs.

Learning to Theorize the World from Observation
cs.LG 2026-05 unverdicted novelty 6.0

NEO induces compositional latent programs as world theories from observations and executes them to enable explanation-driven generalization.

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
cs.CV 2023-08 unverdicted novelty 6.0

DragNUWA integrates text, image, and trajectory controls into a diffusion video model using a Trajectory Sampler, Multiscale Fusion, and Adaptive Training to enable fine-grained open-domain video generation.