arxiv: 1711.11027 · v2 · pith:3Q5V56EDnew · submitted 2017-11-29 · cs.CL · cs.AI · cs.LG

Embedding Words as Distributions with a Bayesian Skip-gram Model

classification cs.CL cs.AI cs.LG
keywords word densities embedding embeddings model prior bayesian density

We introduce a method for embedding words as probability densities in a low-dimensional space. Rather than assuming that a word embedding is fixed across the entire text collection, as in standard word embedding methods, in our Bayesian model we generate it from a word-specific prior density for each occurrence of a given word. Intuitively, for each word, the prior density encodes the distribution of its potential 'meanings'. These prior densities are conceptually similar to Gaussian embeddings. Interestingly, unlike the Gaussian embeddings, we can also obtain context-specific densities: they encode uncertainty about the sense of a word given its context and correspond to posterior distributions within our model. The context-dependent densities have many potential applications: for example, we show that they can be directly used in the lexical substitution task. We describe an effective estimation method based on the variational autoencoding framework. We also demonstrate that our embeddings achieve competitive results on standard benchmarks.
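
As a rough illustration of how a context-specific posterior density can be compared with a word's prior density, the sketch below computes the closed-form KL divergence between two diagonal Gaussians. The diagonal Gaussian form is an assumption made here for illustration (the abstract only says the priors are conceptually similar to Gaussian embeddings), and the function name `kl_diag_gaussians` and all numeric values are hypothetical, not taken from the paper.

```python
import math


def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """Return KL(q || p) for two diagonal Gaussians.

    Each argument is a list of per-dimension means or variances.
    Assumption: both densities are diagonal Gaussians (illustrative only).
    """
    total = 0.0
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        # Per-dimension closed form: 0.5 * (log(vp/vq) + (vq + (mq-mp)^2)/vp - 1)
        total += 0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
    return total


# Hypothetical toy values: a broad word-level prior and a narrower,
# shifted posterior for one occurrence of the word in context.
prior_mu, prior_var = [0.0, 0.0], [1.0, 1.0]
post_mu, post_var = [0.5, -0.5], [0.25, 0.25]

kl = kl_diag_gaussians(post_mu, post_var, prior_mu, prior_var)
print(kl)
```

Under these assumptions, a larger divergence would mean the context moved the density further from the word's prior; whether the paper uses this quantity in this way is not stated in the abstract.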

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Alcmean's: Unsupervised community detection using local Laplacian, automatic detection of the number of centers

    cs.SI 2026-06 unverdicted novelty 4.0

    ALCMeans combines Laplacian energy-based automatic center identification with DeepWalk embeddings to perform unsupervised community detection without predefining the number of communities and reports 10-20% higher NMI...