arxiv: 1606.00819 · v2 · pith:BF7TNGCYnew · submitted 2016-06-02 · 💻 cs.CL

Matrix Factorization using Window Sampling and Negative Sampling for Improved Word Representations

classification 💻 cs.CL
keywords word · factorization · lexvec · matrix · negative · representations · sampling · tasks

In this paper, we propose LexVec, a new method for generating distributed word representations that uses low-rank, weighted factorization of the Positive Point-wise Mutual Information matrix via stochastic gradient descent, employing a weighting scheme that assigns heavier penalties for errors on frequent co-occurrences while still accounting for negative co-occurrence. Evaluation on word similarity and analogy tasks shows that LexVec matches and often outperforms state-of-the-art methods on many of these tasks.
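As a rough illustration of the two ingredients the abstract names, the sketch below computes Positive Point-wise Mutual Information (PPMI) values from raw co-occurrence counts and applies one stochastic gradient step to a pair of word and context vectors under a plain squared-error loss. The function names `ppmi` and `sgd_step`, the toy counts, and the unweighted squared-error loss are assumptions made for illustration; the paper's window sampling, negative sampling, and weighting scheme are not reproduced here.

```python
import math
from collections import defaultdict


def ppmi(counts):
    """Map each observed (word, context) pair to its PPMI value.

    `counts` is a dict {(word, context): co-occurrence count > 0}.
    Pairs that never co-occur are simply absent (their PPMI is treated as 0).
    """
    total = sum(counts.values())
    word_total = defaultdict(float)
    ctx_total = defaultdict(float)
    for (w, c), n in counts.items():
        word_total[w] += n
        ctx_total[c] += n

    out = {}
    for (w, c), n in counts.items():
        # PMI = log( P(w, c) / (P(w) * P(c)) ), estimated from counts.
        pmi = math.log(n * total / (word_total[w] * ctx_total[c]))
        out[(w, c)] = max(0.0, pmi)  # clip negative values to zero
    return out


def sgd_step(w_vec, c_vec, target, lr):
    """One SGD step on 0.5 * (w_vec . c_vec - target)^2 (hypothetical helper).

    Returns the updated word vector, updated context vector, and the
    reconstruction error measured before the update.
    """
    err = sum(a * b for a, b in zip(w_vec, c_vec)) - target
    new_w = [a - lr * err * b for a, b in zip(w_vec, c_vec)]
    new_c = [b - lr * err * a for a, b in zip(w_vec, c_vec)]
    return new_w, new_c, err


# Toy co-occurrence counts (made up for this example).
toy_counts = {("a", "x"): 3, ("a", "y"): 1, ("b", "x"): 1, ("b", "y"): 3}
toy_ppmi = ppmi(toy_counts)

# Pull one word/context vector pair toward its PPMI target.
w0, c0 = [0.1, 0.2], [0.3, -0.1]
w1, c1, err_before = sgd_step(w0, c0, toy_ppmi[("a", "x")], lr=0.1)
```

In a full factorization, steps like this would be repeated over many sampled pairs so that the dot product of each word and context vector approximates the corresponding PPMI entry.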
