arxiv: 1804.05388 · v2 · pith:NSJJ3VIJnew · submitted 2018-04-15 · 💻 cs.CL

Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness

classification 💻 cs.CL
keywords: datasets, similarity, models, semantic, across, vietnamese, antonyms, assess

We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. ViSim-400 provides degrees of similarity across five semantic relations, as rated by human judges. The two datasets are verified through standard co-occurrence and neural network models, showing results comparable to the respective English datasets.
