
arxiv: cmp-lg/9604022 · v1 · submitted 1996-04-30 · cmp-lg · cs.CL

Unsupervised Learning of Word-Category Guessing Rules

classification cmp-lg cs.CL
keywords rules, corpus, learning, lexicon, morphological, unknown, unsupervised, words

Words unknown to the lexicon present a substantial problem for part-of-speech tagging. In this paper we present a technique for fully unsupervised statistical acquisition of rules that guess possible parts of speech for unknown words. Three complementary sets of word-guessing rules are induced from the lexicon and a raw corpus: prefix morphological rules, suffix morphological rules, and ending-guessing rules. Learning was performed on Brown Corpus data, and the resulting rule sets showed highly competitive performance when compared with the state of the art.
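
To make the idea of an ending-guessing rule concrete, the following is a minimal sketch, not the paper's procedure. It assumes a toy lexicon mapping words to tag sets, and it replaces the statistical rule scoring the abstract alludes to with a simple frequency-share cutoff. The names `induce_ending_rules`, `guess`, `TOY_LEXICON`, and the parameters `max_len`, `min_count`, and `min_share` are all hypothetical.

```python
from collections import Counter, defaultdict


def induce_ending_rules(lexicon, max_len=3, min_count=2, min_share=0.6):
    """Collect ending -> POS-class rules from a word -> tags lexicon.

    Assumption: a rule is kept when its ending occurs at least `min_count`
    times and its most frequent POS class covers at least `min_share` of
    those occurrences. This cutoff is illustrative only.
    """
    counts = defaultdict(Counter)
    for word, tags in lexicon.items():
        pos_class = frozenset(tags)
        for n in range(1, max_len + 1):
            if len(word) > n:  # leave at least one character before the ending
                counts[word[-n:]][pos_class] += 1

    rules = {}
    for ending, classes in counts.items():
        pos_class, hits = classes.most_common(1)[0]
        total = sum(classes.values())
        if total >= min_count and hits / total >= min_share:
            rules[ending] = pos_class
    return rules


def guess(word, rules, max_len=3):
    """Return the POS class of the longest matching ending, or None."""
    for n in range(min(max_len, len(word) - 1), 0, -1):
        ending = word[-n:]
        if ending in rules:
            return rules[ending]
    return None


# Hypothetical toy lexicon; a real run would use a full tagged lexicon.
TOY_LEXICON = {
    "walking": {"VBG"}, "talking": {"VBG"}, "building": {"VBG", "NN"},
    "quickly": {"RB"}, "slowly": {"RB"}, "happily": {"RB"},
    "tables": {"NNS"}, "chairs": {"NNS"},
}

rules = induce_ending_rules(TOY_LEXICON)
print(guess("jumping", rules))
print(guess("sadly", rules))
```

Preferring the longest matching ending is a design choice of this sketch: longer endings are more specific, so they are tried before shorter, more general ones.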
