Structural Tags, Annealing and Automatic Word Classification

John McMahon, F.J. Smith (Queen's University, Belfast)

arxiv: cmp-lg/9405029 · v1 · submitted 1994-05-30 · cmp-lg · cs.CL

classification cmp-lg cs.CL

keywords: classification, system, word, algorithm, annealing, automatic, corpus, current

This paper describes an automatic word classification system which uses a locally optimal annealing algorithm and average class mutual information. A new word-class representation, the structural tag, is introduced and its advantages for use in statistical language modelling are presented. A summary of some results with the one-million-word LOB corpus is given; the algorithm is also shown to discover the vowel-consonant distinction and displays an ability to cluster words syntactically in a Latin corpus. Finally, a comparison is made between the current classification system and several leading alternative systems, which shows that the current system performs respectably well.
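As a rough illustration of the quantity the abstract names, the sketch below computes average class mutual information over adjacent class pairs, using the standard definition M = sum over (c1, c2) of P(c1, c2) log2[P(c1, c2) / (P(c1) P(c2))]. The function names, the toy token list, and the simple greedy reassignment pass are assumptions made for this example; the greedy pass is a stand-in for a local search and is not the paper's annealing algorithm or its structural tag representation.

```python
import math
from collections import Counter


def average_class_mutual_information(tokens, word_to_class):
    """Mutual information (in bits) between the classes of adjacent tokens."""
    classes = [word_to_class[w] for w in tokens]
    bigrams = Counter(zip(classes, classes[1:]))
    total = sum(bigrams.values())
    left = Counter()   # counts of a class in the first bigram position
    right = Counter()  # counts of a class in the second bigram position
    for (c1, c2), n in bigrams.items():
        left[c1] += n
        right[c2] += n
    mi = 0.0
    for (c1, c2), n in bigrams.items():
        p12 = n / total
        mi += p12 * math.log2(p12 / ((left[c1] / total) * (right[c2] / total)))
    return mi


def greedy_reassign(tokens, word_to_class, num_classes):
    """One pass of a hypothetical greedy search: move each word to the class
    that gives the highest score, keeping its current class on ties."""
    assignment = dict(word_to_class)
    for word in sorted(set(tokens)):
        best_class = assignment[word]
        best_score = average_class_mutual_information(tokens, assignment)
        for c in range(num_classes):
            assignment[word] = c
            score = average_class_mutual_information(tokens, assignment)
            if score > best_score:
                best_class, best_score = c, score
        assignment[word] = best_class
    return assignment


# Toy data, invented for illustration only.
tokens = "the cat sat the dog sat the cat ran".split()
start = {w: 0 for w in tokens}  # every word starts in class 0
improved = greedy_reassign(tokens, start, num_classes=2)
print(average_class_mutual_information(tokens, start))
print(average_class_mutual_information(tokens, improved))
```

With every word in a single class the score is zero, since the class of one word then says nothing about the class of the next; each accepted move in the greedy pass can only raise the score, so the pass ends at a point no single-word move improves.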
