Automatic Extraction of Tagset Mappings from Parallel-Annotated Corpora

John Hughes, Clive Souter, Eric Atwell (University of Leeds, UK)

arxiv: cmp-lg/9506006 · v1 · submitted 1995-06-08 · cmp-lg · cs.CL

keywords: annotation, corpora, annotated, automatic, corpus, mappings, according, amalgam

This paper describes some of the recent work of project AMALGAM (automatic mapping among lexico-grammatical annotation models). We are investigating ways to map between the leading corpus annotation schemes in order to improve their reusability. Collating all the included corpora into a single large annotated corpus will allow a more detailed language model to be developed for tasks such as speech and handwriting recognition. In particular, we focus here on a method of extracting mappings from corpora that have been annotated according to more than one annotation scheme.
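The abstract does not spell out the extraction method, so the following is only an illustrative sketch of one plausible reading: given the same text tagged under two schemes, count how often each tag in one scheme co-occurs with each tag in the other, and report the relative frequencies as candidate mappings. The function name `extract_tag_mapping`, the token-aligned input format, and the example tags are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


def extract_tag_mapping(tokens_a, tokens_b):
    """Estimate a tag mapping from scheme A to scheme B.

    Assumes (hypothetically) that both inputs are lists of (word, tag)
    pairs for the same text, already aligned token by token.
    """
    if len(tokens_a) != len(tokens_b):
        raise ValueError("annotations must be token-aligned")

    # counts[tag_a][tag_b] = number of tokens carrying both tags
    counts = defaultdict(Counter)
    for (word_a, tag_a), (word_b, tag_b) in zip(tokens_a, tokens_b):
        if word_a != word_b:
            raise ValueError(f"misaligned tokens: {word_a!r} vs {word_b!r}")
        counts[tag_a][tag_b] += 1

    # Convert raw counts to relative frequencies, most frequent first.
    mapping = {}
    for tag_a, targets in counts.items():
        total = sum(targets.values())
        mapping[tag_a] = [(tag_b, n / total) for tag_b, n in targets.most_common()]
    return mapping


# Hypothetical toy data: the same six tokens under two made-up tagsets.
scheme_a = [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"),
            ("cats", "NOUN"), ("sleep", "VERB"), ("runs", "VERB")]
scheme_b = [("the", "AT"), ("dog", "NN"), ("barks", "VBZ"),
            ("cats", "NNS"), ("sleep", "VB"), ("runs", "VBZ")]

mapping = extract_tag_mapping(scheme_a, scheme_b)
for tag, targets in mapping.items():
    print(tag, targets)
```

In this toy example a coarse tag such as `VERB` maps to more than one finer tag, which illustrates why a mapping between schemes is generally one-to-many and why co-occurrence frequencies, rather than a single lookup table, are a natural first representation.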
