arxiv: cmp-lg/9506006 · v1 · submitted 1995-06-08 · cmp-lg · cs.CL

Automatic Extraction of Tagset Mappings from Parallel-Annotated Corpora

classification cmp-lg cs.CL
keywords annotation, corpora, annotated, automatic, corpus, mappings, according, amalgam

This paper describes some of the recent work of project AMALGAM (automatic mapping among lexico-grammatical annotation models). We are investigating ways to map between the leading corpus annotation schemes in order to improve their reusability. Collating all the included corpora into a single large annotated corpus will allow a more detailed language model to be developed for tasks such as speech and handwriting recognition. In particular, we focus here on a method of extracting mappings from corpora that have been annotated according to more than one annotation scheme.
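
The abstract does not spell out the extraction method, so the following is only a minimal sketch of one plausible reading: when each token carries a tag from two schemes, count how often each source tag co-occurs with each target tag and turn the counts into relative frequencies. The function name `extract_mapping`, the triple format `(word, tag_a, tag_b)`, and the tag labels in the usage example are all illustrative assumptions, not taken from the paper.

```python
from collections import Counter, defaultdict


def extract_mapping(parallel_tokens):
    """Build a source-tag -> ranked target-tag table from doubly tagged tokens.

    parallel_tokens: iterable of (word, tag_a, tag_b) triples, where tag_a and
    tag_b are the labels assigned to the same token under two annotation
    schemes (hypothetical input format, assumed for this sketch).
    """
    # counts[tag_a][tag_b] = number of tokens tagged tag_a in scheme A
    # and tag_b in scheme B.
    counts = defaultdict(Counter)
    for _word, tag_a, tag_b in parallel_tokens:
        counts[tag_a][tag_b] += 1

    mapping = {}
    for tag_a, targets in counts.items():
        total = sum(targets.values())
        # Rank candidate target tags by relative frequency, most frequent first.
        mapping[tag_a] = [(tag_b, n / total) for tag_b, n in targets.most_common()]
    return mapping


if __name__ == "__main__":
    # Toy data with placeholder tag labels, purely for illustration.
    tokens = [
        ("the", "AT", "DT"),
        ("dog", "NN", "NN1"),
        ("cat", "NN", "NN1"),
        ("fish", "NN", "NN2"),
        ("dogs", "NNS", "NN2"),
    ]
    for source_tag, candidates in extract_mapping(tokens).items():
        print(source_tag, candidates)
```

A source tag with a single candidate suggests a one-to-one mapping, while several candidates with comparable frequencies suggest a one-to-many mapping that would need context to resolve.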
