CRYSTAL: Inducing a Conceptual Dictionary
classification
cmp-lg
cs.CL
keywords
dictionarycrystalconcept-nodeconceptualdefinitionsextractionidentifyinformation
read the original abstract
One of the central knowledge sources of an information extraction system is a dictionary of linguistic patterns that can be used to identify the conceptual content of a text. This paper describes CRYSTAL, a system which automatically induces a dictionary of "concept-node definitions" sufficient to identify relevant information from a training corpus. Each of these concept-node definitions is generalized as far as possible without producing errors, so that a minimum number of dictionary entries cover the positive training instances. Because it tests the accuracy of each proposed definition, CRYSTAL can often surpass human intuitions in creating reliable extraction rules.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.