Statistical modality tagging from rule-based annotations and crowdsourcing

Benjamin Van Durme; Bonnie Dorr; Christine D. Piatko; Lori Levin; Michael Bloodgood; Mona Diab; Owen Rambow; Vinodkumar Prabhakaran

arxiv: 1503.01190 · v1 · pith:B23W2PCNnew · submitted 2015-03-04 · 💻 cs.CL · cs.LG · stat.ML

Vinodkumar Prabhakaran, Michael Bloodgood, Mona Diab, Bonnie Dorr, Lori Levin, Christine D. Piatko, Owen Rambow, Benjamin Van Durme

classification 💻 cs.CL cs.LG stat.ML

keywords modality tagger training sentences data rule-based annotation annotations

We explore training an automatic modality tagger. Modality is the attitude that a speaker might have toward an event or state. One of the main hurdles for training a linguistic tagger is gathering training data. This is particularly problematic for training a tagger for modality because modality triggers are sparse for the overwhelming majority of sentences. We investigate an approach to automatically training a modality tagger where we first gathered sentences based on a high-recall simple rule-based modality tagger and then provided these sentences to Mechanical Turk annotators for further annotation. We used the resulting set of training data to train a precise modality tagger using a multi-class SVM that delivers good performance.
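The first step of the approach, gathering candidate sentences with a high-recall rule-based tagger, can be illustrated with a minimal sketch. This is not the authors' tagger: the trigger lexicon `TRIGGER_WORDS`, the prefix-matching rule, and the function names `has_modality_trigger` and `gather_candidates` are all assumptions made for illustration. The idea shown is only that a deliberately loose rule keeps any sentence that might contain a modality trigger, so that human annotators can later filter out the false positives.

```python
import re

# Hypothetical trigger lexicon: illustrative words only, not the paper's rule set.
TRIGGER_WORDS = {"want", "need", "try", "must", "should", "able", "hope"}


def has_modality_trigger(sentence: str) -> bool:
    """Return True if any token starts with a trigger word.

    Prefix matching is intentionally crude: it favors recall over precision
    (for example, "wants" matches "want", but so would "mustard" for "must").
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(tok.startswith(trig) for tok in tokens for trig in TRIGGER_WORDS)


def gather_candidates(sentences):
    """Keep the sentences flagged by the loose rule, for later human annotation."""
    return [s for s in sentences if has_modality_trigger(s)]


corpus = [
    "She wants to finish the report today.",
    "The meeting starts at noon.",
    "We should be able to attend.",
]
candidates = gather_candidates(corpus)
```

In this sketch, `candidates` holds the first and third sentences; in a pipeline like the one the abstract describes, such candidates would then go to annotators, and the annotated result would serve as training data for the classifier.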
