arxiv: 1606.08513 · v3 · pith:FKG5ATMHnew · submitted 2016-06-27 · 💻 cs.CL

SelQA: A New Benchmark for Selection-based Question Answering

classification 💻 cs.CL
keywords: question, answering, crowdsourcing, annotation, answer, answers, dataset, datasets

This paper presents SelQA, a new selection-based question answering dataset. The dataset consists of questions generated through crowdsourcing and sentence-length answers drawn from the ten most prevalent topics in the English Wikipedia. We introduce a corpus annotation scheme that enhances the generation of large, diverse, and challenging datasets by explicitly aiming to reduce word co-occurrences between questions and answers. The annotation scheme is composed of a series of crowdsourcing tasks, with a view to utilizing crowdsourcing more effectively in the creation of question answering datasets in various domains. Several systems are compared on the tasks of answer sentence selection and answer triggering, providing strong baseline results for future work to improve upon.
