SelQA: A New Benchmark for Selection-based Question Answering

Jinho D. Choi; Michael Zhai; Tomasz Jurczyk

arxiv: 1606.08513 · v3 · pith:FKG5ATMHnew · submitted 2016-06-27 · 💻 cs.CL

keywords: question, answering, crowdsourcing, annotation, answer, answers, dataset, datasets

This paper presents SelQA, a new selection-based question answering dataset. The dataset consists of questions generated through crowdsourcing and sentence-length answers that are drawn from the ten most prevalent topics in the English Wikipedia. We introduce a corpus annotation scheme that enhances the generation of large, diverse, and challenging datasets by explicitly aiming to reduce word co-occurrences between questions and answers. The scheme is composed of a series of crowdsourcing tasks designed to use crowdsourcing more effectively in the creation of question answering datasets in various domains. Several systems are compared on the tasks of answer sentence selection and answer triggering, providing strong baseline results for future work to improve upon.
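
To make the notion of word co-occurrence between a question and a candidate answer sentence concrete, here is a minimal sketch in Python. The abstract does not say how co-occurrence is measured, so the metric used here (the fraction of distinct question tokens that also appear in the sentence) is an assumption, as are the function names `tokenize` and `word_overlap`. The example question and sentences are made up for illustration and are not taken from SelQA.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_overlap(question, sentence):
    """Fraction of distinct question tokens that also occur in the sentence.

    This is an assumed, simple overlap measure, not the paper's own metric.
    """
    q_tokens = set(tokenize(question))
    s_tokens = set(tokenize(sentence))
    if not q_tokens:
        return 0.0
    return len(q_tokens & s_tokens) / len(q_tokens)


# Illustrative data only (not from the dataset).
question = "Who wrote the novel Dracula?"
candidates = [
    "Dracula is a novel that Bram Stoker wrote in 1897.",
    "The Irish author Bram Stoker published the book in 1897.",
]

scores = [word_overlap(question, c) for c in candidates]
for sentence, score in zip(candidates, scores):
    print(f"{score:.2f}  {sentence}")
```

Both sentences answer the question, but the second shares far fewer words with it. Under this assumed measure, a dataset built to reduce co-occurrence would contain more pairs like the second one, which a system cannot match on surface word overlap alone.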
