
arxiv: 1708.02125 · v1 · pith:VCVK2SOUnew · submitted 2017-08-07 · 💻 cs.DB

T-Crowd: Effective Crowdsourcing for Tabular Data

classification 💻 cs.DB
keywords crowdsourcing · t-crowd · data · attributes · true · values · answers · attribute

Crowdsourcing employs human workers to solve computer-hard problems, such as data cleaning, entity resolution, and sentiment analysis. When crowdsourcing tabular data, e.g., the attribute values of an entity set, a worker's answers on the different attributes (e.g., the nationality and age of a celebrity) are often treated independently. This assumption is not always true and can lead to suboptimal crowdsourcing performance. In this paper, we present the T-Crowd system, which takes into consideration the intricate relationships among tasks, in order to converge faster to their true values. In particular, T-Crowd integrates each worker's answers on different attributes to effectively learn the worker's trustworthiness and the true data values. The attribute relationship information is also used to guide task allocation to workers. Finally, T-Crowd seamlessly supports categorical and continuous attributes, which are the two main datatypes found in typical databases. Our extensive experiments on real and synthetic datasets show that T-Crowd outperforms state-of-the-art methods in terms of truth inference and reducing the cost of crowdsourcing.
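
To make the idea of learning one trustworthiness score per worker across mixed attribute types concrete, the sketch below uses a generic iterative weighted-voting scheme. It is not T-Crowd's model, which the abstract does not specify. The function name `infer_truth`, the weight formula, the variance-based error scaling, and the sample data are all illustrative assumptions.

```python
from collections import defaultdict

def infer_truth(answers, attr_types, iterations=10):
    """Hypothetical sketch, not the T-Crowd algorithm.

    answers: list of (worker, entity, attribute, value) tuples.
    attr_types: dict mapping attribute -> "categorical" or "continuous".
    Returns (truths, weights), where truths is keyed by (entity, attribute).
    """
    weights = {w: 1.0 for w, _, _, _ in answers}
    by_cell = defaultdict(list)
    for w, e, a, v in answers:
        by_cell[(e, a)].append((w, v))

    # Assumption: scale continuous errors by the variance of all answers
    # on that attribute, so they are comparable to 0/1 categorical errors.
    scale = {}
    for a, kind in attr_types.items():
        vals = [v for _, _, a2, v in answers if a2 == a]
        if kind == "continuous" and vals:
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            scale[a] = var or 1.0

    truths = {}
    for _ in range(iterations):
        # Truth step: weighted vote (categorical) or weighted mean (continuous).
        for (e, a), cell in by_cell.items():
            if attr_types[a] == "categorical":
                votes = defaultdict(float)
                for w, v in cell:
                    votes[v] += weights[w]
                truths[(e, a)] = max(votes, key=votes.get)
            else:
                total = sum(weights[w] for w, _ in cell)
                truths[(e, a)] = sum(weights[w] * v for w, v in cell) / total

        # Weight step: a single score per worker, pooled over ALL attributes.
        errors = defaultdict(list)
        for w, e, a, v in answers:
            t = truths[(e, a)]
            if attr_types[a] == "categorical":
                errors[w].append(0.0 if v == t else 1.0)
            else:
                errors[w].append(min(1.0, (v - t) ** 2 / scale[a]))
        weights = {w: 1.0 / (0.1 + sum(es) / len(es)) for w, es in errors.items()}
    return truths, weights

# Made-up sample data: workers "a" and "b" roughly agree, "c" is unreliable.
attr_types = {"nationality": "categorical", "age": "continuous"}
answers = [
    ("a", "e1", "nationality", "US"), ("a", "e1", "age", 30),
    ("a", "e2", "nationality", "UK"), ("a", "e2", "age", 40),
    ("b", "e1", "nationality", "US"), ("b", "e1", "age", 31),
    ("b", "e2", "nationality", "UK"), ("b", "e2", "age", 41),
    ("c", "e1", "nationality", "FR"), ("c", "e1", "age", 60),
    ("c", "e2", "nationality", "DE"), ("c", "e2", "age", 10),
]
truths, weights = infer_truth(answers, attr_types)
```

Because each worker's errors are pooled across both attribute types, a worker who gives poor categorical answers also has less influence on the continuous estimates. This is the kind of cross-attribute coupling the abstract describes, here in a deliberately simplified form.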
