Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels
read the original abstract
Crowdsourcing has become a primary means for label collection in many real-world machine learning applications. A classical method for inferring the true labels from the noisy labels provided by crowdsourcing workers is Dawid-Skene estimator. In this paper, we prove convergence rates of a projected EM algorithm for the Dawid-Skene estimator. The revealed exponent in the rate of convergence is shown to be optimal via a lower bound argument. Our work resolves the long standing issue of whether Dawid-Skene estimator has sound theoretical guarantees besides its good performance observed in practice. In addition, a comparative study with majority voting illustrates both advantages and pitfalls of the Dawid-Skene estimator.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.