Screening Matters: A Comparative Study of Conventional and Crowdsourced Listening Tests

2026 · eess.AS · arXiv 2606.28114

abstract

Subjective evaluation remains the most reliable way of testing speech and audio coding techniques. Crowdsourcing the listening task is a cost-efficient and fast way of conducting this evaluation, but the quality of the results tends to be inferior to that of conventional listening tests done in the controlled environment of a laboratory. In this paper, classical and neural speech codecs are evaluated to compare P.808 against P.800 DCR tests. A statistical analysis is conducted to investigate the effectiveness of selected screening methods. The analysis shows that the crowdsourced evaluation can be improved by employing postscreening methods based on anchor ordering and rating span, and continuous screening methods like traps and gold standard questions, thus giving more value to the ratings obtained for the codecs under test. Based on these outcomes, a set of suitable screenings is proposed, for cost-effective, simplified, and bias-free enhancement of listening results.
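
The abstract names the two postscreening criteria, anchor ordering and rating span, but does not define them. The sketch below is one plausible reading, not the authors' procedure: a listener passes anchor ordering when their mean ratings of the anchor conditions do not decrease from the lowest-quality anchor to the highest, and passes rating span when the gap between their highest and lowest rating is at least a threshold. The function names, the data layout, the anchor labels, and the threshold `min_span=2` are all assumptions made for illustration.

```python
from statistics import mean


def passes_anchor_ordering(ratings, anchor_order):
    """Check that mean anchor ratings are non-decreasing.

    ratings: dict mapping condition name -> list of scores from one listener.
    anchor_order: anchor names from lowest to highest expected quality
    (hypothetical labels, not taken from the paper).
    """
    means = [mean(ratings[name]) for name in anchor_order]
    return all(low <= high for low, high in zip(means, means[1:]))


def passes_rating_span(ratings, min_span=2):
    """Check that the listener used at least `min_span` points of the scale.

    The threshold is an assumed value, not one reported in the paper.
    """
    scores = [score for values in ratings.values() for score in values]
    return max(scores) - min(scores) >= min_span


def post_screen(listeners, anchor_order, min_span=2):
    """Return only the listeners who pass both postscreening checks."""
    return {
        listener_id: ratings
        for listener_id, ratings in listeners.items()
        if passes_anchor_ordering(ratings, anchor_order)
        and passes_rating_span(ratings, min_span)
    }


# Illustrative, made-up ratings on a 1 to 5 scale.
anchor_order = ["anchor_low", "anchor_mid", "anchor_high"]
listeners = {
    "A": {"anchor_low": [1, 2], "anchor_mid": [3, 3],
          "anchor_high": [5, 4], "codec_x": [4, 3]},
    "B": {"anchor_low": [4, 4], "anchor_mid": [3, 3],
          "anchor_high": [3, 4], "codec_x": [3, 4]},
    "C": {"anchor_low": [3, 3], "anchor_mid": [3, 3],
          "anchor_high": [3, 3], "codec_x": [3, 3]},
}
kept = post_screen(listeners, anchor_order)
print(sorted(kept))
```

In this made-up data, listener B rates the lowest anchor above the others and fails the ordering check, while listener C gives the same score everywhere and fails the span check. Continuous screening with traps and gold standard questions happens during the test itself, so it is not covered by this sketch.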

representative citing papers

Screening Matters: A Comparative Study of Conventional and Crowdsourced Listening Tests

eess.AS · 2026-06-26 · unverdicted · novelty 3.0

Screening methods based on anchor ordering, rating span, traps, and gold standard questions improve the reliability of crowdsourced listening tests for speech codecs relative to conventional lab tests.
