
arxiv: 1406.3188 · v1 · pith:LJ3HNF3Inew · submitted 2014-06-12 · 💻 cs.IR

Assessing the Quality of Web Content

keywords task · challenge · quality · approach · discovery · ecml · english · french

This paper describes our approach to the ECML/PKDD Discovery Challenge 2010. The challenge consists of three tasks: (1) a Web genre and facet classification task for English hosts, (2) an English quality task, and (3) a multilingual quality task (German and French). We build an ensemble of three classifiers, each trained on a different feature set, to make predictions for unseen Web hosts. Our final NDCG on the whole test set is 0.575 for Task 1, 0.852 for Task 2, and 0.81 (French) and 0.77 (German) for Task 3, which placed second in the ECML/PKDD Discovery Challenge 2010.
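
To make the evaluation setup concrete, the following is a minimal sketch of how scores from three classifiers might be combined and evaluated with NDCG. The abstract does not state how the ensemble combines its members or which NDCG variant the challenge used, so the unweighted averaging in `ensemble_scores` and the linear-gain, log2-discount form of `dcg` are assumptions, and all function names and the toy data are hypothetical.

```python
import math


def dcg(relevances):
    # Discounted cumulative gain: linear gain, log2 position discount (assumed variant).
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(scores, labels):
    # Rank hosts by descending predicted score, then normalise by the ideal ordering.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    best = dcg(sorted(labels, reverse=True))
    return dcg(ranked) / best if best > 0 else 0.0


def ensemble_scores(per_classifier_scores):
    # Unweighted mean of per-host scores across classifiers (assumed combination rule).
    n = len(per_classifier_scores)
    return [sum(column) / n for column in zip(*per_classifier_scores)]


# Toy data: three classifiers (one per feature set) scoring four hosts.
scores_a = [0.9, 0.2, 0.7, 0.1]
scores_b = [0.8, 0.3, 0.6, 0.4]
scores_c = [0.7, 0.1, 0.9, 0.2]
labels = [1, 0, 1, 0]  # hypothetical ground-truth relevance per host

combined = ensemble_scores([scores_a, scores_b, scores_c])
print(ndcg(combined, labels))
```

In this toy case both relevant hosts receive the highest combined scores, so the ranking matches the ideal ordering.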
