
arxiv: 1712.06921 · v1 · pith:6YROMTFHnew · submitted 2017-12-19 · 💻 cs.IR

Ensemble Models for Detecting Wikidata Vandalism with Stacking - Team Honeyberry Vandalism Detector at WSDM Cup 2017

classification 💻 cs.IR
keywords models · vandalism · auc-roc · stacking · task · techniques · wikidata · wsdm

The WSDM Cup 2017 poses a binary classification task: labeling Wikidata revisions as vandalism or non-vandalism. This paper describes our method, which combines machine learning techniques such as under-sampling, feature selection, stacking, and model ensembles. We confirm the validity of each technique by comparing the AUC-ROC of models trained with and without it. Additionally, we analyze the results and gain insights into improving models for the vandalism detection task. Our final submission after the deadline achieved an AUC-ROC of 0.94412.
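To make two of the terms in the abstract concrete, the following is a minimal Python sketch of how AUC-ROC can be computed and how random under-sampling of the majority class might look. It is an illustration only, not the authors' implementation: the function names `auc_roc` and `under_sample`, the `ratio` and `seed` parameters, and the toy data are all assumptions introduced here, and the abstract does not say which under-sampling scheme the team used.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win.
    Quadratic in the number of examples, so only suitable for small inputs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def under_sample(rows, labels, ratio=1.0, seed=0):
    """Hypothetical random under-sampling: keep every positive (vandalism)
    row and at most int(ratio * n_positive) randomly chosen negative rows."""
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg_idx), int(ratio * len(pos_idx)))
    keep = sorted(pos_idx + rng.sample(neg_idx, n_keep))
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (made up for illustration): 1 = vandalism, 0 = non-vandalism.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_roc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly

sampled_rows, sampled_labels = under_sample(
    rows=["r0", "r1", "r2", "r3", "r4", "r5"],
    labels=[1, 0, 0, 0, 0, 1],
    ratio=1.0,
)
```

Comparing the AUC-ROC of a model trained on the full data with one trained on the under-sampled data is the kind of with-and-without comparison the abstract describes; the scores themselves would come from whatever classifier is being evaluated.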

