arXiv: 1803.06390 · v1 · pith:2NU6JWXJnew · submitted 2018-03-16 · cs.CL · cs.IR · cs.LG

Corpus Statistics in Text Classification of Online Data

keywords: data, classification, corpus, online, results, sets, text, work

The transformation of Machine Learning (ML) from a boutique science to a generally accepted technology has increased the importance of reproduction and transportability of ML studies. In the current work, we investigate how corpus characteristics of textual data sets correspond to text classification results. We work with two data sets gathered from sub-forums of an online health-related forum. Our empirical results are obtained for a multi-class sentiment analysis application.
