A Comparison of Word-based and Context-based Representations for Classification Problems in Health Informatics

Aditya Joshi; Cecile Paris; C Raina MacIntyre; Ross Sparks; Sarvnaz Karimi

arXiv: 1906.05468 · v1 · pith:7R4SOQL4new · submitted 2019-06-13 · cs.CL · cs.IR

keywords: representations, classification, context-based, problems, health, sentence, statistical, used

Distributed representations of text can be used as features when training a statistical classifier. These representations may be created as a composition of word vectors or as context-based sentence vectors. We compare the two kinds of representations (word-based versus context-based) for three classification problems: influenza infection classification, drug usage classification, and personal health mention classification. For statistical classifiers trained for each of these problems, context-based representations based on ELMo, Universal Sentence Encoder, Neural-Net Language Model, and FLAIR perform better than Word2Vec, GloVe, and the versions of these two adapted using the MeSH ontology. Accuracy improves by 2-4% when these context-based representations are used instead of word-based representations.
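To make the word-based side of the comparison concrete, the following is a minimal sketch of composing word vectors into a sentence vector by averaging and then classifying with it. Everything in it is an assumption for illustration: the three-dimensional `TOY_VECTORS`, the labels, the training sentences, and the nearest-centroid rule standing in for a statistical classifier. It is not the authors' pipeline, and averaging is only one possible way to compose word vectors. A context-based approach would replace `sentence_vector` with the output of a pretrained sentence encoder while leaving the classifier step unchanged.

```python
from statistics import fmean

# Hypothetical 3-dimensional word vectors, invented for this example.
# Real pretrained vectors (e.g. Word2Vec, GloVe) have hundreds of dimensions.
TOY_VECTORS = {
    "i": [0.1, 0.0, 0.1],
    "have": [0.2, 0.1, 0.0],
    "the": [0.0, 0.0, 0.0],
    "flu": [0.9, 0.1, 0.0],
    "fever": [0.8, 0.2, 0.1],
    "news": [0.0, 0.9, 0.1],
    "report": [0.1, 0.8, 0.0],
    "about": [0.0, 0.1, 0.1],
}


def sentence_vector(sentence, vectors=TOY_VECTORS, dim=3):
    """Average the vectors of known words; return a zero vector if none are known."""
    known = [vectors[tok] for tok in sentence.lower().split() if tok in vectors]
    if not known:
        return [0.0] * dim
    return [fmean(column) for column in zip(*known)]


def train_centroids(examples):
    """Map each label to the mean sentence vector of its training examples."""
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append(sentence_vector(text))
    return {
        label: [fmean(column) for column in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(sentence, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    vec = sentence_vector(sentence)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Invented training data in the spirit of influenza infection classification.
train = [
    ("i have the flu", "infection"),
    ("i have fever", "infection"),
    ("news report about the flu", "no_infection"),
    ("news about fever", "no_infection"),
]
centroids = train_centroids(train)

print(predict("i have fever", centroids))
print(predict("news report", centroids))
```

Because the averaged vector ignores word order and surrounding context, "i have the flu" and "news report about the flu" share the same contribution from "flu"; this is the kind of limitation that context-based sentence vectors are meant to address.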
