Tweet2Vec: Character-Based Distributed Representations for Social Media

2016 · cs.LG · arXiv 1605.03481

1 Pith paper cites this work. Polarity classification is still indexing.



abstract

Text from social media provides a set of challenges that can cause traditional NLP approaches to fail. Informal language, spelling errors, abbreviations, and special characters are all commonplace in these posts, leading to a prohibitively large vocabulary size for word-level approaches. We propose a character composition model, tweet2vec, which finds vector-space representations of whole tweets by learning complex, non-local dependencies in character sequences. The proposed model outperforms a word-level baseline at predicting user-annotated hashtags associated with the posts, doing significantly better when the input contains many out-of-vocabulary words or unusual character sequences. Our tweet2vec encoder is publicly available.

representative citing papers

Multitask Learning for Blackmarket Tweet Detection

cs.SI · 2019-07-09 · unverdicted · novelty 4.0

A multitask learning framework with soft parameter sharing between classification and regression tasks detects blackmarket tweets at F1-score 0.89.

citing papers explorer

Showing 1 of 1 citing paper.

Multitask Learning for Blackmarket Tweet Detection

cs.SI · 2019-07-09 · unverdicted · none · ref 6 · internal anchor

A multitask learning framework with soft parameter sharing between classification and regression tasks detects blackmarket tweets at F1-score 0.89.
