arxiv: 1608.03030 · v1 · pith:323PFQC3new · submitted 2016-08-10 · 💻 cs.CL

Hierarchical Character-Word Models for Language Identification

classification 💻 cs.CL
keywords identification, language, hierarchical, base-, brevity, challenge, character, character-word
Social media messages' brevity and unconventional spelling pose a challenge to language identification. We introduce a hierarchical model that learns character and contextualized word-level representations for language identification. Our method performs well against strong baselines, and can also reveal code-switching.
