Hierarchical Character-Word Models for Language Identification
classification
💻 cs.CL
keywords
identification, language, hierarchical, base-, brevity, challenge, character, character-word
read the original abstract
Social media messages' brevity and unconventional spelling pose a challenge to language identification. We introduce a hierarchical model that learns character and contextualized word-level representations for language identification. Our method performs well against strong baselines, and can also reveal code-switching.
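The data flow the abstract describes (characters to word vectors, word vectors to contextualized representations, then per-word and per-message language scores) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's architecture: the abstract does not name its layer types, so mean-pooling over characters and neighbour averaging are simple stand-ins, and `LANGS`, `DIM`, `WEIGHTS`, and all function names are hypothetical. The weights are random and untrained, so the scores carry no meaning; only the shape of the computation is shown.

```python
import math
import random

LANGS = ["en", "es"]  # hypothetical label set
DIM = 8               # hypothetical embedding size


def char_vector(ch, dim=DIM):
    # Deterministic pseudo-random vector per character
    # (stand-in for a learned character embedding table).
    rng = random.Random(ord(ch))
    return [rng.uniform(-1, 1) for _ in range(dim)]


def word_vector(word):
    # Character level: mean-pool character vectors into one word vector.
    vecs = [char_vector(c) for c in word]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def contextualize(word_vecs):
    # Word level: average each word vector with its immediate neighbours
    # (stand-in for whatever contextual layer the model actually uses).
    out = []
    for i in range(len(word_vecs)):
        window = word_vecs[max(0, i - 1): i + 2]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(message, weights):
    words = message.split()
    ctx = contextualize([word_vector(w) for w in words])
    # One distribution over languages per word; a trained model could
    # expose code-switching through disagreement between these.
    per_word = [
        softmax([sum(w * x for w, x in zip(row, vec)) for row in weights])
        for vec in ctx
    ]
    # Message-level prediction: average of the per-word distributions.
    message_dist = [sum(col) / len(per_word) for col in zip(*per_word)]
    return words, per_word, message_dist


# Untrained, randomly initialised output weights: one row per language.
_rng = random.Random(0)
WEIGHTS = [[_rng.uniform(-1, 1) for _ in range(DIM)] for _ in LANGS]

words, per_word, message_dist = classify("hola my friend", WEIGHTS)
```

Keeping the per-word distributions alongside the message-level average is the design point of the sketch: a single message label hides mixed-language input, while word-level outputs leave room to flag it.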