Including Dialects and Language Varieties in Author Profiling

Alina Maria Ciobanu; Liviu P. Dinu; Marcos Zampieri; Shervin Malmasi

arxiv: 1707.00621 · v1 · pith:7O2JLAY5new · submitted 2017-07-03 · 💻 cs.CL

keywords: author, language, profiling, accuracy, approach, gender, identification, system

This paper presents a computational approach to author profiling that takes gender and language variety into account. We apply an ensemble system that combines the outputs of multiple linear SVM classifiers trained on character and word $n$-grams. We evaluate the system on the dataset provided by the organizers of the 2017 PAN lab on author profiling. Our approach achieved 75% average accuracy on gender identification for tweets written in four languages and 97% accuracy on language variety identification for Portuguese.
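
The abstract names two ingredients: character and word $n$-gram features, and an ensemble over several classifiers. The sketch below illustrates both in plain Python. It is not the authors' implementation: the helper names (`char_ngrams`, `word_ngrams`, `fuse_by_mean`) are hypothetical, the per-classifier scores stand in for the outputs of trained linear SVMs, and averaging the scores is an assumed fusion rule, since the abstract does not say how the outputs are combined.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams over lowercased whitespace tokens."""
    tokens = text.lower().split()
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def fuse_by_mean(score_dicts):
    """Average per-class scores across classifiers (assumed fusion rule).

    Each element of score_dicts maps a class label to the score one
    classifier assigned to it. Returns the winning label and the fused scores.
    """
    labels = score_dicts[0].keys()
    fused = {
        label: sum(scores[label] for scores in score_dicts) / len(score_dicts)
        for label in labels
    }
    return max(fused, key=fused.get), fused


# Hypothetical scores from two classifiers, e.g. one trained on character
# n-grams and one on word n-grams, for two Portuguese varieties.
label, fused = fuse_by_mean([
    {"pt-BR": 0.5, "pt-PT": 0.5},
    {"pt-BR": 1.0, "pt-PT": 0.0},
])
```

In a full system, each feature type (for example character 3-grams or word 2-grams) would feed its own classifier, and the fusion step would run over the scores those classifiers produce for a given author.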
