
arxiv: 1712.03903 · v1 · pith:IFGMO5UQnew · submitted 2017-12-11 · 💻 cs.CL · cs.CY

A Novel Way of Identifying Cyber Predators

keywords: lstm-rnn, vectors, generate, language, model, sentence, conversations, hidden

Recurrent neural networks with long short-term memory cells (LSTM-RNNs) have an impressive ability to process sequence data, particularly for building language models and classifying text. This research proposes combining sentiment analysis, a new approach to sentence vectors, and LSTM-RNNs as a novel method for Sexual Predator Identification (SPI). An LSTM-RNN language model is applied to generate sentence vectors, which are the last hidden states of the language model. The sentence vectors are fed into a second LSTM-RNN classifier to capture suspicious conversations. Because the vectors are taken from the hidden state, they can be generated for sentences never seen before. fastText is used to filter the contents of conversations and to generate a sentiment score in order to identify potential predators. The experiment achieves a record-breaking accuracy and precision of 100% with a recall of 81.10%, exceeding the top-ranked result in the SPI competition.
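The idea of using the last hidden state of an LSTM as a sentence vector can be illustrated with a minimal sketch. This is not the authors' implementation: the `TinyLSTM` class, the toy vocabulary, the `encode` helper, and all dimensions are hypothetical, and the weights are random rather than trained as a language model. It only shows that any token sequence, including one never seen before, maps to a fixed-size vector.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM with random, untrained weights (illustration only)."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Hypothetical embedding table: one vector per token id.
        self.embed = [[rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
                      for _ in range(vocab_size)]
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, g = candidate cell value.
        in_dim = embed_dim + hidden_dim
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                      for _ in range(hidden_dim)] for g in "ifog"}
        self.b = {g: [0.0] * hidden_dim for g in "ifog"}

    def _gate(self, g, xh, act):
        # Affine map of the concatenated [input, hidden] vector, then activation.
        return [act(sum(w * v for w, v in zip(row, xh)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def sentence_vector(self, token_ids):
        """Run the LSTM over the tokens and return the last hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for t in token_ids:
            xh = self.embed[t] + h  # concatenate embedding and previous h
            i = self._gate("i", xh, sigmoid)
            f = self._gate("f", xh, sigmoid)
            o = self._gate("o", xh, sigmoid)
            g = self._gate("g", xh, math.tanh)
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


# Toy vocabulary (assumption); unknown words map to id 0.
vocab = {"<unk>": 0, "hi": 1, "how": 2, "are": 3, "you": 4}


def encode(sentence):
    return [vocab.get(w, 0) for w in sentence.lower().split()]


lm = TinyLSTM(len(vocab), embed_dim=8, hidden_dim=16)
seen = lm.sentence_vector(encode("how are you"))
unseen = lm.sentence_vector(encode("hello there friend"))
```

In the pipeline the abstract describes, a sequence of such vectors (one per sentence in a conversation) would then be the input to a second LSTM classifier; that step is not shown here.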
