Nonsymbolic Text Representation

Ehsaneddin Asgari; Heike Adel; Hinrich Schuetze

arxiv: 1610.00479 · v3 · pith:B6GXQDI2new · submitted 2016-10-03 · 💻 cs.CL

keywords: text, model, representation, nonsymbolic, training, applies, applying, attempts

Abstract

We introduce the first generic text representation model that is completely nonsymbolic, i.e., it does not require the availability of a segmentation or tokenization method that attempts to identify words or other symbolic units in text. This applies to training the parameters of the model on a training corpus as well as to applying it when computing the representation of a new text. We show that our model performs better than prior work on an information extraction and a text denoising task.
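
To make "no segmentation or tokenization" concrete, here is a minimal sketch of a representation computed directly from the raw character stream. It is an illustration only and is not the model described in the paper: the function names `char_ngram_counts` and `cosine`, the n-gram range of 2 to 4, and the choice of plain counts are all assumptions made for this example. The point it shows is that whitespace and punctuation are treated like any other character, so no word boundaries are ever identified, either when building the representation or when comparing two texts.

```python
from collections import Counter


def char_ngram_counts(text: str, n_min: int = 2, n_max: int = 4) -> Counter:
    """Count every overlapping character n-gram in the raw string.

    Hypothetical helper for illustration: no word boundaries are detected,
    so spaces and punctuation are ordinary characters inside the n-grams.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(v * b[k] for k, v in a.items())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # The same function is used for "training" text and for new text;
    # neither call needs a tokenizer.
    clean = char_ngram_counts("text representation")
    noisy = char_ngram_counts("textrepresentation")  # missing space
    print(cosine(clean, noisy))
```

Because the vectors are built from overlapping character spans, a text with a missing or extra space still shares most of its n-grams with the clean version, which is one reason tokenization-free representations are attractive for noisy input.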
