An evaluation of Naive Bayesian anti-spam filtering

Constantine D. Spyropoulos; George Paliouras; Ion Androutsopoulos; John Koutsias; Konstantinos V. Chandrinos

arXiv: cs/0006013 · v1 · submitted 2000-06-07 · cs.CL · cs.AI

Ion Androutsopoulos, John Koutsias, Konstantinos V. Chandrinos, George Paliouras, Constantine D. Spyropoulos

keywords: bayesian, evaluation, filter, naive, anti-spam, been, size, additional

It has recently been argued that a Naive Bayesian classifier can be used to filter unsolicited bulk e-mail ("spam"). We conduct a thorough evaluation of this proposal on a corpus that we make publicly available, contributing towards standard benchmarks. At the same time we investigate the effect of attribute-set size, training-corpus size, lemmatization, and stop-lists on the filter's performance, issues that had not been previously explored. After introducing appropriate cost-sensitive evaluation measures, we reach the conclusion that additional safety nets are needed for the Naive Bayesian anti-spam filter to be viable in practice.
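To make the idea in the abstract concrete, here is a minimal sketch of a Naive Bayesian spam filter with a cost-sensitive decision rule. It is an illustration under stated assumptions, not the authors' implementation: the multinomial word model, the add-one smoothing, the function names (`train`, `spam_probability`, `is_spam`), and the `cost_ratio` parameter are all hypothetical choices made for this example. The rule treats blocking a legitimate message as `cost_ratio` times worse than letting a spam message through, which is one common way to express such a cost.

```python
import math
from collections import Counter

LABELS = ("spam", "legit")

def train(messages):
    """Count words per class. `messages` is a list of (tokens, label) pairs.

    Assumes at least one training message for each label.
    """
    doc_counts = Counter()
    word_counts = {label: Counter() for label in LABELS}
    vocab = set()
    for tokens, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab

def spam_probability(model, tokens):
    """Estimate P(spam | tokens) under the naive independence assumption."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing, an assumption of this sketch.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalize in log space to avoid underflow on long messages.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["legit"])

def is_spam(model, tokens, cost_ratio=1.0):
    """Flag a message only if P(spam) exceeds cost_ratio / (1 + cost_ratio)."""
    threshold = cost_ratio / (1.0 + cost_ratio)
    return spam_probability(model, tokens) > threshold
```

With `cost_ratio=1.0` the filter flags any message it considers more likely spam than not. Raising `cost_ratio` pushes the threshold towards 1, so fewer legitimate messages are blocked at the price of letting more spam through.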
