Abstract- SMS spam has grown along with mobile phone usage. Past research on SMS spam detection classified messages into only two categories, spam and not spam. This binary classification prevents users from seeing spam messages that they do not actually mind, e.g. an advertisement for their favorite product. In this paper, we propose multiclass classification of SMS into four categories: regular, info, ads, and fraud. We use content-based features (top-N unigrams) as well as non-content-based features. The results show that the best accuracy, 97.5%, is achieved by logistic regression with normalization preprocessing and 4096 top-N unigram features.
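As an illustration of the top-N unigram features mentioned above, the following is a minimal sketch and not the implementation used in this paper. The function names, the whitespace tokenizer, the tie-breaking rule, and the toy messages are all assumptions made for the example, and N is set to 4 rather than 4096 to keep it readable.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace, as a stand-in
    # for whatever normalization the real pipeline applies.
    return text.lower().split()


def top_n_unigrams(messages, n):
    # Count every unigram across the corpus.
    counts = Counter(tok for msg in messages for tok in tokenize(msg))
    # Rank by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, _ in ranked[:n]]


def vectorize(message, vocabulary):
    # One count per vocabulary entry; unigrams outside the vocabulary are ignored.
    counts = Counter(tokenize(message))
    return [counts[tok] for tok in vocabulary]


# Hypothetical toy corpus, invented for this example only.
messages = [
    "big sale today big discount",
    "you won a prize claim your prize today",
    "see you at lunch today",
]

vocab = top_n_unigrams(messages, 4)
features = vectorize("claim your prize today", vocab)
print(vocab)
print(features)
```

A vector built this way could be concatenated with non-content features (for example, message length) before being passed to a classifier such as logistic regression.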