Record Details

DOCUMENT CLASSIFICATION USING NAIVE BAYES

International Journal of Advanced Statistics and IT&c for Economics and Life Sciences

 
 
Field Value
 
Title DOCUMENT CLASSIFICATION USING NAIVE BAYES
 
Creator Crețulescu, Radu
Morariu, Daniel
Breazu, Macarie
 
Description Document classification is the problem of assigning text documents to a set of predefined classes. After a preprocessing step, the documents are represented as very large sparse vectors, so a feature selection method must be applied to reduce the dimensionality of the document-representation vector before running the core classification algorithm. In this paper, we use Information Gain as the feature selection method (shown in our previous paper to perform best) and evaluate a simpler and faster classifier, Naïve Bayes. Because the previously tested Support Vector Machine (SVM) classifier has high computational requirements, we now evaluate the accuracy loss and the improvement in running time obtained with the Naïve Bayes classifier. As expected, the experimental results are not of the same quality (on average, accuracy is 9% lower), but classification time decreases from about 4 minutes for SVM to 7 seconds for Naïve Bayes. Naïve Bayes can therefore be a good option when computational constraints are important.
 
Publisher Lucian Blaga University of Sibiu
 
Date 2016-06-01
 
Type info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Peer-reviewed Article
 
Format application/pdf
 
Identifier http://magazines.ulbsibiu.ro/ijasitels/index.php/IJASITELS/article/view/6
 
Source International Journal of Advanced Statistics and IT&c for Economics and Life Sciences; Vol 6, No 1 (2016): IJASITELS
2067-354X
 
Language eng
 
Relation http://magazines.ulbsibiu.ro/ijasitels/index.php/IJASITELS/article/view/6/8
 
Rights Copyright (c) 2017 IJASITELS
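
Illustrative sketch (not taken from the article): the Description field above summarizes a two-stage pipeline, Information Gain feature selection followed by a Naïve Bayes classifier. The Python below is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The toy corpus, the function names (information_gain, select_features, train_nb, predict), the binary "term occurs in document" feature, the multinomial model with add-one smoothing, and the value k=4 are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (not from the article): (tokens, class label).
TRAIN = [
    ("stocks fall as markets react".split(), "finance"),
    ("markets rally and stocks rise".split(), "finance"),
    ("bank shares and stocks slide".split(), "finance"),
    ("team wins the final match".split(), "sport"),
    ("coach praises team after match".split(), "sport"),
    ("striker scores as team wins".split(), "sport"),
]

def entropy(counts):
    """Shannon entropy in bits of a collection of class counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def information_gain(docs, term):
    """Information gain of the binary feature 'term occurs in the document'."""
    all_labels = Counter(label for _, label in docs)
    with_term = Counter(label for toks, label in docs if term in toks)
    without_term = all_labels - with_term
    gain = entropy(all_labels.values())
    for part in (with_term, without_term):
        size = sum(part.values())
        if size:  # skip an empty side of the split
            gain -= size / len(docs) * entropy(part.values())
    return gain

def select_features(docs, k):
    """Keep the k terms with the highest information gain (ties: alphabetical)."""
    vocab = sorted({t for toks, _ in docs for t in toks})
    ranked = sorted(vocab, key=lambda t: -information_gain(docs, t))
    return set(ranked[:k])

def train_nb(docs, features):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    for toks, label in docs:
        class_docs[label] += 1
        term_counts[label].update(t for t in toks if t in features)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_like = {
            t: math.log((term_counts[label][t] + 1) / (total + len(features)))
            for t in features
        }
        model[label] = (log_prior, log_like)
    return model

def predict(model, tokens):
    """Return the class with the highest log-posterior score."""
    def score(label):
        log_prior, log_like = model[label]
        # Terms outside the selected feature set are ignored.
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)

features = select_features(TRAIN, k=4)
model = train_nb(TRAIN, features)
print(predict(model, "stocks and markets drop".split()))
print(predict(model, "the team wins".split()))
```

Training and prediction here amount to counting and summing log-probabilities over a reduced vocabulary, which is consistent with the low running cost the abstract attributes to Naïve Bayes. The sketch makes no claim about the accuracy or timing figures reported in the article.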