DOCUMENT CLASSIFICATION USING NAIVE BAYES
International Journal of Advanced Statistics and IT&c for Economics and Life Sciences
Field | Value | |
Title |
DOCUMENT CLASSIFICATION USING NAIVE BAYES
|
|
Creator |
Crețulescu, Radu
Morariu, Daniel
Breazu, Macarie |
|
Description |
Document classification is the problem of assigning text documents to a set of predefined classes. After a preprocessing step, the documents are represented as very large, sparse vectors. We therefore apply a feature selection method to reduce the dimensionality of the document-representation vector before running the core classification algorithm. In this paper, we use Information Gain as the feature selection method (shown in our previous paper to be the best) and evaluate a simpler and faster classifier, Naïve Bayes. Because the previously tested Support Vector Machine (SVM) classifier has high computational requirements, we now evaluate the accuracy loss and the improvement in time consumption obtained with the Naïve Bayes classifier. As expected, the experimental results are not of the same quality (on average, the accuracy is 9% lower), but the classification time decreases from about 4 minutes for SVM to 7 seconds for Naïve Bayes. Naïve Bayes can therefore be a good option when computational constraints are important.
|
|
Publisher |
Lucian Blaga University of Sibiu
|
|
Contributor |
—
|
|
Date |
2016-06-01
|
|
Type |
info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Peer-reviewed Article |
|
Format |
application/pdf
|
|
Identifier |
http://magazines.ulbsibiu.ro/ijasitels/index.php/IJASITELS/article/view/6
|
|
Source |
International Journal of Advanced Statistics and IT&c for Economics and Life Sciences; Vol 6, No 1 (2016): IJASITELS
2067-354X |
|
Language |
eng
|
|
Relation |
http://magazines.ulbsibiu.ro/ijasitels/index.php/IJASITELS/article/view/6/8
|
|
Rights |
Copyright (c) 2017 IJASITELS
|
|
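
The pipeline summarized in the Description field (Information Gain feature selection followed by a Naïve Bayes classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy corpus, the function names, the binary present/absent form of Information Gain, and the multinomial Naïve Bayes variant with Laplace smoothing are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Information Gain of a binary feature: term present or absent in a document."""
    classes = sorted(set(labels))
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]

    def class_entropy(subset):
        return entropy([subset.count(c) for c in classes])

    n = len(labels)
    return (class_entropy(labels)
            - (len(present) / n) * class_entropy(present)
            - (len(absent) / n) * class_entropy(absent))


def select_features(docs, labels, k):
    """Keep the k terms with the highest Information Gain (ties broken alphabetically)."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return set(ranked[:k])


def train_naive_bayes(docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace (add-one) smoothing, restricted to vocab."""
    term_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        for t in d:
            if t in vocab:
                term_counts[y][t] += 1
    class_counts = Counter(labels)
    model = {}
    for c, n_docs in class_counts.items():
        total_terms = sum(term_counts[c].values())
        log_prior = math.log(n_docs / len(labels))
        log_likelihood = {
            t: math.log((term_counts[c][t] + 1) / (total_terms + len(vocab)))
            for t in vocab
        }
        model[c] = (log_prior, log_likelihood)
    return model


def predict(model, doc):
    """Return the class with the highest log posterior; terms outside vocab are ignored."""
    scores = {
        c: log_prior + sum(log_likelihood[t] for t in doc if t in log_likelihood)
        for c, (log_prior, log_likelihood) in model.items()
    }
    return max(sorted(scores), key=scores.get)


# Hypothetical toy corpus: each document is already tokenized.
docs = [
    ["stocks", "market", "rise"],
    ["market", "shares", "fall"],
    ["team", "wins", "match"],
    ["match", "ends", "team", "loses"],
]
labels = ["finance", "finance", "sport", "sport"]

selected = select_features(docs, labels, k=3)
model = train_naive_bayes(docs, labels, selected)
print(sorted(selected))
print(predict(model, ["market", "shares", "rise"]))
print(predict(model, ["team", "match"]))
```

Reducing the vocabulary before training is what keeps the per-class likelihood tables small. Classification then needs only one pass over the selected terms of a document per class, which is consistent with the short classification time reported in the Description.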