16.22.002
003.56 - Decision theory-computer science
Dissertation - Reference
Text Mining
Tel-U Bandung - Gedung Manterawu, 5th Floor : Shelf R1
329 times
Text classification is a popular and important text mining task. Many document collections are multi-class, and some are multi-label. Both multi-class and multi-label collections can be handled using binary classification. A major challenge for text classification is noisy text data. This problem becomes more severe in a corpus with a small set of training documents, especially when few of them are positive. A set of natural language texts contains a large number of words, which leads to another important problem for text classification: high-dimensional data. Features must therefore be selected. A classifier must identify the boundary between classes optimally. However, even after features are selected, the boundary remains unclear because positive and negative documents are mixed. Recently, relevance feature discovery (RFD) has been proposed as an effective pattern mining-based feature selection and weighting model. Document weights are significant for ranking relevant information. However, an effective way to set the decision boundary for ranking relevant information for classification has not yet been found. This thesis presents a promising boundary-setting method for addressing this challenging issue to produce an effective text classifier, called RFD?. A classifier combination to boost the effectiveness of the RFD? model is also presented. The experiments carried out in the study demonstrate that the proposed classifier significantly outperforms existing classifiers, including state-of-the-art ones.
1 of 1 total collection items available
Name | Moch Arif Bijaksana |
Type | Individual |
Editor | |
Translator |
Name | Queensland University Of Technology |
City | Queensland |
Year | 2015 |
Rental fee | IDR 0,00 |
Daily fine | IDR 0,00 |
Type | Non-circulating |