Feature Selection for Text Classification Using Machine Learning Approaches

Cited by: 13
Authors
Thirumoorthy, K. [1]
Muneeswaran, K. [1]
Affiliation
[1] Mepco Schlenk Engn Coll, Dept Comp Sci & Engn, Sivakasi, India
Source
Keywords
Feature selection; Text classification; Filter-based approach
DOI
10.1007/s40009-021-01043-0
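Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09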
Abstract
Millions of internet users now contribute a huge amount of data in the form of unstructured text documents. In text classification, the high-dimensional feature space, noise, and irrelevant information in these documents reduce classifier accuracy. Feature selection is adopted to address the high-dimensional feature space problem. This work proposes a feature selection method based on a term frequency distribution measure. The method is evaluated with Naive Bayes and SVM classifiers on two benchmark datasets (WebKB and BBC). The experimental results show that the proposed method achieves better classification accuracy than the other feature selection techniques it is compared with.
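The abstract does not give the formula for the proposed term frequency distribution measure, so the following is only a minimal sketch of what a filter-based selector of this general kind can look like. The scoring rule used here (the largest share of a term's total frequency that falls in a single class) is an illustrative assumption and not the authors' measure; the function name `select_top_terms` and the toy documents are likewise hypothetical.

```python
from collections import Counter, defaultdict


def select_top_terms(docs, labels, k):
    """Rank terms with a simple filter score and return the k best.

    Illustrative score (an assumption, NOT the paper's measure): the
    largest share of a term's total frequency found in a single class.
    A term concentrated in one class scores 1.0; a term spread evenly
    over all classes scores 1 / number_of_classes.
    """
    # Term frequencies per class, using naive whitespace tokenization.
    per_class = defaultdict(Counter)
    for text, label in zip(docs, labels):
        per_class[label].update(text.lower().split())

    # Term frequencies over the whole corpus.
    total = Counter()
    for counts in per_class.values():
        total.update(counts)

    scores = {
        term: max(counts[term] for counts in per_class.values()) / freq
        for term, freq in total.items()
    }

    # Sort by score, then by total frequency, then alphabetically,
    # so that ties are broken deterministically.
    ranked = sorted(scores, key=lambda t: (-scores[t], -total[t], t))
    return ranked[:k]


docs = [
    "goal match goal team",
    "match vote election",
    "team goal the",
    "vote the election vote",
]
labels = ["sport", "politics", "sport", "politics"]
print(select_top_terms(docs, labels, 4))
```

In a pipeline like the one the abstract describes, the selected terms would define the reduced vocabulary from which document vectors are built before a classifier such as Naive Bayes or an SVM is trained.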
Pages: 51-56
Page count: 6