Feature selection for text classification: A review

Authors
Xuelian Deng
Yuqing Li
Jian Weng
Jilian Zhang
Affiliations
[1] Guangxi University of Chinese Medicine, College of Public Health and Management
[2] Jinan University, College of Information Science and Technology
[3] Jinan University, College of Cyber Security
Source
Multimedia Tools and Applications
Keywords
Feature selection; Text classification; Text classifiers; Multimedia
Abstract
Big multimedia data is inherently heterogeneous: it may mix video, audio, text, and images. This heterogeneity stems from the prevalence of novel applications in recent years, such as social media, video sharing, and location-based services (LBS). In many multimedia applications, for example video/image tagging and multimedia recommendation, text classification techniques are used extensively to facilitate multimedia data processing. In this paper, we give a comprehensive review of feature selection techniques for text classification. We begin by introducing some popular document representation schemes and the similarity measures used in text classification. Then, we review the most popular text classifiers, including the Nearest Neighbor (NN) method, Naïve Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), and Neural Networks. Next, we survey four feature selection models, namely filter, wrapper, embedded, and hybrid, and discuss the pros and cons of state-of-the-art feature selection approaches. Finally, we conclude the paper and briefly introduce some interesting feature selection work that does not belong to any of the four models.
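As a rough illustration of the filter model named in the abstract: a filter scores each term independently of any classifier and keeps only the top-ranked terms. The sketch below assumes the chi-square statistic as the scoring function (one common filter criterion; the abstract itself does not name one) and uses a four-document toy corpus invented for illustration. The function names `chi_square_scores` and `select_top_k` are hypothetical, not taken from the paper.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Chi-square score of each term with respect to one target class.

    Assumption: documents are pre-tokenized lists of strings, and a term
    counts as "present" if it occurs at least once in a document.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    df_total = Counter()  # number of documents containing the term
    df_pos = Counter()    # number of target-class documents containing it
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_total[term] += 1
            if y == target:
                df_pos[term] += 1

    scores = {}
    for term, df in df_total.items():
        a = df_pos[term]    # term present, class is target
        b = df - a          # term present, class is not target
        c = n_pos - a       # term absent, class is target
        d = n - n_pos - b   # term absent, class is not target
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy corpus, invented for illustration only.
docs = [
    "cheap pills buy now".split(),
    "buy cheap watches now".split(),
    "meeting agenda for monday".split(),
    "monday project meeting notes".split(),
]
labels = ["spam", "spam", "ham", "ham"]

scores = chi_square_scores(docs, labels, target="spam")
top_terms = select_top_k(scores, 3)
```

Chi-square is symmetric in the two classes, so terms that strongly indicate the non-target class (here `meeting`, `monday`) rank as high as terms that indicate the target class. No classifier is trained during scoring; that independence from the classifier is what separates a filter from the wrapper and embedded models.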
Pages: 3797-3816
Page count: 19
Citation
Deng, Xuelian; Li, Yuqing; Weng, Jian; Zhang, Jilian. Feature selection for text classification: A review [J]. Multimedia Tools and Applications, 2019, 78(3): 3797-3816.