A Survey on Filter Techniques for Feature Selection in Text Mining

Cited by: 14
Authors
Bharti, Kusum Kumari [1]
Singh, Pramod Kumar [1]
Affiliations
[1] ABV Indian Inst Informat Technol & Management Gwa, Computat Intelligence & Data Min Res Lab, Gwalior, Madhya Pradesh, India
Keywords
Text mining; Text categorization; Text clustering; Feature extraction; Feature selection; Filter methods; PRINCIPAL COMPONENT ANALYSIS; PARTICLE SWARM OPTIMIZATION; INFORMATION GAIN; ALGORITHM; CLASSIFICATION;
DOI
10.1007/978-81-322-1602-5_154
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A large portion of a document usually consists of irrelevant features. Rather than helping to identify the actual context of the document, such features increase the dimensionality of the representation model and the computational complexity of the underlying algorithm, and hence adversely affect performance. This makes it necessary to select the relevant features from the given feature space. Feature selection plays a key role in removing irrelevant features from the original feature space. Feature selection methods are broadly categorized into three groups: filter, wrapper, and embedded. Filter methods are widely used in text mining because of their simplicity, low computational complexity, and efficiency. In this article, we provide a brief survey of filter feature selection methods along with some of the recent developments in this area.
Pages: 1545-1559
Number of pages: 15