Ensemble feature selection for multi-label text classification: An intelligent order statistics approach

Cited by: 13
Authors
Miri, Mohsen [1]
Dowlatshahi, Mohammad Bagher [1]
Hashemi, Amin [1]
Rafsanjani, Marjan Kuchaki [2]
Gupta, Brij B. [3, 4, 5, 6, 7]
Alhalabi, W. [8]
Affiliations
[1] Lorestan Univ, Fac Engn, Dept Comp Engn, Khorramabad, Iran
[2] Shahid Bahonar Univ Kerman, Fac Math & Comp, Dept Comp Sci, Kerman, Iran
[3] Asia Univ, Int Ctr AI & Cyber Secur Res & Innovat, Taichung 413, Taiwan
[4] Asia Univ, Dept Comp Sci & Informat Engn, Taichung 413, Taiwan
[5] Lebanese Amer Univ, Beirut, Lebanon
[6] Univ Petr & Energy Studies UPES, Ctr Interdisciplinary Res, Dehra Dun, Uttarakhand, India
[7] Skyline Univ Coll, Res & Innovat Dept, Sharjah, U Arab Emirates
[8] Univ Miami, Dept Elect & Comp Engn, Coral Gables, FL 33124 USA
Keywords
ensemble feature selection; multi-label classification; multi-label feature selection; order statistic; text classification; gravitational search algorithm; optimization
DOI
10.1002/int.23044
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid growth of data, especially text, multi-label text classification has become increasingly valuable. Preprocessing, and intelligent feature selection (FS) in particular, is among the most important steps in classification. Each FS method finds the best features according to its own approach, so we use a multi-strategy approach to find more useful features. Evaluating and comparing the importance and relevance of features with multiple strategies and methods is more suitable than conventional approaches because each feature is measured from several perspectives. Ensemble FS merges the final results of various methods to take advantage of their different strengths and classify better. In this article, we propose, for the first time, an ensemble FS method for multi-label text data (MLTD) based on order statistics (EMFS). We use four multi-label FS (MLFS) algorithms with different performance characteristics to achieve a good result. Order statistics, one of the most important statistical methods, is used to aggregate the ranks produced by the different algorithms; this aggregation is robust against noisy, redundant, and inessential features. Finally, the performance of EMFS was evaluated on six MLTDs according to six performance criteria (ranking-based and classification-based). The proposed method was more accurate than the other methods on all MLTDs used, improving on them by 1.5%. This value is based on the results for all six evaluation criteria and all tested data sets.
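The rank-aggregation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' EMFS procedure: the record does not give its details, so the code below simply orders features by a chosen order statistic (the k-th smallest normalized rank) across several selectors. The function names `scores_to_ranks` and `aggregate_by_order_statistic`, the parameter `k`, and the toy scores are all hypothetical.

```python
from typing import List, Sequence


def scores_to_ranks(scores: Sequence[float]) -> List[float]:
    """Convert feature scores (higher is better) to normalized ranks in (0, 1].

    The best feature gets rank 1/n, the worst gets rank 1.0.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * n
    for position, feature in enumerate(order, start=1):
        ranks[feature] = position / n
    return ranks


def aggregate_by_order_statistic(
    rank_lists: Sequence[Sequence[float]], k: int
) -> List[int]:
    """Order features by the k-th smallest normalized rank across selectors.

    k is 1-based: k=1 uses each feature's best rank, k=len(rank_lists) its worst.
    Ties are broken by feature index (illustrative assumption).
    """
    n_features = len(rank_lists[0])
    kth_values = []
    for f in range(n_features):
        per_selector = sorted(ranks[f] for ranks in rank_lists)
        kth_values.append(per_selector[k - 1])
    return sorted(range(n_features), key=lambda f: kth_values[f])


# Toy scores from three hypothetical selectors for four features.
selector_scores = [
    [0.9, 0.1, 0.5, 0.3],
    [0.8, 0.2, 0.7, 0.1],
    [0.1, 0.2, 0.9, 0.8],
]
rank_lists = [scores_to_ranks(s) for s in selector_scores]
# k=2 is the median of three selectors, which damps one outlying ranking.
final_order = aggregate_by_order_statistic(rank_lists, k=2)
```

Here feature 0 is ranked last by the third selector but still comes first overall, because the second-smallest rank ignores a single dissenting selector. That is one simple way an order statistic can make an ensemble less sensitive to a noisy ranker.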
Pages: 11319-11341
Number of pages: 23
Related papers
50 records in total
  • [1] Multi-label text classification with an ensemble feature space
    Tandon, Kushagri
    Chatterjee, Niladri
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (05) : 4425 - 4436
  • [2] Multi-label text classification with an ensemble feature space
    Tandon, Kushagri
    Chatterjee, Niladri
    [J]. Journal of Intelligent and Fuzzy Systems, 2022, 42 (05) : 4425 - 4436
  • [3] An Ensemble Embedded Feature Selection Method for Multi-Label Clinical Text Classification
    Guo, Yumeng
    Chung, Fulai
    Li, Guozheng
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 823 - 826
  • [4] A lightweight filter based feature selection approach for multi-label text classification
    Dhal P.
    Azad C.
    [J]. Journal of Ambient Intelligence and Humanized Computing, 2023, 14 (09) : 12345 - 12357
  • [5] A COPRAS-based Approach to Multi-Label Feature Selection for Text Classification
    Mohanrasu, S. S.
    Janani, K.
    Rakkiyappan, R.
    [J]. MATHEMATICS AND COMPUTERS IN SIMULATION, 2024, 222 : 3 - 23
  • [6] Improving Multi-Label Medical Text Classification by Feature Selection
    Glinka, Kinga
    Wozniak, Rafal
    Zakrzewska, Danuta
    [J]. 2017 IEEE 26TH INTERNATIONAL CONFERENCE ON ENABLING TECHNOLOGIES - INFRASTRUCTURE FOR COLLABORATIVE ENTERPRISES (WETICE), 2017, : 176 - 181
  • [7] Optimization approach for feature selection in multi-label classification
    Lim, Hyunki
    Lee, Jaesung
    Kim, Dae-Won
    [J]. PATTERN RECOGNITION LETTERS, 2017, 89 : 25 - 30
  • [8] Multi-Label Bioinformatics Data Classification With Ensemble Embedded Feature Selection
    Guo, Yumeng
    Chung, Fu-Lai
    Li, Guozheng
    Zhang, Lei
    [J]. IEEE ACCESS, 2019, 7 : 103863 - 103875
  • [9] Feature Selection for Multi-label Classification Problems
    Doquire, Gauthier
    Verleysen, Michel
    [J]. ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2011, PT I, 2011, 6691 : 9 - 16
  • [10] Feature Selection for Hierarchical Multi-label Classification
    da Silva, Luan V. M.
    Cerri, Ricardo
    [J]. ADVANCES IN INTELLIGENT DATA ANALYSIS XIX, IDA 2021, 2021, 12695 : 196 - 208