Comparing Feature Selection Methods by Using Rank Aggregation

Times cited: 0
Authors
Zheng, Wanwan [1]
Jin, Mingzhe [1]
Affiliations
[1] Doshisha Univ, Grad Sch Culture & Informat Sci, Kyoto, Japan
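Keywords
feature selection; text classification; effectiveness; general versatility; ranking of feature selection methods; TRANSFORM; IMAGE
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract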
Feature selection (FS) is becoming critical in the current data era. Selecting effective features from datasets is a particularly important part of text classification, data mining, pattern recognition, and artificial intelligence. FS excludes irrelevant features from the classification task, reduces the dimensionality of a dataset, allows us to better understand the data, improves the performance of machine learning techniques, and minimizes computational requirements. A large number of FS methods have been proposed so far; however, which one is most effective in practice remains unclear. Although different categories of FS methods plausibly apply different evaluation criteria to variables, few studies have focused on evaluating FS methods across categories. This article gathers ten high-performing FS methods from four different categories and evaluates and compares their general versatility (the consistent ability to select useful features) on authorship attribution problems. It also tries to identify which method is the most effective. A support vector machine (SVM) serves as the classifier. Different categories of features, different numbers of top-ranked variables in the feature rankings, and different performance measures are used together to measure the effectiveness and general versatility of these methods. Finally, the Schulze rank aggregation method (SSD) is used to produce a ranking of the ten FS methods. The results suggest that Mahalanobis distance is the best method overall.
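The abstract names the Schulze method as the rank aggregation step but gives no details of how it was applied. The following is a minimal Python sketch of generic Schulze aggregation, not the authors' implementation. It assumes each "ballot" is a complete, tie-free ranking of the same FS methods (for example, one ranking per dataset, feature category, or performance measure). The function name `schulze_ranking` and the method labels `method_a`, `method_b`, and `method_c` are hypothetical placeholders, and the ballots are made-up illustration data rather than results from the paper.

```python
from itertools import permutations


def schulze_ranking(ballots):
    """Aggregate complete rankings (best first) with the Schulze method.

    Assumption: every ballot ranks the same candidates, without ties.
    """
    candidates = sorted(ballots[0])

    # d[a][b]: number of ballots that rank a ahead of b.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ballot in ballots:
        position = {c: i for i, c in enumerate(ballot)}
        for a, b in permutations(candidates, 2):
            if position[a] < position[b]:
                d[a][b] += 1

    # p[a][b]: strength of the strongest path from a to b.
    # A direct link counts only if a beats b head to head.
    p = {
        a: {b: d[a][b] if d[a][b] > d[b][a] else 0 for b in candidates}
        for a in candidates
    }

    # Widest-path variant of Floyd-Warshall: a path is as strong as
    # its weakest link, and we keep the strongest such path.
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))

    # a is preferred to b when p[a][b] > p[b][a]; order by number of wins.
    # Remaining ties are broken alphabetically (an arbitrary choice here).
    wins = {
        a: sum(p[a][b] > p[b][a] for b in candidates if b != a)
        for a in candidates
    }
    return sorted(candidates, key=lambda c: (-wins[c], c))


# Made-up ballots: three hypothetical evaluation settings, each ranking
# three hypothetical FS methods from best to worst.
example_ballots = [
    ["method_a", "method_b", "method_c"],
    ["method_a", "method_c", "method_b"],
    ["method_b", "method_a", "method_c"],
]
print(schulze_ranking(example_ballots))
```

In this toy input, `method_a` beats both other methods head to head and `method_b` beats `method_c`, so the aggregated order is `method_a`, `method_b`, `method_c`. Ordering by win count is one simple way to turn the pairwise Schulze relation into a single list; other tie-breaking rules are possible.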
Pages: 1-6
Number of pages: 6