Comparing Feature Selection Methods by Using Rank Aggregation

Cited by: 0
Authors
Zheng, Wanwan [1]
Jin, Mingzhe [1]
Affiliation
[1] Doshisha Univ, Grad Sch Culture & Informat Sci, Kyoto, Japan
Keywords
feature selection; text classification; effectiveness; general versatility; ranking of feature selection methods; TRANSFORM; IMAGE
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Feature selection (FS) is becoming critical in the current data era. Selecting effective features from datasets is a particularly important part of text classification, data mining, pattern recognition, and artificial intelligence. FS excludes irrelevant features from the classification task, reduces the dimensionality of a dataset, makes the data easier to understand, improves the performance of machine learning techniques, and minimizes the computation required. A large number of FS methods have been proposed so far; however, which one is most effective in practice remains unclear. Although different categories of FS methods plausibly apply different evaluation criteria to variables, few studies have focused on evaluating FS methods across categories. This article gathers ten superior FS methods from four different categories and evaluates and compares their general versatility (the consistent ability to select useful features) on authorship attribution problems. It also tries to identify which method is most effective. A support vector machine (SVM) serves as the classifier. Different categories of features, different numbers of top-ranked variables in the feature rankings, and different performance measures are used together to measure the effectiveness and general versatility of these methods. Finally, the Schulze rank aggregation method (SSD) is used to produce a ranking of the ten FS methods. The results suggest that Mahalanobis distance is the best method overall.
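The rank aggregation step mentioned in the abstract can be illustrated with a minimal sketch of the Schulze method. This is not the authors' implementation: the function name `schulze_ranking`, the method labels "A", "B", "C", the example rankings, and the alphabetical tie-break are all assumptions made for illustration. Each input ranking stands for one experimental condition (for example, one feature category or one performance measure) that orders the FS methods from best to worst.

```python
def schulze_ranking(ballots, candidates):
    """Aggregate several rankings into one using the Schulze method.

    ballots: list of rankings, each a list of all candidates, best first.
    candidates: list of candidate labels (hypothetical FS method names).
    """
    # d[a][b] = number of rankings that place a above b
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ballot in ballots:
        pos = {c: i for i, c in enumerate(ballot)}
        for a in candidates:
            for b in candidates:
                if a != b and pos[a] < pos[b]:
                    d[a][b] += 1

    # p[a][b] = strength of the strongest path from a to b,
    # initialised with the direct pairwise victories
    p = {a: {b: 0 for b in candidates} for a in candidates}
    for a in candidates:
        for b in candidates:
            if a != b and d[a][b] > d[b][a]:
                p[a][b] = d[a][b]

    # Floyd-Warshall style widest-path update
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))

    # a beats b when its strongest path to b is stronger than the reverse;
    # ordering by number of such wins is consistent with that relation
    wins = {
        a: sum(1 for b in candidates if a != b and p[a][b] > p[b][a])
        for a in candidates
    }
    # Ties are broken alphabetically (an assumption of this sketch)
    return sorted(candidates, key=lambda c: (-wins[c], c))


# Three hypothetical rankings of three hypothetical FS methods
rankings = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(schulze_ranking(rankings, ["A", "B", "C"]))  # -> ['A', 'B', 'C']
```

In this toy input, "A" is preferred to "B" in two of three rankings and to "C" in all three, so it comes first in the aggregated order.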
Pages: 1 / 6
Number of pages: 6