Feature ranking based consensus clustering for feature subset selection

Cited by: 0
Authors
Rani, D. Sandhya [1 ,2 ]
Rani, T. Sobha [2 ]
Bhavani, S. Durga [2 ]
Krishna, G. Bala [1 ]
Affiliations
[1] CVR Coll Engn, Comp Sci & Engn, Hyderabad 501015, Telangana, India
[2] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad 500046, Telangana, India
Keywords
Feature subset; Consensus clustering; Feature ranking; Large dataset; MUTUAL INFORMATION; CLASSIFICATION; ALGORITHM; RELEVANCE;
DOI
10.1007/s10489-024-05566-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature subset selection is an NP-hard problem, so there is a need for computationally efficient algorithms that find near-optimal feature subsets that improve the performance of a classifier. Two major challenges for feature subset selection are high-dimensional data, that is, data with a large number of features, and large datasets, that is, data with a large number of instances. Scalability of feature selection algorithms, in terms of accuracy on high-dimensional data and running time on large datasets, is an important issue. We propose a consensus clustering based approach to feature selection that addresses these issues. Many computationally efficient greedy feature ranking algorithms exist in the literature, and each assigns a different ranking order to the features. A consensus among these rankings may provide a feature ranking that performs well with respect to both time and accuracy. The goal of this work is to propose efficient algorithms that work on small as well as large datasets. The contributions of this work are: (i) a fast and scalable approach to feature selection, Feature Ranking based on Consensus Clustering (FRCC), designed using feature ranking algorithms available in the literature; and (ii) a parallelizable version of FRCC, namely Hybrid Feature Selection (HFS), proposed to address feature reduction in large datasets. The implementation results show that FRCC clearly outperforms many recent algorithms in the literature on small as well as large dimensional datasets. HFS has been implemented on datasets with lakhs (hundreds of thousands) of instances and dimensionality in the hundreds and thousands. HFS proves to be very effective in terms of feature reduction and accuracy in comparison with the results obtained by recent algorithms in the literature.
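The idea of forming a consensus among several feature rankings can be illustrated with a small sketch. This is not the FRCC procedure: the abstract does not specify how the consensus clustering is performed, so the sketch swaps in a simple mean-rank (Borda-style) aggregation instead. The function name `consensus_ranking`, the feature names, and the three input rankings are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def consensus_ranking(rankings):
    """Aggregate feature rankings (best feature first) by summed rank position.

    Assumption: this is a Borda-style stand-in for a consensus step,
    not the consensus clustering method described in the abstract.
    """
    if not rankings:
        raise ValueError("at least one ranking is required")
    features = set(rankings[0])
    position_sum = defaultdict(int)
    for ranking in rankings:
        # Every ranking must order the same features, each exactly once.
        if len(ranking) != len(features) or set(ranking) != features:
            raise ValueError("rankings must order the same features exactly once")
        for position, feature in enumerate(ranking):
            position_sum[feature] += position
    # A lower summed position means a better consensus rank; ties break by name.
    return sorted(features, key=lambda f: (position_sum[f], f))


# Hypothetical rankings, as if produced by three different filter methods.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f2", "f1", "f4", "f3"],
    ["f1", "f3", "f2", "f4"],
]
consensus = consensus_ranking(rankings)
top_k = consensus[:2]  # keep the two best-ranked features as the subset
print(consensus, top_k)
```

Because each input ranking can come from an independent, cheap filter method, the rankings could in principle be computed in parallel before aggregation, which is the kind of property the abstract attributes to HFS.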
Pages: 8154 - 8169
Number of pages: 16
Related papers
50 entries in total
  • [21] Differential Evolution based Feature Subset Selection
    Khushaba, Rami N.
    Al-Ani, Ahmed
    Al-Jumaily, Adel
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 3674 - 3677
  • [22] Feature Subset Selection based on Filter Technique
    Bibi, K. Fathima
    Banu, M. Nazreen
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATIONS TECHNOLOGIES (ICCCT 15), 2015, : 1 - 6
  • [23] Feature subset selection based on the genetic algorithm
    Yang, Jingwei
    Wang, Sile
    Chen, Yingyi
    Lu, Sukui
    Yang, Wenzhu
    ADVANCED TECHNOLOGIES IN MANUFACTURING, ENGINEERING AND MATERIALS, PTS 1-3, 2013, 774-776 : 1532 - +
  • [24] A clustering-based feature selection via feature separability
    Jiang, Shengyi
    Wang, Lianxi
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 31 (02) : 927 - 937
  • [25] UNSUPERVISED FEATURE RANKING AND SELECTION BASED ON AUTOENCODERS
    Sharifipour, Sasan
    Fayyazi, Hossein
    Sabokrou, Mohammad
    Adeli, Ehsan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3172 - 3176
  • [26] Feature selection based on partition clustering
    Liu, Shuang
    Zhao, Qiang
    Wu, Xiang
    INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2014, 18 (02) : 135 - 142
  • [27] Unsupervised Feature Selection with Feature Clustering
    Cheung, Yiu-ming
    Jia, Hong
    2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 1, 2012, : 9 - 15
  • [28] Neighborhood Ranking-Based Feature Selection
    Ipkovich, Adam
    Abonyi, Janos
    IEEE ACCESS, 2024, 12 : 20152 - 20168
  • [29] Improving Incremental Wrapper-Based Feature Subset Selection by Using Re-ranking
    Bermejo, Pablo
    Gamez, Jose A.
    Puerta, Jose M.
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT I, PROCEEDINGS, 2010, 6096 : 580 - 589
  • [30] Clustering-based feature selection
    School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, China
    Tien Tzu Hsueh Pao, 2008, SUPPL. (157-160):