A Mutual Information-Based Hybrid Feature Selection Method for Software Cost Estimation Using Feature Clustering

Cited by: 7
Authors
Liu, Qin [1]
Shi, Shihai [1]
Zhu, Hongming [1]
Xiao, Jiakai [1]
Affiliations
[1] Tongji Univ, Sch Software Engn, Shanghai 200092, Peoples R China
Keywords
software cost estimation; feature selection; mutual information; feature clustering
DOI
10.1109/COMPSAC.2014.99
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Feature selection methods aim to obtain, from the original features, the optimal feature subset that yields the most accurate prediction. So far, supervised and unsupervised feature selection methods have been discussed and developed separately. However, for some data sets the two can be combined into a hybrid feature selection method. In this paper, we propose a mutual information-based (MI-based) hybrid feature selection method using feature clustering. In the unsupervised learning stage, the original features are grouped into several clusters according to their similarity to one another using agglomerative hierarchical clustering. Then, in the supervised learning stage, the feature in each cluster that maximizes the similarity with the response feature, which represents the class label, is selected as the representative feature. These representative features compose the feature subset. Our contributions include 1) the newly proposed feature selection method and 2) the application of feature clustering to software cost estimation. The proposed method employs a wrapper approach, so it can evaluate the prediction performance of each feature subset to determine the optimal one. The experimental results in software cost estimation demonstrate that the proposed method outperforms the supervised feature selection methods INMIFS and mRMRFS by at least 11.5% and 14.8% on the ISBSG R8 and Desharnais data sets in terms of PRED(0.25).
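The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes discrete features, uses 1 minus normalized mutual information as the feature distance, uses average linkage, and takes a fixed cluster count, none of which the abstract specifies. The wrapper step that searches for the best subset is omitted. All function names and the toy data are hypothetical. PRED(0.25) is computed with its common definition, the fraction of projects whose magnitude of relative error is at most 0.25.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def cluster_features(features, n_clusters):
    """Unsupervised stage: average-linkage agglomerative clustering.

    Assumption: distance = 1 - MI(a, b) / max(H(a), H(b)), where
    H(a) = MI(a, a). The abstract does not state the measure used.
    """
    def dist(a, b):
        mi = mutual_information(features[a], features[b])
        h = max(mutual_information(features[a], features[a]),
                mutual_information(features[b], features[b]))
        return 1.0 - (mi / h if h > 0 else 0.0)

    clusters = [[name] for name in features]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist(a, b) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters


def select_representatives(features, target, n_clusters):
    """Supervised stage: per cluster, keep the feature with highest MI to target."""
    return [max(cluster, key=lambda f: mutual_information(features[f], target))
            for cluster in cluster_features(features, n_clusters)]


def pred(actual, predicted, level=0.25):
    """PRED(level): share of estimates with |actual - predicted| / actual <= level."""
    hits = sum(abs(a - p) / a <= level for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical toy data: two informative features, each with a noisy copy.
features = {
    "size":       [0, 0, 1, 1, 0, 0, 1, 1],
    "size_noisy": [0, 0, 1, 1, 0, 0, 1, 0],
    "lang":       [0, 1, 0, 1, 0, 1, 0, 1],
    "lang_noisy": [0, 1, 0, 1, 0, 1, 1, 1],
}
effort_class = [0, 1, 2, 3, 0, 1, 2, 3]  # discretized response feature

selected = select_representatives(features, effort_class, n_clusters=2)
print(selected)
print(pred([100, 200, 400], [110, 300, 390]))
```

In a wrapper setting, `select_representatives` would be called for several candidate cluster counts, and each resulting subset would be scored by an estimator using a criterion such as `pred`.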
Pages: 27-32
Page count: 6