Bagging and Feature Selection for Classification with Incomplete Data

Cited by: 5
Authors
Cao Truong Tran [1]
Zhang, Mengjie [1]
Andreae, Peter [1]
Xue, Bing [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Keywords
Incomplete data; Ensemble; Feature selection; Classification; Particle swarm optimisation; C4.5; REPTree; Missing data
DOI
10.1007/978-3-319-55849-3_31
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Missing values are an unavoidable issue in many real-world datasets. Dealing with missing values is an essential requirement in classification problems, because inadequate treatment of missing values often leads to large classification errors. Some classifiers can work directly with incomplete data, but they often produce large classification errors and generate complex models. Feature selection and bagging have been used successfully to improve classification, but they are mainly applied to complete data. This paper proposes a combination of bagging and feature selection to improve classification with incomplete data. To achieve this, a wrapper-based feature selection method that can work directly with incomplete data is used to select suitable feature subsets for bagging. Experiments on eight incomplete datasets were designed to compare the proposed method with three other popular methods that can deal with incomplete data, using C4.5/REPTree as classifiers and Particle Swarm Optimisation as the search technique in feature selection. Results show that the combination of bagging and feature selection not only achieves better classification accuracy than the other methods but also generates less complex models than the bagging method.
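The general idea in the abstract (bagging in which each ensemble member is trained on its own wrapper-selected feature subset, using a learner that tolerates missing values) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. It makes several assumptions: a nearest-centroid classifier that skips missing values stands in for C4.5/REPTree, plain random search stands in for Particle Swarm Optimisation, out-of-bag rows are used to score candidate subsets, and ties are broken in favour of smaller subsets. All function names (`fit_centroids`, `select_features`, `fit_bagged`, `predict_bagged`) and the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_centroids(rows, labels, feats):
    """Per-class mean of each selected feature, skipping missing (None) values."""
    model = {}
    for c in sorted(set(labels)):
        model[c] = {}
        for f in feats:
            vals = [r[f] for r, y in zip(rows, labels) if y == c and r[f] is not None]
            if vals:
                model[c][f] = sum(vals) / len(vals)
    return model

def predict_one(model, row, feats):
    """Nearest centroid over features observed in both the row and the centroid."""
    best, best_d = min(model), float("inf")  # fallback: smallest class label
    for c in sorted(model):
        shared = [f for f in feats if row[f] is not None and f in model[c]]
        if shared:
            d = sum((row[f] - model[c][f]) ** 2 for f in shared) / len(shared)
            if d < best_d:
                best, best_d = c, d
    return best

def accuracy(model, rows, labels, feats):
    hits = sum(predict_one(model, r, feats) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def select_features(train, val, n_feats, rng, trials=20):
    """Wrapper selection: keep the subset with the best validation accuracy.
    Assumption: random search replaces PSO; ties favour smaller subsets."""
    best_feats, best_acc = list(range(n_feats)), -1.0
    for _ in range(trials):
        feats = [f for f in range(n_feats) if rng.random() < 0.5] or [rng.randrange(n_feats)]
        acc = accuracy(fit_centroids(*train, feats), *val, feats)
        if acc > best_acc or (acc == best_acc and len(feats) < len(best_feats)):
            best_feats, best_acc = feats, acc
    return best_feats

def fit_bagged(rows, labels, n_models=11, seed=0):
    """Train one (feature subset, model) pair per bootstrap sample."""
    rng = random.Random(seed)
    n, n_feats = len(rows), len(rows[0])
    ensemble = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen] or idx  # out-of-bag rows
        train = ([rows[i] for i in idx], [labels[i] for i in idx])
        val = ([rows[i] for i in oob], [labels[i] for i in oob])
        feats = select_features(train, val, n_feats, rng)
        ensemble.append((feats, fit_centroids(*train, feats)))
    return ensemble

def predict_bagged(ensemble, row):
    """Majority vote over the ensemble members."""
    votes = Counter(predict_one(m, row, feats) for feats, m in ensemble)
    return votes.most_common(1)[0][0]

# Toy incomplete dataset: two well-separated classes, some values missing (None).
rows, labels = [], []
for i in range(30):
    y = i % 2
    row = [10.0 * y + 0.1 * ((7 * i + 3 * j) % 5) for j in range(3)]
    if i % 3 == 0:
        row[(i // 3) % 3] = None
    rows.append(row)
    labels.append(y)

ensemble = fit_bagged(rows, labels)
print(predict_bagged(ensemble, [0.2, None, 0.3]))
print([feats for feats, _ in ensemble])
```

Because each member stores only its selected features, the per-member models in this sketch are no larger than a model built on all features, which loosely mirrors the model-complexity comparison against plain bagging mentioned in the abstract.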
Pages: 471-486
Page count: 16
Related papers
50 in total
  • [1] Online feature selection and classification with incomplete data
    Kalkan, Habil
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2014, 22 (06) : 1625 - 1636
  • [2] Handling incomplete data classification using imputed feature selected bagging (IFBag) method
    Khan, Ahmad Jaffar
    Raza, Basit
    Shahid, Ahmad Raza
    Kumar, Yogan Jaya
    Faheem, Muhammad
    Alquhayz, Hani
    [J]. INTELLIGENT DATA ANALYSIS, 2021, 25 (04) : 825 - 846
  • [3] Stable bagging feature selection on medical data
    Salem Alelyani
    [J]. Journal of Big Data, 8
  • [4] Stable bagging feature selection on medical data
    Alelyani, Salem
    [J]. JOURNAL OF BIG DATA, 2021, 8 (01)
  • [5] A Classification Method for Incomplete Mixed Data Using Imputation and Feature Selection
    Li, Gengsong
    Zheng, Qibin
    Liu, Yi
    Li, Xiang
    Qin, Wei
    Diao, Xingchun
    [J]. APPLIED SCIENCES-BASEL, 2024, 14 (14):
  • [6] Feature Selection and Classification for High-Dimensional Incomplete Multimodal Data
    Deng, Wan-Yu
    Liu, Dan
    Dong, Ying-Ying
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2018, 2018
  • [7] Improving performance of classification on incomplete data using feature selection and clustering
    Cao Truong Tran
    Zhang, Mengjie
    Andreae, Peter
    Xue, Bing
    Lam Thu Bui
    [J]. APPLIED SOFT COMPUTING, 2018, 73 : 848 - 861
  • [8] Classification of brain glioma by using SVMs bagging with feature selection
    Li, Guo-Zheng
    Liu, Tian-Yu
    Cheng, Victor S.
    [J]. DATA MINING FOR BIOMEDICAL APPLICATIONS, PROCEEDINGS, 2006, 3916 : 124 - 130
  • [9] Robust Feature Selection on Incomplete Data
    Zheng, Wei
    Zhu, Xiaofeng
    Zhu, Yonghua
    Zhang, Shichao
    [J]. PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 3191 - 3197
  • [10] Improving performance for classification with incomplete data using wrapper-based feature selection
    Tran C.T.
    Zhang M.
    Andreae P.
    Xue B.
    [J]. Evolutionary Intelligence, 2016, 9 (3) : 81 - 94