A New Representation in PSO for Discretization-Based Feature Selection

Cited by: 145
Authors
Tran, Binh [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria University of Wellington, Evolutionary Computation Research Group, Wellington, New Zealand
Keywords
Classification; discretization; feature selection (FS); high-dimensional data; particle swarm optimization (PSO); algorithm
DOI
10.1109/TCYB.2017.2714145
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In machine learning, discretization and feature selection (FS) are important preprocessing techniques for improving the performance of an algorithm on high-dimensional data. Since many FS methods require discrete data, a common practice is to apply discretization before FS. In addition, for the sake of efficiency, features are usually discretized individually (univariate discretization). This scheme assumes that each feature influences the task independently, which may not hold when feature interactions exist. Univariate discretization may therefore degrade the performance of the FS stage, since information about feature interactions may be lost during discretization. Initial results of our previously proposed method, evolve particle swarm optimization (EPSO), showed that combining discretization and FS in a single stage using bare-bones particle swarm optimization (BBPSO) can lead to better performance than applying them in two separate stages. In this paper, we propose a new method called potential particle swarm optimization (PPSO), which employs a new representation that reduces the search space of the problem and a new fitness function that better evaluates candidate solutions to guide the search. The results on ten high-dimensional datasets show that PPSO selects fewer than 5% of the features on all datasets. Compared with the two-stage approach, which uses BBPSO for FS on the discretized data, PPSO achieves significantly higher accuracy on seven datasets. In addition, PPSO obtains better (or similar) classification performance than EPSO on eight datasets, with a smaller number of selected features on six datasets. Furthermore, PPSO also outperforms the three compared (traditional) methods and performs similarly to one method on most datasets in terms of both generalization ability and learning capacity.
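The single-stage idea described in the abstract can be illustrated with a minimal sketch. The BBPSO update below follows the standard bare-bones rule (each dimension is sampled from a Gaussian centred midway between the personal and global best). The encoding in `decode`, where each particle dimension holds one cut point and an out-of-range cut point drops the feature, is an assumption made for illustration only; the abstract does not specify the EPSO or PPSO representation, and the names `bbpso_step`, `decode`, and `discretize` are hypothetical.

```python
import random


def bbpso_step(pbest, gbest, rng):
    """Standard bare-bones PSO update: no velocity term. Each dimension is
    drawn from a Gaussian with mean (pbest + gbest) / 2 and standard
    deviation |pbest - gbest|."""
    return [rng.gauss((p + g) / 2.0, abs(p - g)) for p, g in zip(pbest, gbest)]


def decode(position, feature_ranges):
    """ASSUMED encoding (illustrative, not taken from the paper): one cut
    point per feature. A cut point strictly inside the feature's
    (min, max) range selects the feature; otherwise it is discarded."""
    return {
        j: cut
        for j, (cut, (lo, hi)) in enumerate(zip(position, feature_ranges))
        if lo < cut < hi
    }


def discretize(row, cuts):
    """Binarize the selected features of one sample with their cut points,
    so discretization and selection happen in the same step."""
    return [int(row[j] > cut) for j, cut in sorted(cuts.items())]


if __name__ == "__main__":
    rng = random.Random(0)
    ranges = [(0.0, 1.0), (0.0, 1.0), (2.0, 3.0)]  # hypothetical feature ranges
    position = bbpso_step([0.4, 1.5, 2.4], [0.6, 1.7, 2.6], rng)
    cuts = decode(position, ranges)
    print("selected features and cut points:", cuts)
    print("discretized sample:", discretize([0.7, 0.3, 2.1], cuts))
```

In a full wrapper method, each decoded particle would be scored by a fitness function, for example the accuracy of a classifier trained on the discretized, selected features; that part is omitted here because the abstract does not give the fitness definition.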
Pages: 1733-1746
Number of pages: 14