Feature Selection Using Genetic Algorithm for Big Data

Cited by: 3
Authors
Saidi, Rania [1]
Ncir, Waad Bouaguel [1]
Essoussi, Nadia [1]
Affiliation
[1] Univ Tunis, LARODEC, ISG, Tunis, Tunisia
Keywords
Feature selection; Genetic algorithm; MapReduce; Parallel computing; Big Data
DOI
10.1007/978-3-319-74690-6_35
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection is a powerful technique for dimensionality reduction and an important step in successful machine learning applications. In the last few decades, data sets have grown progressively larger in both the number of instances and the number of features, which makes the feature selection problem harder to handle. To cope with this new era of big data, new techniques need to be developed to address this problem effectively. However, current feature selection algorithms scale poorly and become inapplicable when data size exceeds hundreds of gigabytes. In this paper, we introduce a scalable implementation of a parallel feature selection approach based on a genetic algorithm, parallelized with the MapReduce model. The experimental results show that the proposed method can improve the performance of feature selection.
Pages: 352-361
Page count: 10
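
The abstract names the ingredients (a genetic algorithm for feature selection, parallelized with MapReduce) but gives no detail, so the following is only a rough sketch of how such a scheme might look, not the authors' implementation. It assumes a toy setup: binary features, a majority-vote classifier as the fitness function, and plain Python iteration standing in for a real MapReduce cluster. All function names (`make_data`, `map_partition`, `reduce_counts`, `evolve`) and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


def make_data(n_rows, n_features, n_informative, rng):
    """Synthetic rows: the first n_informative features copy the label 90% of the time."""
    rows = []
    for _ in range(n_rows):
        y = rng.randint(0, 1)
        x = [y if j < n_informative and rng.random() < 0.9 else rng.randint(0, 1)
             for j in range(n_features)]
        rows.append((x, y))
    return rows


def map_partition(partition, population):
    """Map step: for one data partition, emit (chromosome index, correct predictions)."""
    out = []
    for idx, mask in enumerate(population):
        selected = [j for j, bit in enumerate(mask) if bit]
        correct = 0
        for x, y in partition:
            votes = sum(x[j] for j in selected)
            # Toy classifier (an assumption): majority vote over the selected features.
            pred = 1 if selected and 2 * votes > len(selected) else 0
            correct += (pred == y)
        out.append((idx, correct))
    return out


def reduce_counts(mapped):
    """Reduce step: sum the per-partition counts for each chromosome index."""
    totals = defaultdict(int)
    for pairs in mapped:
        for idx, count in pairs:
            totals[idx] += count
    return totals


def evolve(partitions, n_features, pop_size=20, generations=15, penalty=0.01, seed=0):
    """Run a small GA; return (best fitness, best feature mask) seen during the run."""
    rng = random.Random(seed)
    n_rows = sum(len(p) for p in partitions)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        totals = reduce_counts(map_partition(p, population) for p in partitions)
        # Fitness: accuracy minus a small penalty per selected feature.
        fitness = [totals[i] / n_rows - penalty * sum(population[i])
                   for i in range(pop_size)]
        ranked = sorted(range(pop_size), key=lambda i: fitness[i], reverse=True)
        if best is None or fitness[ranked[0]] > best[0]:
            best = (fitness[ranked[0]], population[ranked[0]][:])
        parents = [population[i] for i in ranked[:pop_size // 2]]
        children = [parents[0][:]]  # elitism: carry the current best forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:  # occasional single-bit mutation
                j = rng.randrange(n_features)
                child[j] = 1 - child[j]
            children.append(child)
        population = children
    return best
```

In this sketch the data, not the population, is partitioned: each mapper scores every chromosome on its own slice and the reducer adds up the counts, so the summed result matches a single pass over the full data set. Whether the paper partitions the data, the population, or both is not stated in this record.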