Streaming feature selection algorithms for big data: A survey

Cited by: 33
Authors
AlNuaimi, Noura [1]
Masud, Mohammad Mehedy [1]
Serhani, Mohamed Adel [1]
Zaki, Nazar [1]
Affiliations
[1] United Arab Emirates Univ, Coll Informat Technol, Al Ain, U Arab Emirates
Keywords
Big data; Redundant features; Relevant features; Streaming feature grouping; Streaming feature selection; ONLINE FEATURE-SELECTION; MUTUAL INFORMATION; GRANULATION; RELEVANCE; ENTROPY;
DOI
10.1016/j.aci.2019.01.001
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Organizations in many domains generate a considerable amount of heterogeneous data every day. Such data can be processed to improve these organizations' decisions in real time. However, storing and processing large and varied datasets (known as big data) in real time is challenging. In machine learning, streaming feature selection is widely regarded as an effective technique for selecting a relevant subset of features from high-dimensional data and thus reducing learning complexity. In the relevant literature, streaming feature selection refers to the setting in which features arrive consecutively over time: the total number of features is not known in advance, whereas the number of instances is known. Many scholars in the field have proposed streaming feature selection algorithms to address this problem. This paper presents an exhaustive and methodical introduction to these techniques. It reviews the traditional feature selection algorithms and then examines the current streaming feature selection algorithms to determine their strengths and weaknesses. The survey also sheds light on the ongoing challenges in big data research.
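As a rough illustration of the setting the abstract describes (features arriving one at a time, with relevant features kept and redundant ones discarded), the sketch below may help. It is not an algorithm from the survey: it assumes a simple Pearson-correlation filter, and the names `pearson`, `select_streaming`, `relevance_min`, and `redundancy_max`, the threshold values, and the toy data are all hypothetical choices made for this example.

```python
import math
from typing import Dict, Iterable, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_streaming(
    stream: Iterable[Tuple[str, List[float]]],
    target: List[float],
    relevance_min: float = 0.3,   # assumed threshold, not taken from the paper
    redundancy_max: float = 0.9,  # assumed threshold, not taken from the paper
) -> List[str]:
    """Process features one at a time, in arrival order.

    A feature is kept only if it is relevant (correlated with the target)
    and not redundant (not highly correlated with an already kept feature).
    """
    selected: Dict[str, List[float]] = {}
    for name, column in stream:
        if abs(pearson(column, target)) < relevance_min:
            continue  # irrelevant: weak relationship with the target
        if any(abs(pearson(column, kept)) > redundancy_max
               for kept in selected.values()):
            continue  # redundant: nearly duplicates a kept feature
        selected[name] = column
    return list(selected)


# Toy data: the instances (rows) are known up front, features arrive over time.
target = [0, 0, 1, 1, 0, 1, 1, 0]
f1 = [0.1, 0.2, 0.9, 0.8, 0.0, 1.0, 0.7, 0.3]
stream = [
    ("f1", f1),                        # relevant
    ("f2", [2 * v for v in f1]),       # relevant but a rescaled copy of f1
    ("f3", [1, 0, 1, 0, 0, 1, 0, 1]),  # uncorrelated with the target
    ("f4", [0, 1, 1, 1, 0, 1, 0, 0]),  # relevant and distinct from f1
]
chosen = select_streaming(stream, target)
print(chosen)  # ['f1', 'f4']
```

The filter never needs to know how many features will arrive in total, which is the defining constraint of the streaming setting. Keywords such as "MUTUAL INFORMATION" suggest that published methods often use information-theoretic scores instead of correlation; the correlation measure here is only a stand-in.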
Pages: 113-135
Number of pages: 23