MapReduce-based parallel GEP algorithm for efficient function mining in big data applications

Cited by: 5
Authors
Liu, Yang [1]
Ma, Chenxiao [1]
Xu, Lixiong [1]
Shen, Xiaodong [1]
Li, Maozhen [2, 3]
Li, Pengcheng [4]
Affiliations
[1] Sichuan Univ, Sch Elect Engn & Informat, Chengdu 610065, Sichuan, Peoples R China
[2] Brunel Univ London, Dept Elect & Comp Engn, Uxbridge UB8 3PH, Middx, England
[3] Tongji Univ, Minist Educ, Key Lab Embedded Syst & Serv Comp, Shanghai 200092, Peoples R China
[4] State Grid Shanxi Elect Power Co, Xian 710048, Shaanxi, Peoples R China
Keywords
big data; GEP; Hadoop framework; medoid; parallelization; classification
DOI
10.1002/cpe.4379
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
The gene expression programming (GEP) algorithm is one of the most effective function mining algorithms for fitting mathematical equations to an input dataset. However, GEP suffers from low efficiency in big data processing because its evolution incurs a large overhead on large-scale data. To address this issue, this paper presents two parallelized GEP algorithms using MapReduce. The first algorithm, based on data separation, aims to speed up large-scale classification, but it lacks the ability to output the mined equation explicitly. The second parallelized GEP algorithm, which builds on further improvements to the first, aims to mine the equation efficiently and to output it explicitly and directly. The experimental results show that both algorithms are effective for processing large volumes of data.
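The abstract does not describe how either algorithm works internally, so the following is only an illustrative sketch of the general idea of data separation in a map/reduce style, not the authors' method. It assumes that the fitness of a candidate equation is its mean squared error, that the data can be scored chunk by chunk, and that everything runs in a single process rather than on Hadoop. All names (`candidate`, `split`, `map_partial_error`, `reduce_errors`, `mean_squared_error`) are hypothetical.

```python
from functools import reduce


def candidate(x):
    # Hypothetical candidate equation, standing in for a decoded GEP chromosome.
    return 2.0 * x + 1.0


def split(records, n_parts):
    # Data separation: deal the records into n_parts roughly equal chunks.
    return [records[i::n_parts] for i in range(n_parts)]


def map_partial_error(chunk, model):
    # Map step: each chunk yields (sum of squared errors, record count).
    sse = sum((model(x) - y) ** 2 for x, y in chunk)
    return sse, len(chunk)


def reduce_errors(a, b):
    # Reduce step: combine the partial results of two chunks.
    return a[0] + b[0], a[1] + b[1]


def mean_squared_error(records, model, n_parts=4):
    # Score the model on every chunk independently, then merge the partials.
    partials = [map_partial_error(c, model) for c in split(records, n_parts)]
    sse, count = reduce(reduce_errors, partials, (0.0, 0))
    return sse / count


# Toy dataset generated from y = 2x + 1, so the candidate fits it exactly.
records = [(float(x), 2.0 * x + 1.0) for x in range(10)]
print(mean_squared_error(records, candidate))
```

Because each chunk is scored independently, the map step could in principle be distributed across workers, while the reduce step only needs the small per-chunk summaries.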
Pages: 11