MapReduce-based parallel GEP algorithm for efficient function mining in big data applications

Cited by: 5

Authors:

Liu, Yang [1]

Ma, Chenxiao [1]

Xu, Lixiong [1]

Shen, Xiaodong [1]

Li, Maozhen [2, 3]

Li, Pengcheng [4]

Affiliations:

[1] Sichuan Univ, Sch Elect Engn & Informat, Chengdu 610065, Sichuan, Peoples R China

[2] Brunel Univ London, Dept Elect & Comp Engn, Uxbridge UB8 3PH, Middx, England

[3] Tongji Univ, Minist Educ, Key Lab Embedded Syst & Serv Comp, Shanghai 200092, Peoples R China

[4] State Grid Shanxi Elect Power Co, Xian 710048, Shaanxi, Peoples R China

Source:

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2018 / Vol. 30 / Issue 23

Keywords:

big data; GEP; Hadoop framework; medoid; parallelization; CLASSIFICATION;

DOI:

10.1002/cpe.4379

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

The gene expression programming (GEP) algorithm is one of the most effective function mining algorithms for fitting mathematical equations to an input dataset. However, GEP suffers from low efficiency in big data processing because its evolution incurs a large overhead when it handles large-scale data. To address this issue, this paper presents two parallelized GEP algorithms using MapReduce. Based on data separation, the first algorithm aims at speeding up large-scale classification. However, it lacks the ability to output the mined equation explicitly. Therefore, building on further improvements to the first algorithm, the second parallelized GEP algorithm aims at mining the equation efficiently while also outputting it explicitly and directly. The experimental results show that both algorithms are effective for processing large volumes of data.


Pages: 11
