Parallel data intensive applications using MapReduce: a data mining case study in biomedical sciences

Cited by: 0

Authors:

Liangxiu Han

Hwee Yong Ong

Affiliations:

[1] Manchester Metropolitan University, School of Computing, Mathematics and Digital Technology

[2] University of Edinburgh, School of Informatics

Source:

Cluster Computing | 2015 / Volume 18

Keywords:

Data-intensive computing; Parallel processing; MapReduce; Cloud computing; Data mining application in biomedical science

DOI:

Not available

Abstract:

Performance is an open issue in data-intensive applications (e.g. data mining tasks). Parallel and distributed computing systems (e.g. multicore computing, grid computing, cloud computing, etc.), along with hybrid programming models (e.g. MapReduce, MPI, etc.), are seen as a sought-after solution for accelerating data-intensive applications. One of the main challenges is how to exploit these advanced technologies effectively to facilitate fundamental scientific discoveries, such as those in the Biomedical Sciences. This paper explores how MapReduce and Cloud computing can accelerate the performance of data-intensive applications through a real data mining use case in the Biomedical Sciences. We first adapted the data mining task to the MapReduce model and then deployed it onto the Cloud. We built an analytic model based on the MapReduce computations to evaluate the efficiency and performance of the prototype. The results, from both the experiments and the evaluation model, show that performance and scalability can be enhanced through these advanced technologies.

Pages: 403-418

Page count: 15

