Parallel data intensive applications using MapReduce: a data mining case study in biomedical sciences

被引:0
|
作者
Liangxiu Han
Hwee Yong Ong
机构
[1] Manchester Metropolitan University,School of Computing, Mathematics and Digital Technology
[2] University of Edinburgh,School of Informatics
来源
Cluster Computing | 2015年 / 18卷
关键词
Data-intensive computing; Parallel processing; MapReduce; Cloud computing; Data mining application in biomedical science;
D O I
暂无
中图分类号
学科分类号
摘要
Performance is an open issue in data intensive applications (e.g. data mining tasks). Parallel and distributed computing systems (e.g. multicore computing, grid computing, cloud computing,etc.), along with hybrid programming models (e.g. MapReduce, MPI, etc.), is seen a sought-after solution for accelerating data-intensive applications. One of main challenges is how to exploit these advanced technologies effectively in facilitating fundamental science discoveries such as those in Biomedical Sciences. This paper explores how MapReduce and Cloud computing can accelerate performance of data intensive applications through a real data mining use case in the Biomedical Sciences. We have first adapted the data mining task using MapReduce model and then deployed it onto the Cloud. We have built an analytic model based on the MapReduce computations to evaluate the efficiency and performance of the prototype. The results, from both experiments and the evaluation model, show the performance and scalability can be enhanced through these advanced technologies.
引用
收藏
页码:403 / 418
页数:15
相关论文
共 50 条
  • [1] Parallel data intensive applications using MapReduce: a data mining case study in biomedical sciences
    Han, Liangxiu
    Ong, Hwee Yong
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (01): : 403 - 418
  • [2] Accelerating Biomedical Data-Intensive Applications using MapReduce
    Han, Liangxiu
    Ong, Hwee Yong
    2012 ACM/IEEE 13TH INTERNATIONAL CONFERENCE ON GRID COMPUTING (GRID), 2012, : 49 - 57
  • [3] Leveraging Data Intensive Applications on a Pervasive Computing Platform: the case of MapReduce
    Steffenel, Luiz Angelo
    Pinheiro, Manuele Kirch
    6TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT-2015), THE 5TH INTERNATIONAL CONFERENCE ON SUSTAINABLE ENERGY INFORMATION TECHNOLOGY (SEIT-2015), 2015, 52 : 1034 - 1039
  • [4] An Improved GPU MapReduce Framework for Data Intensive Applications
    Nitu, Razvan
    Apostol, Elena
    Cristea, Valentin
    2014 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 2014, : 355 - 362
  • [5] Parallel Associative Classification Data Mining Frameworks Based MapReduce
    Thabtah, Fadi
    Hammoud, Suhel
    Abdel-Jaber, Hussein
    PARALLEL PROCESSING LETTERS, 2015, 25 (02)
  • [6] MapReduce-based parallel GEP algorithm for efficient function mining in big data applications
    Liu, Yang
    Ma, Chenxiao
    Xu, Lixiong
    Shen, Xiaodong
    Li, Maozhen
    Li, Pengcheng
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (23):
  • [7] Data mining and life sciences applications on the grid
    Cannataro, Mario
    Guzzi, Pietro Hiram
    Sarica, Alessia
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 3 (03) : 216 - 238
  • [8] MapReduce Across Distributed Clusters for Data-intensive Applications
    Wang, Lizhe
    Tao, Jie
    Marten, Holger
    Streit, Achim
    Khan, Samee U.
    Kolodziej, Joanna
    Chen, Dan
    2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW), 2012, : 2004 - 2011
  • [9] Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications
    Ahmad, Maaz Bin Safeer
    Cheung, Alvin
    SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2018, : 1205 - 1220
  • [10] Parallel Clustering Optimization Algorithm Based on MapReduce in Big Data Mining
    Zhang, Huajie
    Song, Lei
    Zhang, Sen
    IAENG International Journal of Applied Mathematics, 2023, 53 (01):