A parallel computing framework for big data

Cited by: 4
Authors
Chen, Guoliang [1, 2]
Mao, Rui [1, 2]
Lu, Kezhong [1, 2]
Affiliations
[1] Guangdong Prov Key Lab Popular High Performance C, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
Funding
National High Technology Research and Development Program of China (863 Program)
Keywords
NC-computing; metric space; data partitioning; parallel computing; similarity search; metric spaces; queries
DOI
10.1007/s11704-016-5003-y
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Big data has received great attention in both research and applications. However, most current efforts focus on systems and applications that handle the challenges of "volume" and "velocity"; much less has been done on theoretical foundations or on the challenge of "variety". Based on metric-space indexing and computational complexity theory, we propose a parallel computing framework for big data. The framework consists of three components: a universal representation of big data that abstracts various data types into a metric space, partitioning of big data based on pairwise distances in that metric space, and parallel computing of big data based on NC-class computing theory.
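The three components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the choice of edit distance as the metric, the nearest-pivot partitioning rule, and the function names `edit_distance`, `partition_by_pivots`, `range_query`, and `parallel_range_query` are all assumptions made for illustration, and the thread pool only shows the shape of per-partition parallel work rather than an NC-class algorithm.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

# Assumption: any data type is usable as long as a metric (distance) is supplied.
Metric = Callable[[str, str], float]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a standard metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def partition_by_pivots(data: Sequence[str], pivots: Sequence[str],
                        dist: Metric) -> Dict[int, List[str]]:
    """Assign each object to its nearest pivot, using only pairwise distances."""
    parts: Dict[int, List[str]] = {i: [] for i in range(len(pivots))}
    for x in data:
        nearest = min(range(len(pivots)), key=lambda i: dist(x, pivots[i]))
        parts[nearest].append(x)
    return parts


def range_query(part: Sequence[str], query: str, radius: float,
                dist: Metric) -> List[str]:
    """Linear scan of one partition for objects within `radius` of `query`."""
    return [x for x in part if dist(x, query) <= radius]


def parallel_range_query(parts: Dict[int, List[str]], query: str,
                         radius: float, dist: Metric) -> List[str]:
    """Run the range query on every partition concurrently and merge results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(range_query, p, query, radius, dist)
                   for p in parts.values()]
        return sorted(x for f in futures for x in f.result())
```

For example, calling `partition_by_pivots(["cat", "cot", "dog", "dig", "cart"], ["cat", "dog"], edit_distance)` groups the strings around the two pivots, and `parallel_range_query` then searches each group independently. Because only the distance function touches the data, the same partitioning and query code would apply to other data types given a suitable metric.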
Pages: 608-621
Page count: 14
Related papers
50 in total
  • [1] A parallel computing framework for big data
    Guoliang Chen
    Rui Mao
    Kezhong Lu
    [J]. Frontiers of Computer Science, 2017, 11 : 608 - 621
  • [2] Apache Hama: An Emerging Bulk Synchronous Parallel Computing Framework for Big Data Applications
    Siddique, Kamran
    Akhtar, Zahid
    Yoon, Edward J.
    Jeong, Young-Sik
    Dasgupta, Dipankar
    Kim, Yangwoo
    [J]. IEEE ACCESS, 2016, 4 : 8879 - 8887
  • [3] Parallel and distributed computing for Big Data applications
    Senger, Hermes
    Geyer, Claudio
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (08): : 2412 - 2415
  • [4] Research on the Computing Framework in Big Data Environment
    Liu, Yunqing
    Zhang, Jianhua
    Han, Shuqing
    Zhu, Mengshuai
    [J]. 2016 3RD INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE), 2016, : 558 - 562
  • [5] Big Data Applications Using Workflows for Data Parallel Computing
    Wang, Jianwu
    Crawl, Daniel
    Altintas, Ilkay
    Li, Weizhong
    [J]. COMPUTING IN SCIENCE & ENGINEERING, 2014, 16 (04) : 11 - 21
  • [6] A feasible graph partition framework for parallel computing of big graph
    Liu, X.
    Zhou, Y.
    Guan, X.
    Shen, C.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2017, 134 : 228 - 239
  • [8] Stochastic Semantics of Big Data (Parallel Computing and Visualization)
    Manakov, D.V.
    Vasev, P.A.
    [J]. Scientific Visualization, 2024, 16 (05): : 120 - 150
  • [9] Petroleum Geoscience Big Data and GPU Parallel Computing
    Han, Fei
    Sun, Sam Z.
    [J]. 2015 1ST IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2015, : 292 - 293
  • [10] PaPar: A Parallel Data Partitioning Framework for Big Data Applications
    Wang, Hao
    Zhang, Jing
    Zhang, Da
    Pumma, Sarunya
    Feng, Wu-chun
    [J]. 2017 31ST IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2017, : 605 - 614