A parallel computing framework for big data

Cited by: 4
Authors
Chen, Guoliang [1 ,2 ]
Mao, Rui [1 ,2 ]
Lu, Kezhong [1 ,2 ]
Affiliations
[1] Guangdong Prov Key Lab Popular High Performance C, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
Funding
National High Technology Research and Development Program of China (863 Program);
Keywords
NC-computing; metric space; data partitioning; parallel computing; SIMILARITY SEARCH; METRIC-SPACES; QUERIES;
DOI
10.1007/s11704-016-5003-y
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Big data has received great attention in research and application. However, most current efforts focus on systems and applications that handle the challenges of "volume" and "velocity"; much less has been done on the theoretical foundation or on the challenge of "variety". Based on metric-space indexing and computational complexity theory, we propose a parallel computing framework for big data. The framework consists of three components: a universal representation of big data that abstracts various data types into metric space, partitioning of big data based on pair-wise distances in metric space, and parallel computing of big data with NC-class computing theory.
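The three components named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names, the nearest-pivot partitioning rule, and the thread pool are all assumptions chosen for illustration, and the NC-class analysis from the paper is not modeled. Strings under edit distance stand in for "various data types abstracted into metric space".

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

# A metric is any function d(x, y) that is non-negative, symmetric,
# zero only for identical objects, and obeys the triangle inequality.
Metric = Callable[[str, str], float]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a standard metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def partition_by_pivots(data: Sequence[str], pivots: Sequence[str],
                        dist: Metric) -> Dict[int, List[str]]:
    """Assign each object to its nearest pivot (hypothetical partitioning rule)."""
    parts: Dict[int, List[str]] = {i: [] for i in range(len(pivots))}
    for x in data:
        nearest = min(range(len(pivots)), key=lambda i: dist(x, pivots[i]))
        parts[nearest].append(x)
    return parts


def range_query(part: Sequence[str], query: str, radius: float,
                dist: Metric) -> List[str]:
    """Return the objects in one partition within `radius` of `query`."""
    return [x for x in part if dist(x, query) <= radius]


def parallel_range_query(parts: Dict[int, List[str]], query: str,
                         radius: float, dist: Metric) -> List[str]:
    """Run the range query on every partition concurrently and merge the hits."""
    with ThreadPoolExecutor() as pool:
        chunks = pool.map(lambda p: range_query(p, query, radius, dist),
                          parts.values())
        return sorted(x for chunk in chunks for x in chunk)


if __name__ == "__main__":
    words = ["cat", "cart", "care", "dog", "dig", "dot"]
    parts = partition_by_pivots(words, ["cat", "dog"], edit_distance)
    print(parts)
    print(parallel_range_query(parts, "cot", 1, edit_distance))
```

Only the distance function is specific to the data type here; the partitioning and query code would work unchanged with any other metric, which is the point of the metric-space abstraction. The thread pool merely marks where the parallel stage would go, and a real deployment would likely use processes or a cluster.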
Pages: 608 - 621
Page count: 14
Related papers
50 in total
  • [41] Formalizing computational intensity of big traffic data understanding and analysis for parallel computing
    Xia, Yingjie
    Chen, Jinlong
    Wang, Chunhui
    NEUROCOMPUTING, 2015, 169 : 158 - 168
  • [42] Spatial Indexing Algorithm for Big Data of Traffic Trajectories in Parallel Computing Mode
    Chen, Ying
    Wu, Xiao-Ling
    Journal of Network Intelligence, 2023, 8 (04): : 1062 - 1076
  • [43] A Parallel Random Forest Algorithm for Big Data in a Spark Cloud Computing Environment
    Chen, Jianguo
    Li, Kenli
    Tang, Zhuo
    Bilal, Kashif
    Yu, Shui
    Weng, Chuliang
    Li, Keqin
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 28 (04) : 919 - 933
  • [44] Recent Developments in Parallel and Distributed Computing for Remotely Sensed Big Data Processing
    Wu, Zebin
    Sun, Jin
    Zhang, Yi
    Wei, Zhihui
    Chanussot, Jocelyn
    PROCEEDINGS OF THE IEEE, 2021, 109 (08) : 1282 - 1305
  • [45] A Parallel Randomized Neural Network on In-memory Cluster Computing for Big Data
    Dai, Tongwu
    Li, Kenli
    Chen, Cen
    2017 13TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2017,
  • [46] Dynamic load balancing of physiological data flow in big data network parallel computing environment
    Zhang X.-D.
    Xia X.-J.
    Lyu H.-F.
    Gong X.-C.
    Lian M.-J.
    Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2020, 50 (01): : 247 - 254
  • [47] Application Of Cloud Computing In Biomedicine Big Data Analysis Cloud Computing In Big Data
    Yang, Tianyi
    Zhao, Yang
    2017 INTERNATIONAL CONFERENCE ON ALGORITHMS, METHODOLOGY, MODELS AND APPLICATIONS IN EMERGING TECHNOLOGIES (ICAMMAET), 2017,
  • [48] Design and Optimization of a Big Data Computing Framework based on CPU/GPU Cluster
    Zhai, Yanlong
    Guo, Ying
    Chen, Qiurui
    Yang, Kai
    Mbarushimana, Emmanuel
    2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC), 2013, : 1039 - 1046
  • [49] A distributed computing framework for wind speed big data forecasting on Apache Spark
    Xu, Yinan
    Liu, Hui
    Long, Zhihao
    SUSTAINABLE ENERGY TECHNOLOGIES AND ASSESSMENTS, 2020, 37
  • [50] A Gaussian process based big data processing framework in cluster computing environment
    Gunasekaran Manogaran
    Daphne Lopez
    Cluster Computing, 2018, 21 : 189 - 204