Multilevel Data Processing Using Parallel Algorithms for Analyzing Big Data in High-Performance Computing

Cited by: 1
Authors
Ahmad, Awais [1]
Paul, Anand [2]
Din, Sadia [2]
Rathore, M. Mazhar [2]
Choi, Gyu Sang [1]
Jeon, Gwanggil [3]
Affiliations
[1] Yeungnam Univ, Dept Informat & Commun Engn, Gyeongbuk, South Korea
[2] Kyungpook Natl Univ, Sch Comp Sci & Engn, Daegu, South Korea
[3] Incheon Natl Univ, Dept Embedded Syst Engn, Incheon, South Korea
Keywords
Big Data; HPC; Parallel processing algorithm; Four-tier system architecture; Data analytics; MapReduce
DOI
10.1007/s10766-017-0498-x
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
The growing gap between users and Big Data analytics calls for innovative tools that address the challenges posed by the volume, variety, and velocity of big data, since analyzing such massive volumes of data becomes computationally inefficient. Moreover, advances in Big Data applications and data science pose additional challenges, and High-Performance Computing solutions have become a key issue that has attracted attention in recent years. However, existing systems are either memoryless or computationally inefficient. There is therefore a need for a system that can efficiently analyze a stream of Big Data within these requirements. This paper presents a system architecture that enhances traditional MapReduce by incorporating a parallel processing algorithm. A complete four-tier architecture is also proposed that efficiently aggregates the data, eliminates unnecessary data, and analyzes the data with the proposed parallel processing algorithm. The proposed system architecture handles both read and write operations in a way that enhances the efficiency of Input/Output operations. To check the efficiency of the proposed algorithms, we implemented the proposed system using Hadoop and MapReduce. MapReduce is supported by a parallel algorithm that efficiently processes huge data sets. The system is implemented using MapReduce on top of Hadoop parallel nodes to generate and process graphs in near real time. The system is evaluated in terms of efficiency by considering system throughput and processing time. The results show that the proposed system is more scalable and efficient.
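The aggregate, filter, and analyze flow described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or its Hadoop implementation: the `key:value` record format, the `is_useful` filtering rule, and all function names are hypothetical, and a local thread pool merely stands in for Hadoop parallel nodes. It shows only the general pattern of mapping over chunks in parallel, merging partial results, and timing the run to estimate throughput.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import time


def is_useful(record):
    """Filtering step: drop empty or malformed records (hypothetical rule)."""
    return bool(record) and ":" in record


def map_chunk(chunk):
    """Map step: count the useful records per key within one chunk."""
    counts = Counter()
    for record in chunk:
        if is_useful(record):
            key, _, _ = record.partition(":")
            counts[key] += 1
    return counts


def merge_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def split(records, n_chunks):
    """Split the input into roughly equal chunks, one per worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def parallel_count(records, workers=4):
    """Run the map step on each chunk in parallel, then reduce the partials."""
    chunks = split(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partials, Counter())


# Toy input: two malformed records ("" and "garbage") are filtered out.
records = ["sensor:1", "", "sensor:2", "camera:7", "garbage", "camera:8", "sensor:3"]

start = time.perf_counter()
totals = parallel_count(records, workers=3)
elapsed = time.perf_counter() - start

# Throughput in records per second, one of the two metrics the abstract names.
throughput = len(records) / elapsed if elapsed > 0 else float("inf")
print(dict(totals))
```

Threads are used here only to keep the sketch self-contained; for CPU-bound work in Python, a process pool or an actual cluster framework would be the more realistic choice.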
Pages: 508-527
Number of pages: 20
Related papers
50 in total
  • [41] State of the Art High-Performance and High-Throughput Computing for Remote Sensing Big Data
    Zhang, Sheng
    Xue, Yong
    Zhou, Xiran
    Zhang, Xiaopeng
    Liu, Wenhao
    Li, Kaiyuan
    Liu, Runze
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE, 2022, 10 (04) : 125 - 149
  • [42] A High-Performance Parallel Approach to Image Processing in Distributed Computing
    Rakhimov, Mekhriddin
    Mamadjanov, Doniyor
    Mukhiddinov, Abulkosim
    [J]. 2020 IEEE 14TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT2020), 2020,
  • [43] Advanced high performance algorithms for data processing
    Bogdanov, AV
    Boukhanovsky, AV
    [J]. COMPUTATIONAL SCIENCE - ICCS 2004, PT 1, PROCEEDINGS, 2004, 3036 : 239 - 246
  • [44] Cloud Computing for Big Data Processing
    Li, Xiaofang
    Zhuang, Yanbin
    Yang, Simon X.
    [J]. INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2017, 23 (04): : 545 - 546
  • [45] Big Data Processing on Volunteer Computing
    Lv, Zhihan
    Chen, Dongliang
    Singh, Amit Kumar
    [J]. ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2021, 21 (04)
  • [46] Computing infrastructure for big data processing
    Liu, Ling
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2013, 7 (02) : 165 - 170
  • [47] Computing infrastructure for big data processing
    Ling Liu
    [J]. Frontiers of Computer Science, 2013, 7 : 165 - 170
  • [48] Parallel knowledge acquisition algorithms for big data using MapReduce
    Jin Qian
    Min Xia
    Xiaodong Yue
    [J]. International Journal of Machine Learning and Cybernetics, 2018, 9 : 1007 - 1021
  • [49] Parallel knowledge acquisition algorithms for big data using MapReduce
    Qian, Jin
    Xia, Min
    Yue, Xiaodong
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2018, 9 (06) : 1007 - 1021
  • [50] RESEARCH ON HIGH-PERFORMANCE COMPUTING NETWORK SEARCH SYSTEM BASED ON COMPUTER BIG DATA
    Chen X.
    Liu D.
    [J]. Scalable Computing, 2024, 25 (03): : 1833 - 1840