Analysis of Big Data Platform with OpenStack and Hadoop

Times cited: 5
Authors
Li, Xiaoyan [1]
Lu, Zhihui [1]
Wang, Nini [2]
Wu, Jie [2]
Huang, Shalin [3]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[2] Minist Educ, Engn Res Ctr Cyber Secur Auditing & Monitoring, Shanghai 200433, Peoples R China
[3] Wangsu Sci & Technol Co Ltd, Shanghai 200433, Peoples R China
Source
Keywords
Hadoop; Benchmarks; Big data; HDFS; Cluster; OpenStack; Cloud; Performance; MapReduce
DOI
10.1007/978-3-319-49178-3_29
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In the era of big data, cloud infrastructure needs to provide strong support for big data workloads. As a distributed computational framework, Hadoop is one of the de facto leading software tools for solving big data problems. Cloud infrastructure has been proven to support three-tier architecture applications well. In this paper, we construct a Hadoop big data platform based on an OpenStack cloud. We also design three experimental scenarios, carry out a set of experiments using the standard Hadoop benchmarks TestDFSIO, TeraSort and PI, and examine the performance. Our experiments reveal that the disk read operations of physical servers can be a bottleneck for TestDFSIO and TeraSort. Allocating VMs more widely across physical servers achieves better performance for the read jobs of TestDFSIO and TeraSort. For the CPU-intensive job PI, the best practice is to centralize the allocation of VMs on physical machines.
Pages: 375-390
Page count: 16