Big data storage technologies: a survey

Cited by: 55
Authors
Siddiqa, Aisha [1]
Karim, Ahmad [2]
Gani, Abdullah [1]
Affiliations
[1] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur 50603, Malaysia
[2] Bahauddin Zakariya Univ, Dept Informat Technol, Multan 60000, Pakistan
Keywords
Big data; Big data storage; NoSQL databases; Distributed databases; CAP theorem; Scalability; Consistency-partition resilience; Availability-partition resilience; DATA REPLICATION; NOSQL DATABASES; COMMUNICATION; AVAILABILITY; SCALABILITY; CHALLENGES; SYSTEMS; CAP;
DOI
10.1631/FITEE.1500441
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
There is a great thrust in industry toward the development of more feasible and viable tools for storing fast-growing volume, velocity, and diversity of data, termed 'big data'. The structural shift of the storage mechanism from traditional data management systems to NoSQL technology is due to the intention of fulfilling big data storage requirements. However, the available big data storage technologies are inefficient to provide consistent, scalable, and available solutions for continuously growing heterogeneous data. Storage is the preliminary process of big data analytics for real-world applications such as scientific experiments, healthcare, social networks, and e-business. So far, Amazon, Google, and Apache are some of the industry standards in providing big data storage solutions, yet the literature does not report an in-depth survey of storage technologies available for big data, investigating the performance and magnitude gains of these technologies. The primary objective of this paper is to conduct a comprehensive investigation of state-of-the-art storage technologies available for big data. A well-defined taxonomy of big data storage technologies is presented to assist data analysts and researchers in understanding and selecting a storage mechanism that better fits their needs. To evaluate the performance of different storage architectures, we compare and analyze the existing approaches using Brewer's CAP theorem. The significance and applications of storage technologies and support to other categories are discussed. Several future research challenges are highlighted with the intention to expedite the deployment of a reliable and scalable storage system.
Pages: 1040 - 1070
Number of pages: 31
Related papers
50 in total
  • [31] IoT-Based Health Big-Data Process Technologies: A Survey
    Yoo, Hyun
    Park, Roy C.
    Chung, Kyungyong
    [J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2021, 15 (03): 974 - 992
  • [32] Big data: Evaluation criteria for big data analytics technologies
    Muchemwa, Regis
    de la Harpe, Andre
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON BUSINESS AND MANAGEMENT DYNAMICS 2016: SUSTAINABLE ECONOMIES IN THE INFORMATION ECONOMY, 2016, : 80 - 86
  • [33] A Sketch of Big Data Technologies
    Liu, Zaiying
    Yang, Ping
    Zhang, Lixiao
    [J]. 2013 SEVENTH INTERNATIONAL CONFERENCE ON INTERNET COMPUTING FOR ENGINEERING AND SCIENCE (ICICSE 2013), 2013, : 26 - 29
  • [34] Big Data Technologies at JPL
    Jones, Dayton L.
    [J]. COMPUTER, 2014, 47 (09) : 67 - 68
  • [35] Big Data: A Survey
    Chen, Min
    Mao, Shiwen
    Liu, Yunhao
    [J]. MOBILE NETWORKS & APPLICATIONS, 2014, 19 (02): 171 - 209
  • [36] The Survey of Big Data
    Fu, Qi
    Tan, Jun
    Xie, Yufang
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ELECTRONIC TECHNOLOGY, 2015, 6 : 403 - 407
  • [37] Big Data: A Survey
    Min Chen
    Shiwen Mao
    Yunhao Liu
    [J]. Mobile Networks and Applications, 2014, 19 : 171 - 209
  • [38] Big data analytics and big data science: a survey
    Chen, Yong
    Chen, Hong
    Gorkhali, Anjee
    Lu, Yang
    Ma, Yiqian
    Li, Ling
    [J]. JOURNAL OF MANAGEMENT ANALYTICS, 2016, 3 (01) : 1 - 42
  • [40] Big data storage technologies: a case study for web-based LiDAR visualization
    Deibe, David
    Amor, Margarita
    Doallo, Ramon
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 3831 - 3840