Performance Comparison of State of Art NoSql Technologies Using Apache Spark

Cited by: 1
Authors
ul Haque, Anwar [1]
Mahmood, Tariq [1]
Ikram, Nassar [2]
Affiliations
[1] Inst Business Adm, Fac Comp Sci, Karachi, Pakistan
[2] Natl Univ Sci & Technol, Islamabad, Pakistan
Source
INTELLIGENT SYSTEMS AND APPLICATIONS, INTELLISYS, VOL 2 | 2019 / Vol. 869
Keywords
Component; AeroSpike; Apache spark; BigData; CouchBase; MongoDB; NoSql technologies; Redis;
DOI
10.1007/978-3-030-01057-7_44
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data is the new currency of today's digital world. More data was generated in the last 2 years than in the preceding 15 years. This data varies in dimension, size, speed and behavior, and it is semi-structured or fully unstructured. It also comes in many formats, including text, documents, Excel and PowerPoint files, web blogs, posts, chats, tweets, audio and video streams, and long-range numeric values. Storing such data in legacy SQL-based storage will not realize its value as currency. To take full advantage of the data, the IT industry has a variety of state-of-the-art NoSql (Not only Sql) databases available, each with its own specific features and limitations. In this research we conducted an experiment on state-of-the-art NoSql technologies to compare them on performance, integration, ease of use, and the size of data they can load and unload. For the experiment we used 3.4 TB of generated data containing medical test records, lab diagnostics and prescriptions, and long-range pi values. The generated data was stored in AeroSpike, BerkeleyDB, CouchBase, HBase, MongoDB and Redis. Performance was tested on queries such as search in, equate, greater than, less than, and other general arithmetic operations. The queries were executed using Apache Spark on a cluster with a processing capacity of 54 cores and 168 GB of memory. The comparison provided useful and defining results for selecting NoSql stores for specific kinds of jobs.
Pages: 563-576
Page count: 14