A performance evaluation of NoSQL databases to manage proteomics data

Cited by: 2
Authors
Messaoudi, Chaimaa [1]
Fissoune, Rachida [1]
Badir, Hassan [1]
Affiliations
[1] Abdelmalek Essaadi Univ, Natl Sch Appl Sci, BP 1818, Tangier 90000, Morocco
Keywords
proteomics; MongoDB; multi-model; Neo4j; OrientDB; polyglot persistence; GRAPH DATABASES; BIOINFORMATICS; MODEL; BIOLOGY; CLOUD; SQL;
DOI
10.1504/IJDMB.2018.095556
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07; 0710; 09;
Abstract
NoSQL databases have recently been introduced as alternatives to traditional relational database management systems because of their data storage and query retrieval capabilities. Biological datasets can be represented with different data models, for example graphs (protein-protein interactions) or documents (protein sequence information). Applications that involve both data models can be served by a single architecture, using either a polyglot persistence approach or a multi-model approach. This paper evaluates the performance of a polyglot persistence approach against a multi-model store. The polyglot persistence approach combines a graph-oriented database (Neo4j) with a document-oriented database (MongoDB); the multi-model system is OrientDB. The comparison covers three aspects: data import, single operations, and query performance. OrientDB shows potential for managing large proteomics datasets in terms of query retrieval and graph import. However, OrientDB was found to be slow when updating records. No single store performs best in all cases.
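As a rough illustration of the polyglot persistence idea in the abstract, the following Python sketch joins a graph lookup with document lookups at the application level. It is not the authors' implementation: `DocumentStore`, `GraphStore`, and `PolyglotProteinRepository` are hypothetical in-memory stand-ins, the protein identifiers and sequences are made up, and no real MongoDB, Neo4j, or OrientDB driver calls are used.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentStore:
    """Hypothetical in-memory stand-in for a document store."""
    docs: dict = field(default_factory=dict)

    def insert(self, doc: dict) -> None:
        self.docs[doc["_id"]] = doc

    def find(self, doc_id: str):
        return self.docs.get(doc_id)


@dataclass
class GraphStore:
    """Hypothetical in-memory stand-in for a graph store."""
    edges: dict = field(default_factory=dict)

    def add_interaction(self, a: str, b: str) -> None:
        # Assumption: protein-protein interactions are undirected.
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def neighbours(self, protein_id: str) -> set:
        return self.edges.get(protein_id, set())


class PolyglotProteinRepository:
    """Application-level facade that combines two separate stores."""

    def __init__(self, documents: DocumentStore, graph: GraphStore):
        self.documents = documents
        self.graph = graph

    def interactors_with_sequences(self, protein_id: str) -> dict:
        # Step 1: graph traversal. Step 2: one document lookup per neighbour.
        return {
            pid: (self.documents.find(pid) or {}).get("sequence")
            for pid in sorted(self.graph.neighbours(protein_id))
        }


# Made-up toy data, for illustration only.
documents = DocumentStore()
graph = GraphStore()
for pid, seq in [("P1", "MKT"), ("P2", "MVL"), ("P3", "MAA")]:
    documents.insert({"_id": pid, "sequence": seq})
graph.add_interaction("P1", "P2")
graph.add_interaction("P1", "P3")

repo = PolyglotProteinRepository(documents, graph)
result = repo.interactors_with_sequences("P1")
print(result)
```

In a multi-model store, both lookups would presumably be issued to a single engine rather than joined in application code. The sketch only shows the shape of the two architectures and says nothing about their relative performance.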
Pages: 70-89
Page count: 20
Related papers
50 in total
  • [41] Ingestion of a Data Lake into a NoSQL Data Warehouse: The Case of Relational Databases
    Abdelhedi, Fatma
    Jemmali, Rym
    Zurfluh, Gilles
    PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KMIS), VOL 3, 2021, : 64 - 72
  • [42] Considerations on NoSQL Databases
    Nicolau, Dragos
    ROMANIAN JOURNAL OF INFORMATION TECHNOLOGY AND AUTOMATIC CONTROL-REVISTA ROMANA DE INFORMATICA SI AUTOMATICA, 2018, 28 (03): : 53 - 62
  • [43] Evaluation of relational and NoSQL database architectures to manage genomic annotations
    Schulz, Wade L.
    Nelson, Brent G.
    Felker, Donn K.
    Durant, Thomas J. S.
    Torres, Richard
    JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 64 : 288 - 295
  • [44] Exploiting RDF Open Data Using NoSQL Graph Databases
    Bouhali, Raouf
    Laurent, Anne
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, 2015, 458 : 177 - 190
  • [45] Applying NoSQL Databases for Operationalizing Clinical Data Mining Models
    Mazurek, Marcin
    BEYOND DATABASES, ARCHITECTURES AND STRUCTURES, BDAS 2014, 2014, 424 : 527 - 536
  • [46] Data Partition Optimization for Column-Family NoSQL databases
    Ho, Li-Yung
    Hsieh, Meng-Ju
    Wu, Jan-Jan
    Liu, Pangfeng
    2015 IEEE INTERNATIONAL CONFERENCE ON SMART CITY/SOCIALCOM/SUSTAINCOM (SMARTCITY), 2015, : 668 - 675
  • [47] Data Modeling Guidelines for NoSQL Document-Store Databases
    Imam, Abdullahi Abubakar
    Basri, Shuib
    Ahmad, Rohiza
    Watada, Junzo
    González-Aparicio, Maria T.
    Almomani, Malek Ahmad
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (10) : 544 - 555
  • [48] NOSOLAP: Moving from Data Warehouse Requirements to NoSQL Databases
    Prakash, Deepika
    PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING (ENASE), 2019, : 452 - 458
  • [49] Data Integrity Verification in Column-Oriented NoSQL Databases
    Weintraub, Grisha
    Gudes, Ehud
    DATA AND APPLICATIONS SECURITY AND PRIVACY XXXII, DBSEC 2018, 2018, 10980 : 165 - 181
  • [50] Semantic Data Querying Over NoSQL Databases with Apache Spark
    Hassan, Mahmudul
    Bansal, Srividya K.
    2018 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI), 2018, : 364 - 371