MetReS: a Metabolic Reconstruction Database for Cloud Computing

Cited by: 1
Authors
Vilaplana, Jordi [1, 2]
Solsona, Francesc [1, 2]
Teixido, Ivan [1, 2]
Mateo, Jordi [1, 2]
Usie, Anabel [3]
Torres, Nestor [4, 5]
Comas, Jorge [4, 5]
Alves, Rui [4, 5]
Affiliations
[1] Univ Lleida, Dept Comp Sci, Lleida, Spain
[2] Univ Lleida, INSPIRES, Lleida, Spain
[3] CEBAL, P-7800 Beja, Portugal
[4] Univ Lleida, Dept Basic Med Sci, Lleida, Spain
[5] Univ Lleida, IRBLleida, Lleida, Spain
Keywords
INTERACTION NETWORKS;
DOI
10.1109/INCoS.2014.31
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
When designing a cloud infrastructure, it is critical to ensure beforehand that the system will be able to offer the desired level of QoS (Quality of Service). Our attention is focused here on efficient, QoS-aware access to a biological database in cloud computing systems. Our group developed two software applications that address important biological problems, Biblio-MetReS and Homol-MetReS. Biblio-MetReS is a data-mining tool that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. Reconstruction of molecular networks is essential to understanding how organisms work at the molecular level and has strong implications, for example, for finding targets to treat different types of disease. In addition, the identification and functional annotation of the individual components of the network is crucial to understanding what those targets might do in the context of the organism. These two software applications access the same database of organisms with annotated genes, and their efficiency is directly related to the design of this shared database. The database is continuously growing, as hundreds to thousands of new genomes are sequenced and annotated each year. The main goal of the current work was to improve the performance of the existing database and to test whether this improvement would scale to larger data sets and to more complex types of analysis that neither application yet performs. To achieve this goal, different database architectures were designed and analyzed. We started the study with a public relational database, MySQL, which was the database server then used by these applications. Then, due to the large size of the database, Apache Hadoop, a framework used for large-scale data processing, was considered and studied as an alternative. Although Big Data systems are not always a replacement for traditional relational databases, we showed through extensive tests the applicability of Apache Hadoop to a standard biological database containing some of the most frequently used types of information in molecular and systems biology. Over time, as this database continues to grow, the proposed solution will further improve its efficiency. Furthermore, this solution makes it possible to extract additional valuable information from the data sets that was not previously being considered.
Pages: 653 - 658
Page count: 6