Atlas - a data warehouse for integrative bioinformatics

Cited by: 93
Authors
Shah, SP [1]
Huang, Y [1]
Xu, T [1]
Yuen, MMS [1]
Ling, J [1]
Ouellette, BFF [1]
Affiliation
[1] Univ British Columbia, UBC Bioinformat Ctr, Vancouver, BC V5Z 1M9, Canada
DOI
10.1186/1471-2105-6-34
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development.

Description: The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations.

Conclusion: The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: http://bioinformatics.ubc.ca/atlas/.
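
The abstract describes relational models queried through SQL calls wrapped in API methods, and names cross-species inference of molecular interactions as one use case. The sketch below illustrates that general pattern only. It is not the Atlas schema or API: the table names (`protein`, `interaction`, `homolog`), the function names, and the sample identifiers are all hypothetical, and SQLite with Python stands in for the database and the C++, Java, and Perl APIs named in the abstract. The taxon IDs are the NCBI Taxonomy identifiers for Saccharomyces cerevisiae (4932) and Homo sapiens (9606).

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE protein (
            protein_id TEXT PRIMARY KEY,
            taxon_id   INTEGER NOT NULL
        );
        CREATE TABLE interaction (
            protein_a TEXT NOT NULL REFERENCES protein(protein_id),
            protein_b TEXT NOT NULL REFERENCES protein(protein_id),
            source_db TEXT NOT NULL
        );
        CREATE TABLE homolog (
            group_id   INTEGER NOT NULL,
            protein_id TEXT NOT NULL REFERENCES protein(protein_id)
        );
    """)
    # Made-up sample rows: one yeast interaction and two homology groups.
    conn.executemany("INSERT INTO protein VALUES (?, ?)", [
        ("yeastA", 4932), ("yeastB", 4932),
        ("humanA", 9606), ("humanB", 9606), ("humanC", 9606),
    ])
    conn.executemany("INSERT INTO interaction VALUES (?, ?, ?)", [
        ("yeastA", "yeastB", "demo"),
    ])
    conn.executemany("INSERT INTO homolog VALUES (?, ?)", [
        (1, "yeastA"), (1, "humanA"),
        (2, "yeastB"), (2, "humanB"),
    ])
    conn.commit()
    return conn


def infer_interactions(conn, source_taxon, target_taxon):
    """Return protein pairs in target_taxon whose homologs interact in source_taxon."""
    sql = """
        SELECT DISTINCT ta.protein_id AS a, tb.protein_id AS b
        FROM interaction i
        JOIN protein pa  ON pa.protein_id = i.protein_a AND pa.taxon_id = :src
        JOIN protein pb  ON pb.protein_id = i.protein_b AND pb.taxon_id = :src
        JOIN homolog ha  ON ha.protein_id = i.protein_a
        JOIN homolog hb  ON hb.protein_id = i.protein_b
        JOIN homolog hta ON hta.group_id = ha.group_id
        JOIN homolog htb ON htb.group_id = hb.group_id
        JOIN protein ta  ON ta.protein_id = hta.protein_id AND ta.taxon_id = :dst
        JOIN protein tb  ON tb.protein_id = htb.protein_id AND tb.taxon_id = :dst
        ORDER BY a, b
    """
    return conn.execute(sql, {"src": source_taxon, "dst": target_taxon}).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    # Map the yeast interaction onto human proteins through shared homology groups.
    print(infer_interactions(db, 4932, 9606))
```

Keeping the SQL inside a function mirrors the layering the abstract describes: callers work with a retrieval method and never touch table names directly, so the underlying schema can change without breaking client code.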
Pages: 16