Heterogeneous biological data integration with declarative query language

Cited by: 4
Authors
Nguyen, H. [1]
Michel, L. [2]
Thompson, J. D. [3]
Poch, O. [3]
Affiliations
[1] IGBMC, F-67404 Illkirch Graffenstaden, France
[2] Observ Astron, Equipe Hautes Energies, Strasbourg, France
[3] ICube UMR7357, Fac Med, F-67085 Strasbourg, France
Keywords
SYSTEM; BIOINFORMATICS; MANAGEMENT; DATABASES; FRAMEWORK; RESOURCE
DOI
10.1147/JRD.2014.2309032
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The need for scalable data integration systems in modern biology is indisputable, given the very large, heterogeneous, and complex datasets available in public databases. The management and fusion of this "big data" with local databases represents a major challenge, since it underlies the computational inferences and models that will subsequently be generated and validated experimentally. In this paper, we present an alternative conception for local data integration, called BIRD (Biological Integration and Retrieval Data), based on four concepts: (i) a hybrid flat-file and relational database architecture permits the rapid management of large volumes of heterogeneous datasets; (ii) a generic data model allows the simultaneous organization and classification of local databases according to real-world requirements; (iii) configuration rules are used to divide and map each data resource into several data model entities; and (iv) a simple, declarative query language (BIRD-QL) facilitates information extraction from heterogeneous datasets. This flexible, generic design allows the integration of diverse data formats in a searchable database with high-level functionalities depending on the specific scientific context. It has been validated in the context of real-world projects, notably the SM2PH (Structural Mutation to the Phenotypes of Human Pathologies) project.
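Concepts (i) and (iii) of the abstract can be illustrated with a minimal sketch. It is not BIRD's implementation and does not use BIRD-QL syntax, neither of which is given in this record. The record layout, the `RULES` mapping, the table names, and the functions `build_index` and `fetch_records` are all hypothetical. The sketch assumes the flat file keeps the full records, configuration rules map line prefixes to entities, and a relational index stores only the mapped fields plus each record's offset.

```python
import sqlite3

# Hypothetical flat-file content; records are terminated by "//" lines.
FLAT_FILE = (
    "ID   P12345\n"
    "DE   Example kinase\n"
    "OS   Homo sapiens\n"
    "//\n"
    "ID   Q67890\n"
    "DE   Example transporter\n"
    "OS   Mus musculus\n"
    "//\n"
)

# Hypothetical configuration rules: two-letter line prefix -> entity name.
RULES = {"DE": "description", "OS": "organism"}
SEPARATOR = "//\n"


def build_index(text, rules):
    """Index mapped fields in SQLite; full records stay in the flat text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entry (id TEXT PRIMARY KEY, start INTEGER, length INTEGER)"
    )
    conn.execute("CREATE TABLE field (entry_id TEXT, entity TEXT, value TEXT)")
    offset = 0
    for record in text.split(SEPARATOR):
        if record.strip():
            entry_id, fields = None, []
            for line in record.splitlines():
                tag, value = line[:2], line[5:].strip()
                if tag == "ID":
                    entry_id = value
                elif tag in rules:
                    fields.append((rules[tag], value))
            conn.execute(
                "INSERT INTO entry VALUES (?, ?, ?)",
                (entry_id, offset, len(record)),
            )
            conn.executemany(
                "INSERT INTO field VALUES (?, ?, ?)",
                [(entry_id, entity, value) for entity, value in fields],
            )
        # Advance past this record and its separator line.
        offset += len(record) + len(SEPARATOR)
    return conn


def fetch_records(conn, text, entity, value):
    """Find matching entries in the index, then slice them from the flat text."""
    rows = conn.execute(
        "SELECT e.start, e.length FROM entry e "
        "JOIN field f ON f.entry_id = e.id "
        "WHERE f.entity = ? AND f.value = ? ORDER BY e.start",
        (entity, value),
    ).fetchall()
    return [text[start:start + length] for start, length in rows]


conn = build_index(FLAT_FILE, RULES)
hits = fetch_records(conn, FLAT_FILE, "organism", "Homo sapiens")
print(hits[0])
```

In this sketch the relational side answers the search while the flat text supplies the complete record, which is one plausible reading of the hybrid design; supporting a new format would mean changing only the rules mapping.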
Pages: 12