Heterogeneous biological data integration with declarative query language

Cited by: 4
Authors
Nguyen, H. [1]
Michel, L. [2]
Thompson, J. D. [3]
Poch, O. [3]
Affiliations
[1] IGBMC, F-67404 Illkirch Graffenstaden, France
[2] Observ Astron, Equipe Hautes Energies, Strasbourg, France
[3] ICube UMR7357, Fac Med, F-67085 Strasbourg, France
Keywords
SYSTEM; BIOINFORMATICS; MANAGEMENT; DATABASES; FRAMEWORK; RESOURCE;
DOI
10.1147/JRD.2014.2309032
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The requirements for scalable data integration systems for modern biology are indisputable, due to the very large, heterogeneous, and complex datasets available in public databases. The management and fusion of this "big data" with local databases represents a major challenge, since it underlies the computational inferences and models that will be subsequently generated and validated experimentally. In this paper, we present an alternative conception for local data integration, called BIRD (Biological Integration and Retrieval Data), based on four concepts: (i) a hybrid flat file and relational database architecture permits the rapid management of large volumes of heterogeneous datasets; (ii) a generic data model allows the simultaneous organization and classification of local databases according to real-world requirements; (iii) configuration rules are used to divide and map each data resource into several data model entities; and (iv) a simple, declarative query language (BIRD-QL) facilitates information extraction from heterogeneous datasets. This flexible, generic design allows the integration of diverse data formats in a searchable database with high-level functionalities depending on the specific scientific context. It has been validated in the context of real world projects, notably the SM2PH (Structural Mutation to the Phenotypes of Human Pathologies) project.
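The hybrid flat file and relational architecture named in concept (i) can be illustrated with a minimal sketch. This is not BIRD's implementation: the record layout (`id|organism|sequence`), the `entry` table, and the function names `build_index`, `fetch`, and `demo` are all assumptions made for illustration. The idea shown is only that bulky records stay in a flat file while a small relational index holds the searchable fields and byte offsets.

```python
import os
import sqlite3
import tempfile

# Hypothetical flat-file records, one per line: "id|organism|sequence".
RECORDS = [
    "P1|human|MKTAYIAK",
    "P2|mouse|MKTWYIAR",
    "P3|human|MSSQLLNN",
]


def build_index(flat_path, conn):
    """Index searchable fields and byte offsets; the full record stays in the flat file."""
    conn.execute(
        "CREATE TABLE entry ("
        "id TEXT PRIMARY KEY, organism TEXT, offset INTEGER, length INTEGER)"
    )
    with open(flat_path, "rb") as fh:
        offset = 0
        for line in fh:
            entry_id, organism, _ = line.decode("ascii").rstrip("\n").split("|")
            conn.execute(
                "INSERT INTO entry VALUES (?, ?, ?, ?)",
                (entry_id, organism, offset, len(line)),
            )
            offset += len(line)
    conn.commit()


def fetch(flat_path, conn, organism):
    """Select matching entries declaratively (SQL), then read each record by offset."""
    rows = conn.execute(
        "SELECT offset, length FROM entry WHERE organism = ? ORDER BY id",
        (organism,),
    ).fetchall()
    records = []
    with open(flat_path, "rb") as fh:
        for offset, length in rows:
            fh.seek(offset)
            records.append(fh.read(length).decode("ascii").rstrip("\n"))
    return records


def demo():
    """Write the sample records to a temporary flat file, index them, and query."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "proteins.txt")
        with open(path, "wb") as fh:
            fh.write(("\n".join(RECORDS) + "\n").encode("ascii"))
        conn = sqlite3.connect(":memory:")
        try:
            build_index(path, conn)
            return fetch(path, conn, "human")
        finally:
            conn.close()


if __name__ == "__main__":
    for record in demo():
        print(record)
```

In this sketch the SQL `SELECT` stands in for a declarative query; the actual syntax of BIRD-QL is not given in the abstract and is not reproduced here.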
Pages: 12
Related papers
50 in total
  • [41] Fuzzy data mining query language
    Maelainin, SA
    Bensaid, A
    1998 SECOND INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED INTELLIGENT ELECTRONIC SYSTEMS, KES'98 PROCEEDINGS, VOL 1, 1998, : 335 - 340
  • [42] A Query Language for Mobility Data Mining
    Trasarti, Roberto
    Giannotti, Fosca
    Nanni, Mirco
    Pedreschi, Dino
    Renso, Chiara
    INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2011, 7 (01) : 24 - 45
  • [43] Fuzzy data mining query language
    Al Akhawayn Univ in Ifrane, Ifrane, Morocco
    Int Conf Knowledge Based Intell Electron Syst Proc KES, (335-340):
  • [44] StreamAPAS: Query Language and Data Model
    Gorawski, Marcin
    Chroszcz, Aleksander
    CISIS: 2009 INTERNATIONAL CONFERENCE ON COMPLEX, INTELLIGENT AND SOFTWARE INTENSIVE SYSTEMS, VOLS 1 AND 2, 2009, : 75 - 82
  • [45] The Lorel query language for semistructured data
    Serge Abiteboul
    Dallan Quass
    Jason McHugh
    Jennifer Widom
    Janet L. Wiener
    International Journal on Digital Libraries, 1997, 1 (1) : 68 - 88
  • [46] DATA MODELING FOR A ROBOT QUERY LANGUAGE
    TUIJNMAN, F
    MEIJER, GR
    HERTZBERGER, LO
    INTELLIGENT AUTONOMOUS SYSTEMS 2, VOLS 1 AND 2, 1989, : 208 - 218
  • [47] XPathLog: A declarative, native XML data manipulation language
    May, W
    2001 INTERNATIONAL DATABASE ENGINEERING & APPLICATIONS SYMPOSIUM, PROCEEDINGS, 2001, : 123 - 128
  • [48] A Model and Declarative Language for Specifying Binary Data Formats
    A. A. Evgin
    M. A. Solovev
    V. A. Padaryan
    Programming and Computer Software, 2022, 48 : 469 - 483
  • [49] A Model and Declarative Language for Specifying Binary Data Formats
    Evgin, A. A.
    Solovev, M. A.
    Padaryan, V. A.
    PROGRAMMING AND COMPUTER SOFTWARE, 2022, 48 (07) : 469 - 483
  • [50] CAROL: Towards a declarative video data retrieval language
    Li, Q
    Yang, YH
    Chung, WK
    ELECTRONIC IMAGING AND MULTIMEDIA SYSTEMS II, 1998, 3561 : 69 - 78