BioGraph: Data Model for Linking and Querying Diverse Biological Metadata

Cited by: 3
Authors
Veljkovic, Aleksandar N. [1]
Orlov, Yuriy L. [2, 3, 4]
Mitic, Nenad S. [1]
Affiliations
[1] Univ Belgrade, Fac Math, Studentski Trg 16, Belgrade 11158, Serbia
[2] IM Sechenov First Moscow State Med Univ, Sechenov Univ, Digital Hlth Inst, Minist Hlth Russian Federat, Moscow 119991, Russia
[3] Inst Cytol & Genet SB RAS, Novosibirsk 630090, Russia
[4] Peoples Friendship Univ Russia, Agrarian & Technol Inst, Moscow 117198, Russia
Funding
Russian Science Foundation;
Keywords
gene network; associations with the diseases; connecting biological data; BioGraph; metadata; query data properties;
DOI
10.3390/ijms24086954
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Studying the association of gene function, diseases, and regulatory gene network reconstruction demands data compatibility. Data from different databases follow distinct schemas and are accessible in heterogeneous ways. Although the experiments differ, data may still be related to the same biological entities. Some entities may not be strictly biological, such as geolocations of habitats or paper references, but they provide a broader context for other entities. The same entities from different datasets can share similar properties, which may or may not be found within other datasets. Joint, simultaneous data fetching from multiple data sources is complicated for the end-user or, in many cases, unsupported and inefficient due to differences in data structures and ways of accessing the data. We propose BioGraph, a new model that enables connecting and retrieving information from linked biological data that originated from diverse datasets. We have tested the model on metadata collected from five diverse public datasets and successfully constructed a knowledge graph containing more than 17 million model objects, of which 2.5 million are individual biological entity objects. The model enables the selection of complex patterns and retrieval of matched results that can be discovered only by joining the data from multiple sources.
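The idea of linking the same entity across datasets and then matching a pattern that spans several sources can be illustrated with a toy in-memory graph. This is only a minimal sketch under stated assumptions: it is not the BioGraph schema or implementation, and the class TinyGraph, the relation names, the dataset labels, and all identifiers (GENE1, PROT1, DIS1) are hypothetical placeholders rather than real records.

```python
from collections import defaultdict


class TinyGraph:
    """Toy linked-metadata store (hypothetical; not the BioGraph schema)."""

    def __init__(self):
        self.props = defaultdict(dict)   # (kind, id) -> merged properties
        self.sources = defaultdict(set)  # (kind, id) -> datasets mentioning it
        self.edges = set()               # (subject, relation, object) triples

    def add_entity(self, kind, ident, source, **properties):
        # The same (kind, id) seen in another dataset maps to one node;
        # its properties are merged instead of creating a duplicate.
        key = (kind, ident)
        self.props[key].update(properties)
        self.sources[key].add(source)
        return key

    def link(self, subject, relation, obj):
        self.edges.add((subject, relation, obj))

    def match(self, first_relation, second_relation):
        # Find two-step paths: a -first_relation-> b -second_relation-> c.
        return sorted(
            (a, b, c)
            for a, r1, b in self.edges if r1 == first_relation
            for b2, r2, c in self.edges if r2 == second_relation and b2 == b
        )


g = TinyGraph()
# Placeholder records standing in for two independent datasets.
gene = g.add_entity("gene", "GENE1", "dataset_A", symbol="G1")
gene_again = g.add_entity("gene", "GENE1", "dataset_B", note="placeholder")
protein = g.add_entity("protein", "PROT1", "dataset_A")
disease = g.add_entity("disease", "DIS1", "dataset_B")

g.link(gene, "encodes", protein)             # link taken from dataset_A
g.link(protein, "associated_with", disease)  # link taken from dataset_B

# The gene-to-disease path exists only once both datasets are joined.
paths = g.match("encodes", "associated_with")
print(paths)
```

Keying nodes by (kind, identifier) is one simple way to make records from different sources land on the same entity; a production system would also need identifier mapping between databases, which this sketch leaves out.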
Pages: 16