Querying semantic catalogues of biomedical databases

Cited by: 7
Authors
Pereira, Arnaldo [1]
Almeida, Joao Rafael [1, 2]
Lopes, Rui Pedro [3]
Oliveira, Jose Luis [1]
Affiliations
[1] Univ Aveiro, DETI, IEETA, LASI, Aveiro, Portugal
[2] Univ A Coruna, Dept Computat, La Coruna, Spain
[3] Polytech Inst Braganca, CeDRI, Braganca, Portugal
Keywords
Biomedical data; Knowledge bases; Semantic data; Linked data; Information extraction; Natural language interfaces; Question answering; LINKED DATA; PLATFORM; CHALLENGES; DISCOVERY
DOI
10.1016/j.jbi.2022.104272
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Background: Secondary use of health data is a valuable source of knowledge that boosts observational studies, leading to important discoveries in the medical and biomedical sciences. The fundamental guiding principle for performing a successful observational study is the research question and the approach in advance of executing a study. However, in multi-centre studies, finding suitable datasets to support the study is challenging, time-consuming, and sometimes impossible without a deep understanding of each dataset.

Methods: We propose a strategy for retrieving biomedical datasets of interest that were semantically annotated, using an interface built by applying a methodology for transforming natural language questions into formal language queries. The advantages of creating biomedical semantic data are enhanced by using natural language interfaces to issue complex queries without manipulating a logical query language.

Results: Our methodology was validated using Alzheimer's disease datasets published in a European platform for sharing and reusing biomedical data. We converted data to semantic information format using biomedical ontologies in everyday use in the biomedical community and published it as a FAIR endpoint. We have considered natural language questions of three types: single-concept questions, questions with exclusion criteria, and multi-concept questions. Finally, we analysed the performance of the question-answering module we used and its limitations. The source code is publicly available at https://bioinformatics-ua.github.io/BioKBQA/.

Conclusion: We propose a strategy for using information extracted from biomedical data and transformed into a semantic format using open biomedical ontologies. Our method uses natural language to formulate questions to be answered by this semantic data without the direct use of formal query languages.
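To make the three question types named in the abstract concrete, the following is a minimal sketch of how concepts already extracted from a question might be turned into a SPARQL query. It is not the authors' implementation: the `ex:` namespace, the `ex:hasConcept` predicate, the concept names, and the `build_query` helper are all hypothetical placeholders, and the step that extracts concepts from the natural language question is assumed to have happened beforehand.

```python
from typing import Iterable

# Hypothetical vocabulary: this namespace and predicate are placeholders,
# not the schema used by the paper or its FAIR endpoint.
PREFIXES = "PREFIX ex: <http://example.org/catalogue#>\n"


def build_query(include: Iterable[str], exclude: Iterable[str] = ()) -> str:
    """Build a SPARQL query for datasets annotated with every concept in
    `include` and with none of the concepts in `exclude`."""
    include = list(include)
    exclude = list(exclude)
    if not include:
        raise ValueError("at least one concept to include is required")

    lines = [PREFIXES + "SELECT DISTINCT ?dataset WHERE {"]
    # One triple pattern per required concept (single- or multi-concept).
    for concept in include:
        lines.append(f"  ?dataset ex:hasConcept ex:{concept} .")
    # One negated pattern per exclusion criterion.
    for concept in exclude:
        lines.append(
            f"  FILTER NOT EXISTS {{ ?dataset ex:hasConcept ex:{concept} }}"
        )
    lines.append("}")
    return "\n".join(lines)


# Single-concept question, e.g. "Which datasets cover dementia?"
single = build_query(["Dementia"])
# Question with an exclusion criterion, e.g. "... dementia but not stroke?"
with_exclusion = build_query(["Dementia"], exclude=["Stroke"])
# Multi-concept question, e.g. "... dementia and MRI?"
multi = build_query(["Dementia", "MRI"])

print(with_exclusion)
```

Under these assumptions, a user never writes the query by hand; the interface only needs the lists of included and excluded concepts.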
Pages: 12