Querying semantic catalogues of biomedical databases

Cited by: 7
Authors
Pereira, Arnaldo [1]
Almeida, Joao Rafael [1, 2]
Lopes, Rui Pedro [3]
Oliveira, Jose Luis [1]
Affiliations
[1] Univ Aveiro, DETI, IEETA, LASI, Aveiro, Portugal
[2] Univ A Coruna, Dept Computat, La Coruna, Spain
[3] Polytech Inst Braganca, CeDRI, Braganca, Portugal
Keywords
Biomedical data; Knowledge bases; Semantic data; Linked data; Information extraction; Natural language interfaces; Question answering; LINKED DATA; PLATFORM; CHALLENGES; DISCOVERY
DOI
10.1016/j.jbi.2022.104272
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Background: Secondary use of health data is a valuable source of knowledge that boosts observational studies, leading to important discoveries in the medical and biomedical sciences. The fundamental guiding principle for performing a successful observational study is the research question and the approach in advance of executing a study. However, in multi-centre studies, finding suitable datasets to support the study is challenging, time-consuming, and sometimes impossible without a deep understanding of each dataset.
Methods: We propose a strategy for retrieving biomedical datasets of interest that were semantically annotated, using an interface built by applying a methodology for transforming natural language questions into formal language queries. The advantages of creating biomedical semantic data are enhanced by using natural language interfaces to issue complex queries without manipulating a logical query language.
Results: Our methodology was validated using Alzheimer's disease datasets published in a European platform for sharing and reusing biomedical data. We converted data to semantic information format using biomedical ontologies in everyday use in the biomedical community and published it as a FAIR endpoint. We have considered natural language questions of three types: single-concept questions, questions with exclusion criteria, and multi-concept questions. Finally, we analysed the performance of the question-answering module we used and its limitations. The source code is publicly available at https://bioinformatics-ua.github.io/BioKBQA/.
Conclusion: We propose a strategy for using information extracted from biomedical data and transformed into a semantic format using open biomedical ontologies. Our method uses natural language to formulate questions to be answered by this semantic data without the direct use of formal query languages.
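The three question types named in the abstract (single-concept, exclusion criteria, multi-concept) can be illustrated with a minimal sketch of the formal queries they might map to. This is not the authors' implementation: the `ex:` namespace, the `ex:Dataset` class, the `ex:hasConcept` predicate, and the concept names are all hypothetical placeholders. The sketch also assumes the concepts have already been extracted from the natural language question and only shows the final step, assembling a SPARQL query string.

```python
import re
from typing import Sequence

# Hypothetical vocabulary: placeholder IRIs, not those used in the paper.
PREFIXES = "PREFIX ex: <http://example.org/catalogue#>"
_LOCAL_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def _term(concept: str) -> str:
    """Turn a concept label into a prefixed name, rejecting unsafe input."""
    if not _LOCAL_NAME.fullmatch(concept):
        raise ValueError(f"not a simple local name: {concept!r}")
    return f"ex:{concept}"


def build_query(include: Sequence[str], exclude: Sequence[str] = ()) -> str:
    """Build a SPARQL SELECT for datasets annotated with every concept in
    `include` and with none of the concepts in `exclude`."""
    if not include:
        raise ValueError("at least one concept is required")
    lines = [
        PREFIXES,
        "SELECT DISTINCT ?dataset WHERE {",
        "  ?dataset a ex:Dataset .",
    ]
    # One triple pattern per required concept (all must match).
    lines += [f"  ?dataset ex:hasConcept {_term(c)} ." for c in include]
    # One negated pattern per excluded concept.
    lines += [
        f"  FILTER NOT EXISTS {{ ?dataset ex:hasConcept {_term(c)} }}"
        for c in exclude
    ]
    lines.append("}")
    return "\n".join(lines)


# One query per question type described in the abstract.
single_concept = build_query(["AlzheimersDisease"])
with_exclusion = build_query(["AlzheimersDisease"], exclude=["Stroke"])
multi_concept = build_query(["AlzheimersDisease", "MRI"])

print(with_exclusion)
```

Under these assumptions, a question with an exclusion criterion differs from a single-concept question only by the added `FILTER NOT EXISTS` clause, and a multi-concept question only by additional triple patterns. The resulting string could then be sent to a SPARQL endpoint with any standard client.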
Pages: 12
Related papers
50 in total
  • [31] Cooperative querying in relational databases
    Ramos, CV
    Braga, JL
    Laender, AHF
    XVII INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY, PROCEEDINGS, 1997, : 190 - 198
  • [32] Querying by sketch geographical databases
    Han, Yu
    PROCEEDINGS OF THE 2015 4TH INTERNATIONAL CONFERENCE ON SENSORS, MEASUREMENT AND INTELLIGENT MATERIALS, 2016, 43 : 313 - 317
  • [33] QUERYING DATA IN NOSQL DATABASES
    Babic, Andrea
    Jaksic, Danijela
    Poscic, Patrizia
    ZBORNIK VELEUCILISTA U RIJECI-JOURNAL OF THE POLYTECHNICS OF RIJEKA, 2019, 7 (01): : 257 - 270
  • [34] Integration and querying of distributed databases
    Hu, GZ
    Fernandes, H
    PROCEEDINGS OF THE 2003 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2003, : 167 - 174
  • [35] Querying sequence databases with transducers
    Bonner, AJ
    Mecca, G
    ACTA INFORMATICA, 2000, 36 (07) : 511 - 544
  • [36] Querying Graph Databases at Scale
    Hogan, Aidan
    Vrgoc, Domagoj
    COMPANION OF THE 2024 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, SIGMOD-COMPANION 2024, 2024, : 585 - 589
  • [37] Graphical querying of multidimensional Databases
    Ravat, Franck
    Teste, Olivier
    Tournier, Ronan
    Zurfluh, Gilles
    ADVANCES IN DATABASES AND INFORMATION SYSTEMS, PROCEEDINGS, 2007, 4690 : 298 - +
  • [38] Querying Communities in Relational Databases
    Qin, Lu
    Yu, Jeffrey Xu
    Chang, Lijun
    Tao, Yufei
    ICDE: 2009 IEEE 25TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2009, : 724 - 735
  • [39] Querying sequence databases with transducers
    Bonner, AJ
    Mecca, G
    DATABASE PROGRAMMING LANGUAGES, 1998, 1369 : 118 - 135
  • [40] Querying and Learning in Probabilistic Databases
    Dylla, Maximilian
    Theobald, Martin
    Miliaraki, Iris
    REASONING WEB: REASONING ON THE WEB IN THE BIG DATA ERA, 2014, 8714 : 313 - +