From language models to large-scale food and biomedical knowledge graphs

Cited by: 0
Authors
Gjorgjina Cenikj
Lidija Strojnik
Risto Angelski
Nives Ogrinc
Barbara Koroušić Seljak
Tome Eftimov
Affiliations
[1] Jožef Stefan Institute
[2] Jožef Stefan International Postgraduate School
[3] Clinic Doctor 24-hours
Abstract
Knowledge about the interactions between dietary and biomedical factors is scattered across countless research articles in unstructured form (e.g., text and images) and must be structured automatically before it can be provided to medical professionals in a suitable format. Various biomedical knowledge graphs exist; however, they need to be extended with relations between food and biomedical entities. In this study, we evaluate the performance of three state-of-the-art relation-mining pipelines (FooDis, FoodChem, and ChemDis), which extract relations between food, chemical, and disease entities from textual data. We perform two case studies in which relations were automatically extracted by the pipelines and validated by domain experts. The results show that the pipelines can extract relations with an average precision of around 70%, making new discoveries available to domain experts with reduced human effort, since the experts need only evaluate the results instead of finding and reading all new scientific papers.