A System for Medical Information Extraction and Verification from Unstructured Text

Authors
Juric, Damir [1]
Stoilos, Giorgos [2]
Melo, Andre [2]
Moore, Jonathan [1]
Khodadadi, Mohammad [1]
Affiliations
[1] Babylon Hlth, London SW3 3DD, England
[2] Huawei Technol Res & Dev, Edinburgh, Midlothian, Scotland
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A wealth of medical knowledge has been encoded in terminologies such as SNOMED CT, NCI, FMA, and others. However, these resources usually lack information such as relations between diseases, symptoms, and risk factors, which prevents their use in diagnostic or other decision-making applications. In this paper we present a pipeline for extracting such information from unstructured text and enriching medical knowledge bases. Our approach uses Semantic Role Labelling (SRL) and is unsupervised. We show how we dealt with several deficiencies of SRL-based extraction, such as copula verbs, relations expressed through nouns, and the assignment of scores to extracted triples. The system has so far extracted about 120K relations, and in-house doctors have verified about 5K of them. We compared the output of the system with a manually constructed network of diseases, symptoms, and risk factors built by doctors over the course of a year. Our results show that our pipeline extracts good-quality, precise relations and speeds up the knowledge acquisition process considerably.
Pages: 13314 - 13319
Page count: 6