Term Extraction from Medical Documents Using Word Embeddings

Cited by: 0
Authors
Bay, Matthias [1 ]
Bruness, Daniel [2 ]
Herold, Miriam [3 ]
Schulze, Christian [4 ]
Guckert, Michael [4 ]
Minor, Mirjam [3 ]
Affiliations
[1] MINDS Med GmbH, Frankfurt, Germany
[2] TH Mittelhessen, KITE Kompetenzzentrum Informationstechnol, Friedberg, Germany
[3] Goethe Univ, Dept Business Informat, Frankfurt, Germany
[4] TH Mittelhessen, Dept MND, Friedberg, Germany
Keywords
Machine learning; natural language processing; text mining; term extraction; machine learning applications
DOI
10.1109/CIST49399.2021.9357263
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present a new method for extracting discipline-specific terms from medical documents. Because medical text corpora are small and the documents are highly specific, approaches based solely on term frequencies are limited. Combining such methods with procedures that are sensitive to semantic aspects is therefore promising. We use word embeddings in a neighborhood-context-based method, which we call Snowball because it works layer by layer. Snowball is integrated with established methods into an end-to-end pipeline that processes documents to extract relevant terms. A proof of concept is given on a gold standard recently created together with experts in medical coding. The preliminary results highlight the feasibility of our approach and its potential for automated, machine-learning-based text processing in the medical context.
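The abstract does not spell out how Snowball works, so the following is only a minimal sketch of what a layer-wise neighborhood expansion over word embeddings could look like. It is not the authors' algorithm. The function names, the cosine threshold, the layer limit, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_layerwise(embeddings, seeds, threshold=0.9, max_layers=3):
    """Grow a term set layer by layer (hypothetical illustration).

    Starting from seed terms, each new layer collects the words whose
    embedding is close (cosine >= threshold) to a word in the previous layer.
    """
    accepted = {s for s in seeds if s in embeddings}
    frontier = set(accepted)
    layers = [sorted(frontier)]
    for _ in range(max_layers):
        next_layer = set()
        for word, vec in embeddings.items():
            if word in accepted:
                continue
            if any(cosine(vec, embeddings[f]) >= threshold for f in frontier):
                next_layer.add(word)
        if not next_layer:
            break  # nothing new was close enough, so the expansion stops
        accepted |= next_layer
        frontier = next_layer
        layers.append(sorted(next_layer))
    return layers


# Toy vectors invented for this sketch; real embeddings would come from a trained model.
TOY_EMBEDDINGS = {
    "fracture": (1.0, 0.0),
    "femur": (0.9, 0.3),
    "osteosynthesis": (0.7, 0.6),
    "invoice": (0.0, 1.0),
}

LAYERS = expand_layerwise(TOY_EMBEDDINGS, ["fracture"])
print(LAYERS)
```

In this toy setup each layer reaches one step further from the seed term, while the unrelated word never comes close enough to any accepted term to be included.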
Pages: 328-333
Number of pages: 6