Topic modeling of biomedical text: From words and topics to disease and gene links

Cited by: 0
Authors
ElShal, Sarah [1]
Mathad, Mithila [1]
Simm, Jaak [1]
Davis, Jesse [1]
Moreau, Yves [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Comp Sci DTAI, IMinds Future Hlth Dept, Dept Elect Engn ESAT, Leuven, Belgium
Keywords
text analysis; pattern recognition; machine learning; topic modelling; disease-gene linkage
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The massive growth of biomedical text makes it very challenging for researchers to review all relevant work and generate all possible hypotheses in a reasonable amount of time. Many text mining methods have been developed to simplify this process and quickly present the researcher with a learned set of biomedical hypotheses that could potentially be validated. Previously, we focused on the task of identifying genes that are linked with a given disease by text mining PubMed abstracts. We applied a word-based concept profile similarity to learn patterns between disease and gene entities and hence identify links between them. In this work, we study an alternative approach based on topic modelling to learn different patterns between the disease and gene entities, and we measure how this affects the identified links. We investigated multiple input corpora, word representations, topic parameters, and similarity measures. On the one hand, our results show that when we (1) learn the topics from an input set of gene-clustered abstracts and (2) apply the dot-product similarity measure, we improve on our original methods and identify more correct disease-gene links. On the other hand, the results also show that the learned topics remain limited to the diseases existing in our vocabulary, so scaling the methodology to new disease queries becomes non-trivial.
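As a rough illustration of the dot-product similarity step mentioned in the abstract, the sketch below ranks candidate genes against a disease by the dot product of their topic vectors. It is only a toy under stated assumptions: the topic vectors, the gene names, and the helpers `dot` and `rank_genes` are hypothetical, and the authors' actual pipeline (corpus construction, topic learning, evaluation) is not reproduced here.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Dot product of two equal-length topic vectors."""
    if len(u) != len(v):
        raise ValueError("topic vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def rank_genes(
    disease_topics: List[float],
    gene_topics: Dict[str, List[float]],
) -> List[Tuple[str, float]]:
    """Return (gene, score) pairs sorted by descending dot-product score.

    Hypothetical helper: assumes every entity is already represented
    as a distribution over the same set of topics.
    """
    scores = [(gene, dot(disease_topics, vec)) for gene, vec in gene_topics.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up topic distributions over three topics (illustration only).
disease = [0.7, 0.2, 0.1]
genes = {
    "GENE_A": [0.8, 0.1, 0.1],
    "GENE_B": [0.1, 0.8, 0.1],
    "GENE_C": [0.3, 0.3, 0.4],
}

ranking = rank_genes(disease, genes)
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

In this toy setting, genes whose topic mass overlaps most with the disease's dominant topics rise to the top of the ranking; a gene or disease absent from the topic vocabulary has no vector to compare, which mirrors the scaling limitation noted in the abstract.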
Pages: 712-716
Page count: 5