Topic modeling of biomedical text: From words and topics to disease and gene links

Authors
ElShal, Sarah [1]
Mathad, Mithila [1]
Simm, Jaak [1]
Davis, Jesse [1]
Moreau, Yves [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Comp Sci DTAI, IMinds Future Hlth Dept, Dept Elect Engn ESAT, Leuven, Belgium
Keywords
text analysis; pattern recognition; machine learning; topic modelling; disease-gene linkage
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The massive growth of biomedical text makes it very challenging for researchers to review all relevant work and generate all possible hypotheses in a reasonable amount of time. Many text mining methods have been developed to simplify this process and quickly present the researcher with a learned set of biomedical hypotheses that could potentially be validated. Previously, we focused on the task of identifying genes that are linked with a given disease by text mining PubMed abstracts. We applied a word-based concept profile similarity to learn patterns between disease and gene entities and hence identify links between them. In this work, we study an alternative approach based on topic modelling to learn different patterns between the disease and gene entities, and we measure how this affects the identified links. We investigated multiple input corpora, word representations, topic parameters, and similarity measures. On the one hand, our results show that when we (1) learn the topics from a gene-clustered set of abstracts and (2) apply the dot-product similarity measure, we improve on our original methods and identify more correct disease-gene links. On the other hand, the results also show that the learned topics remain limited to the diseases existing in our vocabulary, so scaling the methodology to new disease queries becomes nontrivial.
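The abstract names a dot-product similarity measure applied over learned topics. A minimal sketch of that scoring step alone might look like the following, assuming each disease and each gene has already been mapped to a vector of topic weights. The function names, gene labels, and all numbers are hypothetical and invented for illustration; this record does not give the paper's topic model, vocabulary, or pipeline.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Dot product of two equal-length topic-weight vectors."""
    if len(u) != len(v):
        raise ValueError("topic vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def rank_genes(
    disease_topics: List[float],
    gene_topics: Dict[str, List[float]],
) -> List[Tuple[str, float]]:
    """Score every gene against one disease and sort by descending score."""
    scores = [(gene, dot(disease_topics, vec)) for gene, vec in gene_topics.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Invented toy data (NOT results from the paper): three topics per entity.
disease = [0.7, 0.2, 0.1]
genes = {
    "GENE_A": [0.6, 0.3, 0.1],
    "GENE_B": [0.1, 0.1, 0.8],
    "GENE_C": [0.3, 0.5, 0.2],
}

ranking = rank_genes(disease, genes)
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

In this toy setup a gene ranks highly when its topic weights concentrate on the same topics as the disease, which is one way to read a topic-based disease-gene link.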
Pages: 712-716
Number of pages: 5