LMTRDA: Using logistic model tree to predict MiRNA-disease associations by fusing multi-source information of sequences and similarities

Cited by: 90
Authors
Wang, Lei [1]
You, Zhu-Hong [1]
Chen, Xing [2]
Li, Yang-Ming [3]
Dong, Ya-Nan [4]
Li, Li-Ping [1]
Zheng, Kai [1]
Affiliations
[1] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
[3] Rochester Inst Technol, Dept Elect Comp & Telecommun Engn Technol, Rochester, NY 14623 USA
[4] Cent South Univ, Xiangya Sch Publ Hlth, Changsha, Hunan, Peoples R China
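Funding
National Science Foundation (USA);
Keywords
PROTEIN-PROTEIN INTERACTIONS; MICRORNAS; IDENTIFICATION; NETWORK;
DOI
10.1371/journal.pcbi.1006865
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract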
Emerging evidence has shown that microRNAs (miRNAs) play an important role in human disease research. Identifying potential associations between miRNAs and diseases is significant for advancing pathology, diagnosis, and therapy. However, only a tiny portion of all miRNA-disease pairs in current datasets have been experimentally validated. This motivates the development of high-precision computational methods for predicting real interaction pairs. In this paper, we propose a new model, Logistic Model Tree for predicting miRNA-Disease Association (LMTRDA), which fuses multi-source information including miRNA sequences, miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. In particular, we introduce miRNA sequence information into a miRNA-disease prediction model for the first time and extract its features using a natural language processing technique. In the cross-validation experiment, LMTRDA achieved 90.51% prediction accuracy and 92.55% sensitivity, with an AUC of 90.54%, on the HMDD V3.0 dataset. To further evaluate the performance of LMTRDA, we compared it with models built on different classifiers and feature descriptors. In addition, we validated the predictive ability of LMTRDA on human diseases including Breast Neoplasms, Breast Neoplasms and Lymphoma. As a result, 28, 27 and 26 of the top 30 miRNAs predicted to be associated with these diseases were verified by experiments in different kinds of case studies. These experimental results demonstrate that LMTRDA is a reliable model for predicting associations between miRNAs and diseases.
Author summary
Identification of miRNA-disease associations is considered an important step in the development of diagnosis and therapy. Computational methods contribute to discovering potential disease-related miRNAs.
Based on the assumption that functionally related miRNAs tend to be involved in similar diseases, the LMTRDA model is proposed to prioritize the underlying miRNA-disease associations by fusing multi-source information including miRNA sequences, miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. In cross validation, the promising results demonstrated the effectiveness of the proposed model. We further carried out case studies of three important complex human diseases including Breast Neoplasms, Breast Neoplasms and Lymphoma; 28, 27 and 26 of the top-30 predicted miRNA-disease associations were manually confirmed against recent experimental reports. It is anticipated that the LMTRDA model could prioritize the most promising miRNA-disease associations on a large scale to guide future biological validation experiments, which could further contribute to the understanding of complex disease mechanisms.
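The accuracy and sensitivity figures quoted in the abstract follow the standard confusion-matrix definitions. The sketch below only illustrates those definitions; the function names and the toy counts are hypothetical and are not taken from the paper or from the HMDD V3.0 experiments.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all pairs (associated or not) that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly associated miRNA-disease pairs that were recovered (recall)."""
    return tp / (tp + fn)


# Hypothetical toy counts for one cross-validation fold (NOT from the paper).
tp, fn = 90, 10   # known associations predicted as positive / negative
tn, fp = 85, 15   # non-associated pairs predicted as negative / positive

print(f"accuracy    = {accuracy(tp, tn, fp, fn):.4f}")
print(f"sensitivity = {sensitivity(tp, fn):.4f}")
```

In a k-fold setting, these quantities would typically be computed per fold and then averaged.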
Pages: 18