Protein-Protein Interaction Network Extraction Using Text Mining Methods Adds Insight into Autism Spectrum Disorder

Cited by: 1
Authors
Nezamuldeen, Leena [1, 2]
Jafri, Mohsin Saleet [1, 3]
Affiliations
[1] George Mason Univ, Sch Syst Biol, Fairfax, VA 22030 USA
[2] King Abdulaziz Univ, King Fahd Med Res Ctr, Jeddah 21589, Saudi Arabia
[3] Univ Maryland, Ctr Biomed Engn & Technol, Sch Med, Baltimore, MD 21201 USA
Source
BIOLOGY-BASEL | 2023 / Vol. 12 / Issue 10
Keywords
artificial intelligence; PPI; protein-protein interaction; text mining; BiLSTM; recurrent neural network; FILAMIN; GENE; PHOSPHORYLATION; ACTIVATION; COMPLEX; SITE; RSK;
DOI
10.3390/biology12101344
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09;
Abstract
Simple Summary: Research on proteins and their interactions with other proteins yields many new findings that help explain how diseases emerge. However, manual curation of the scientific literature delays new discoveries in the field. Artificial intelligence and deep learning techniques have played a significant part in extracting information from text. In this study, we used text mining and artificial intelligence techniques to extract protein-protein interaction networks from the vast scientific research literature. We created an automated system consisting of three models built with deep learning and natural language processing (NLP) methods. Our first model, which employs recurrent neural networks using sentiment analysis, achieved an accuracy of 95%. Our second model, which employs the named entity recognition technique in NLP, achieved an accuracy of 98%. Compared with the protein interaction network we obtained by manual curation of more than 30 articles on Autism Spectrum Disorder, the automated system, tested on 6027 abstracts, successfully developed the network of interactions and provided an improved view. Discovering these networks will greatly help physicians and scientists understand how these molecules interact, offering physiological, pharmacological, and pathological insight.

Abstract: Text mining methods are being developed to assimilate the continually expanding volume of biomedical text. Understanding protein-protein interaction (PPI) deficits would help explain the genesis of diseases. In this study, we designed an automated system to extract PPIs from the biomedical literature. It uses a deep learning sentence classification model with a pretrained word embedding and a BiLSTM recurrent neural network with additional layers, a conditional random field (CRF) named entity recognition (NER) model, and a shortest-dependency path (SDP) model built with the spaCy library in Python. The system targets sentences that describe PPIs, not sentences that merely mention these proteins in the context of disease discovery or other topics. Our first model achieved 13% greater precision on the AIMed/BioInfer benchmark corpus than the previous state-of-the-art BiLSTM neural network models. The NER model presented in this study achieved 98% precision on the AIMed/BioInfer corpus, exceeding previous models. To produce an accurate representation of the PPI network, we developed procedures to systematically map the protein interactions in the texts. Overall, evaluating our system on 6027 abstracts pertaining to seven proteins associated with Autism Spectrum Disorder completed the manually curated PPI network for these proteins. For complicated diseases, these networks would assist in understanding how PPI deficits contribute to disease development while also emphasizing the influence of interactions on protein function and biological processes.
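The shortest-dependency path (SDP) step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract states only that spaCy is used, so the example below replaces the parser with a hand-written, hypothetical dependency parse of a made-up sentence ("ProteinA binds and activates ProteinB in neurons") and finds the path with a plain breadth-first search. The function name `shortest_dependency_path`, the edge list `EXAMPLE_EDGES`, and the protein names are all assumptions made for illustration.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the shortest token path between two tokens, or None.

    `edges` is a list of (head, child) pairs from a dependency parse.
    The parse is treated as an undirected graph, which is the usual
    way an SDP between two entity mentions is computed.
    """
    graph = {}
    for head, child in edges:
        graph.setdefault(head, set()).add(child)
        graph.setdefault(child, set()).add(head)
    if source not in graph or target not in graph:
        return None

    # Breadth-first search, remembering how each token was reached.
    previous = {source: None}
    queue = deque([source])
    while queue:
        token = queue.popleft()
        if token == target:
            path = []
            while token is not None:
                path.append(token)
                token = previous[token]
            return path[::-1]
        for neighbour in sorted(graph[token]):
            if neighbour not in previous:
                previous[neighbour] = token
                queue.append(neighbour)
    return None


# Hypothetical, hand-written parse of the made-up sentence
# "ProteinA binds and activates ProteinB in neurons".
# In a real pipeline a dependency parser would supply these edges.
EXAMPLE_EDGES = [
    ("binds", "ProteinA"),
    ("binds", "and"),
    ("binds", "activates"),
    ("activates", "ProteinB"),
    ("activates", "in"),
    ("in", "neurons"),
]

path = shortest_dependency_path(EXAMPLE_EDGES, "ProteinA", "ProteinB")
print(path)  # ['ProteinA', 'binds', 'activates', 'ProteinB']
```

The path keeps only the words that connect the two protein mentions (here the verbs "binds" and "activates") and drops unrelated tokens such as "in neurons", which is why an SDP is a compact signal for whether a sentence actually describes an interaction.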
Pages: 20