Protein-Protein Interaction Network Extraction Using Text Mining Methods Adds Insight into Autism Spectrum Disorder

Cited by: 1
Authors
Nezamuldeen, Leena [1, 2]
Jafri, Mohsin Saleet [1, 3]
Affiliations
[1] George Mason Univ, Sch Syst Biol, Fairfax, VA 22030 USA
[2] King Abdulaziz Univ, King Fahd Med Res Ctr, Jeddah 21589, Saudi Arabia
[3] Univ Maryland, Ctr Biomed Engn & Technol, Sch Med, Baltimore, MD 21201 USA
Source
BIOLOGY-BASEL | 2023 / Vol. 12 / Issue 10
Keywords
artificial intelligence; PPI; protein-protein interaction; text mining; BiLSTM; recurrent neural network; FILAMIN; GENE; PHOSPHORYLATION; ACTIVATION; COMPLEX; SITE; RSK;
DOI
10.3390/biology12101344
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Simple Summary: Research on proteins and their interactions with other proteins yields many new findings that help explain how diseases emerge. However, manual curation of the scientific literature delays new discoveries in the field. Artificial intelligence and deep learning techniques have played a significant part in extracting information from text. In this study, we used text mining and artificial intelligence techniques to extract protein-protein interaction networks from the vast scientific research literature. We created an automated system consisting of three models built with deep learning and natural language processing (NLP) methods. Our first model, which employs recurrent neural networks using sentiment analysis, reached an accuracy of 95%. Our second model, which employs the named entity recognition technique in NLP, achieved an accuracy of 98%. Compared with the protein interaction network that we built by manual curation of more than 30 articles on Autism Spectrum Disorder, the automated system, tested on 6027 abstracts, successfully developed the network of interactions and provided an improved view. Discovering these networks will greatly help physicians and scientists understand how these molecules interact, for physiological, pharmacological, and pathological insight.

Abstract: Text mining methods are being developed to assimilate the continually expanding volume of biomedical textual materials. Understanding protein-protein interaction (PPI) deficits would assist in explaining the genesis of diseases. In this study, we designed an automated system to extract PPIs from the biomedical literature that uses a deep learning sentence classification model, a pretrained word embedding, and a BiLSTM recurrent neural network with additional layers; a conditional random field (CRF) named entity recognition (NER) model; and a shortest-dependency path (SDP) model using the SpaCy library in Python. The system targets sentences that describe PPIs, not sentences that merely mention these proteins in the context of disease discovery or other contexts. Our first model achieved 13% greater precision on the AIMed/BioInfer benchmark corpus than the previous state-of-the-art BiLSTM neural network models. The NER model presented in this study achieved 98% precision on the AIMed/BioInfer corpus, exceeding previous models. To produce an accurate representation of the PPI network, the processes were developed to map the protein interactions in the texts systematically. Overall, evaluating our system on 6027 abstracts pertaining to seven proteins associated with Autism Spectrum Disorder completed the manually curated PPI network for these proteins. For complicated diseases, these networks would assist in understanding how PPI deficits contribute to disease development while also emphasizing the influence of interactions on protein function and biological processes.
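The shortest-dependency path (SDP) step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dependency edges below are written by hand for a toy sentence with placeholder protein names (PROT_A, PROT_B), whereas the paper obtains its parses through SpaCy. The function name and the edge format are assumptions made for illustration only, and tokens are assumed to be unique within the sentence.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the shortest token path between two tokens, or None.

    `edges` is a list of (head, dependent) pairs. The dependency tree is
    treated as an undirected graph and searched breadth-first.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None

    previous = {source: None}  # back-pointers used to rebuild the path
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hand-written toy parse (an assumption, not parser output) of:
# "PROT_A strongly phosphorylates PROT_B in neurons"
TOY_EDGES = [
    ("phosphorylates", "PROT_A"),
    ("phosphorylates", "strongly"),
    ("phosphorylates", "PROT_B"),
    ("phosphorylates", "in"),
    ("in", "neurons"),
]

sdp = shortest_dependency_path(TOY_EDGES, "PROT_A", "PROT_B")
print(sdp)
```

In this toy case the path between the two protein mentions keeps the linking verb and drops the adverb and the prepositional phrase, which is the usual motivation for using an SDP when deciding whether a sentence states an interaction.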
Pages: 20