Protein-Protein Interaction Network Extraction Using Text Mining Methods Adds Insight into Autism Spectrum Disorder

Cited by: 1
Authors
Nezamuldeen, Leena [1, 2]
Jafri, Mohsin Saleet [1, 3]
Affiliations
[1] George Mason Univ, Sch Syst Biol, Fairfax, VA 22030 USA
[2] King Abdulaziz Univ, King Fahd Med Res Ctr, Jeddah 21589, Saudi Arabia
[3] Univ Maryland, Ctr Biomed Engn & Technol, Sch Med, Baltimore, MD 21201 USA
Source
BIOLOGY-BASEL | 2023 / Vol. 12 / Issue 10
Keywords
artificial intelligence; PPI; protein-protein interaction; text mining; BiLSTM; recurrent neural network; FILAMIN; GENE; PHOSPHORYLATION; ACTIVATION; COMPLEX; SITE; RSK;
DOI
10.3390/biology12101344
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Simple Summary: Research on proteins and their interactions with other proteins yields many new findings that help explain how diseases emerge. However, manual curation of the scientific literature delays new discoveries in the field. Artificial intelligence and deep learning techniques have played a significant part in extracting information from text. In this study, we used text mining and artificial intelligence techniques to extract protein-protein interaction networks from the vast scientific research literature. We created an automated system consisting of three models built with deep learning and natural language processing (NLP) methods. Our first model, which employs recurrent neural networks using sentiment analysis, achieved an accuracy of 95%. Our second model, which employs the named entity recognition technique in NLP, achieved an accuracy of 98%. Compared with the protein interaction network that we obtained by manual curation of more than 30 articles on Autism Spectrum Disorder, the automated system, tested on 6027 abstracts, successfully developed the network of interactions and provided an improved view. Discovering these networks will greatly help physicians and scientists understand how these molecules interact, for physiological, pharmacological, and pathological insight.

Abstract: Text mining methods are being developed to assimilate the continually expanding volume of biomedical text. Understanding protein-protein interaction (PPI) deficits would help explain the genesis of diseases. In this study, we designed an automated system to extract PPIs from the biomedical literature. The system combines a deep learning sentence classification model (a pretrained word embedding and a BiLSTM recurrent neural network with additional layers), a conditional random field (CRF) named entity recognition (NER) model, and a shortest-dependency path (SDP) model built with the spaCy library in Python. The automated system targets sentences that describe PPIs, not sentences that merely mention these proteins in the context of disease discovery or other topics. Our first model achieved 13% greater precision on the AIMed/BioInfer benchmark corpus than the previous state-of-the-art BiLSTM neural network models. The NER model presented in this study achieved 98% precision on the AIMed/BioInfer corpus, exceeding previous models. To produce an accurate representation of the PPI network, processes were developed to systematically map the protein interactions in the texts. Overall, evaluating our system on 6027 abstracts pertaining to seven proteins associated with Autism Spectrum Disorder completed the manually curated PPI network for these proteins. For complicated diseases, these networks would assist in understanding how PPI deficits contribute to disease development while also emphasizing the influence of interactions on protein function and biological processes.
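
As a rough illustration of the shortest-dependency path (SDP) idea mentioned in the abstract, the sketch below finds the shortest path between two protein mentions in a dependency graph. It is not the authors' implementation: the sentence, the protein names ProtA and ProtB, the hand-written dependency edges, and the function name shortest_dependency_path are all hypothetical, and a plain breadth-first search stands in for whatever the paper builds on top of a spaCy parse.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Breadth-first search over an undirected view of a dependency tree.

    `edges` is a list of (head, child) token pairs. Returns the list of
    tokens on the shortest path from `source` to `target`, or None if
    either token is missing or no path exists.
    """
    graph = {}
    for head, child in edges:
        graph.setdefault(head, set()).add(child)
        graph.setdefault(child, set()).add(head)
    if source not in graph or target not in graph:
        return None

    previous = {source: None}  # token -> token it was reached from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back to the source to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical, hand-written parse of "ProtA binds to ProtB in neurons".
# In practice these edges would come from a dependency parser.
EDGES = [
    ("binds", "ProtA"),   # subject
    ("binds", "to"),      # preposition
    ("to", "ProtB"),      # object of the preposition
    ("binds", "in"),      # preposition
    ("in", "neurons"),    # object of the preposition
]

path = shortest_dependency_path(EDGES, "ProtA", "ProtB")
print(path)  # ['ProtA', 'binds', 'to', 'ProtB']
```

The path keeps the verb that links the two proteins and drops the unrelated phrase "in neurons", which is why SDP features are often used to decide whether a sentence states an interaction or merely mentions both proteins together.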
Pages: 20