Investigation of protein family relationships with deep learning

Cited by: 0
Authors
Ponamareva, Irina [1, 2]
Andreeva, Antonina [1]
Bileschi, Maxwell L. [3]
Colwell, Lucy [2, 3]
Bateman, Alex [1]
Affiliations
[1] European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge CB10 1SD, England
[2] University of Cambridge, Department of Chemistry, Cambridge CB2 1EW, England
[3] Google Research, Cambridge, MA 02142, USA
Source
Bioinformatics Advances | 2024 / Volume 4 / Issue 1
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK
Keywords
DOMAIN; SEQUENCE
DOI
10.1093/bioadv/vbae132
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Motivation: In this article, we propose a method for finding similarities between Pfam families based on the pre-trained neural network ProtENN2. We use ProtENN2's per-residue embeddings to produce new high-dimensional per-family embeddings, develop an approach for calculating inter-family similarity scores from these embeddings, and evaluate its predictions using structure comparison.
Results: We apply our method to Pfam annotation by refining clan membership for Pfam families, suggesting both new members of existing clans and potential new clans for future Pfam releases. We investigate some of the failure modes of our approach, and this analysis suggests directions for future improvements. Our method is relatively simple, has few parameters, and could be applied to other protein family classification models. Overall, our work suggests potential benefits of employing deep learning for improving our understanding of protein family relationships and of the functions of previously uncharacterized families.
Availability and implementation: github.com/iponamareva/ProtCNNSim, 10.5281/zenodo.10091909.
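The abstract does not say how per-residue embeddings are aggregated or which similarity measure is used, so the following is only a minimal sketch of the general idea, not the ProtCNNSim implementation. It assumes mean pooling over residues and over member sequences, and cosine similarity between the resulting per-family vectors. The function names, the family labels (`FAM_A`, `FAM_B`, `FAM_C`), and the three-dimensional toy vectors are all hypothetical; real ProtENN2 embeddings would be far higher-dimensional.

```python
import math
from statistics import fmean


def mean_pool(vectors):
    """Average a list of equal-length vectors component-wise."""
    return [fmean(column) for column in zip(*vectors)]


def family_embedding(sequence_embeddings):
    """Build one vector per family (assumed scheme, not the paper's).

    Each member sequence is a list of per-residue vectors; we pool over
    residues first, then over the family's member sequences.
    """
    return mean_pool([mean_pool(seq) for seq in sequence_embeddings])


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy per-residue embeddings (hypothetical data): family -> sequences -> residues.
families = {
    "FAM_A": [[[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]], [[0.9, 0.1, 0.0]]],
    "FAM_B": [[[0.9, 0.0, 0.1], [1.0, 0.1, 0.0]]],
    "FAM_C": [[[0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]],
}

embeddings = {name: family_embedding(seqs) for name, seqs in families.items()}
sim_ab = cosine_similarity(embeddings["FAM_A"], embeddings["FAM_B"])
sim_ac = cosine_similarity(embeddings["FAM_A"], embeddings["FAM_C"])
```

In this toy setup `FAM_A` and `FAM_B` point in nearly the same direction and score higher than `FAM_A` and `FAM_C`. In a clan-refinement setting, high-scoring pairs of this kind would be the candidates to check against structure comparison.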
Pages: 14