Protein transfer learning improves identification of heat shock protein families

Cited by: 14
Authors
Min, Seonwoo [1]
Kim, HyunGi [1]
Lee, Byunghan [2]
Yoon, Sungroh [1, 3, 4]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Seoul Natl Univ Sci & Technol, Dept Elect & IT Media Engn, Seoul, South Korea
[3] Seoul Natl Univ, Dept Biol Sci, Interdisciplinary Program Artificial Intelligence, Interdisciplinary Program Bioinformat, ASRI, INMC, Seoul, South Korea
[4] Seoul Natl Univ, Inst Engn Res, Seoul, South Korea
Source
PLOS ONE | 2021 / Vol. 16 / Issue 05
Funding
National Research Foundation of Singapore;
DOI
10.1371/journal.pone.0251865
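Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;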
Abstract
Heat shock proteins (HSPs) play a pivotal role as molecular chaperones against unfavorable conditions. Although HSPs are of great importance, their computational identification remains a significant challenge. Previous studies have two major limitations. First, they relied heavily on amino acid composition features, which inevitably limited their prediction performance. Second, their prediction performance was overestimated because of independent two-stage evaluations and train-test data redundancy. To overcome these limitations, we introduce two novel deep learning algorithms: (1) the time-efficient DeepHSP and (2) the high-performance DeeperHSP. DeepHSP is a convolutional neural network (CNN)-based model that classifies non-HSPs and six HSP families simultaneously. It outperforms state-of-the-art algorithms while taking 14-15 times less time for both training and inference. We further improve on DeepHSP by taking advantage of protein transfer learning: while DeepHSP is trained on raw protein sequences, DeeperHSP is trained on top of pre-trained protein representations. As a result, DeeperHSP remarkably outperforms state-of-the-art algorithms, increasing F1 scores in cross-validation and independent test experiments by 20% and 10%, respectively. We envision that the proposed algorithms can provide a proteome-wide prediction of HSPs and help in various downstream analyses for pathology and clinical research.
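The abstract contrasts independent two-stage evaluation with classifying non-HSPs and the six HSP families in a single step, and it reports results as F1 scores. The following is a minimal sketch of how one F1 score over all seven classes could be computed. The abstract does not say how F1 is averaged, so macro-averaging is an assumption here, and the label names and the `macro_f1` helper are hypothetical placeholders rather than anything taken from the paper.

```python
from collections import Counter

# Hypothetical label set: one non-HSP class plus six placeholder HSP families.
LABELS = ["non-HSP"] + [f"family_{i}" for i in range(1, 7)]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro-averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true or predicted examples scores 0.0 here.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)
```

Because every sequence is scored against all seven labels at once, a non-HSP that is wrongly assigned to a family lowers that family's F1, an error that family-only evaluation on known HSPs would not count.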
Pages: 14