DeepSF: deep convolutional neural network for mapping protein sequences to folds

Cited by: 106
Authors
Hou, Jie [1]
Adhikari, Badri [2]
Cheng, Jianlin [1, 3]
Affiliations
[1] Univ Missouri, Dept Elect Engn & Comp Sci, Columbia, MO 65211 USA
[2] Univ Missouri St Louis, Dept Math & Comp Sci, St Louis, MO 63121 USA
[3] Univ Missouri, Inst Informat, Columbia, MO 65211 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
SECONDARY STRUCTURE; HOMOLOGY DETECTION; RECOGNITION; PREDICTION; DATABASE; CLASSIFICATION; CATH; SCOP;
DOI
10.1093/bioinformatics/btx780
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Protein fold recognition is an important problem in structural bioinformatics. Almost all traditional fold recognition methods use sequence (homology) comparison to indirectly predict the fold of a target protein from the fold of a template protein with known structure, an approach that cannot explain the relationship between sequence and fold. Only a few methods have been developed to classify protein sequences into folds, and owing to methodological limitations they handle only a small number of folds, which makes them not generally useful in practice.
Results: We develop a deep 1D-convolution neural network (DeepSF) to directly classify any protein sequence into one of 1195 known folds, which is useful for both fold recognition and the study of the sequence-structure relationship. Unlike traditional methods based on sequence alignment (comparison), our method automatically extracts fold-related features from a protein sequence of any length and maps it to the fold space. We train and test our method on datasets curated from SCOP1.75, yielding an average classification accuracy of 75.3%. On the independent testing dataset curated from SCOP2.06, the classification accuracy is 73.0%. We compare our method with a top profile-profile alignment method, HHSearch, on hard template-based and template-free modeling targets of CASP9-12 in terms of fold recognition accuracy. The accuracy of our method is 12.63-26.32% higher than that of HHSearch on template-free modeling targets and 3.39-17.09% higher on hard template-based modeling targets for the top 1, 5 and 10 predicted folds. The hidden features extracted from sequences by our method are robust against sequence mutation, insertion, deletion and truncation, and can be used for other protein pattern recognition problems such as protein clustering, comparison and ranking.
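The idea of mapping a sequence of any length to a fixed set of fold scores can be illustrated with a minimal NumPy sketch. This is not the DeepSF architecture: the abstract does not say how variable length is handled, so global max pooling is assumed here, the single convolution layer and its sizes are arbitrary, and the weights are random and untrained. All function and parameter names (`one_hot`, `conv1d_relu`, `fold_scores`, `top_k_folds`, `params`) are hypothetical. Only the 1195 fold classes and the top 1, 5 and 10 ranking come from the abstract.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
NUM_FOLDS = 1195  # number of fold classes stated in the abstract


def one_hot(seq):
    """Encode a protein sequence as an (L, 20) one-hot matrix."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq):
        x[pos, index[aa]] = 1.0
    return x


def conv1d_relu(x, kernels, bias):
    """'Valid' 1D convolution along the sequence axis, followed by ReLU.

    x: (L, C_in), kernels: (K, C_in, C_out), bias: (C_out,)
    returns: (L - K + 1, C_out)
    """
    k = kernels.shape[0]
    windows = np.stack([x[i:i + k] for i in range(x.shape[0] - k + 1)])
    out = np.einsum("lkc,kco->lo", windows, kernels) + bias
    return np.maximum(out, 0.0)


def fold_scores(seq, params):
    """Map a sequence (length >= kernel width) to NUM_FOLDS probabilities."""
    h = conv1d_relu(one_hot(seq), params["kernels"], params["conv_bias"])
    # Assumption: global max pooling removes the length dimension.
    pooled = h.max(axis=0)
    logits = pooled @ params["w_out"] + params["b_out"]
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


def top_k_folds(probs, k):
    """Indices of the k highest-scoring folds, best first."""
    return np.argsort(probs)[::-1][:k]


# Random, untrained weights: the outputs are meaningless as predictions.
rng = np.random.default_rng(0)
params = {
    "kernels": rng.normal(size=(6, 20, 10)) * 0.1,
    "conv_bias": np.zeros(10),
    "w_out": rng.normal(size=(10, NUM_FOLDS)) * 0.1,
    "b_out": np.zeros(NUM_FOLDS),
}
short = fold_scores("MKTAYIAKQR", params)
long = fold_scores("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", params)
```

Both `short` and `long` have 1195 entries even though the input sequences differ in length, because pooling over the sequence axis leaves one value per filter. A top-k evaluation like the one in the abstract would count a prediction as correct when the true fold index appears in `top_k_folds(probs, k)` for k = 1, 5 or 10.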
Pages: 1295-1303
Page count: 9