Deep learning on chaos game representation for proteins

Cited by: 38
Authors
Loechel, Hannah F. [1]
Eger, Dominic [1]
Sperlea, Theodor [1]
Heider, Dominik [1]
Affiliations
[1] Philipps Univ Marburg, Dept Math & Comp Sci, D-35032 Marburg, Germany
Keywords
DRUG-RESISTANCE; SEQUENCES; PREDICTION; CLASSIFICATION
DOI
10.1093/bioinformatics/btz493
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Classification of protein sequences is a major task in bioinformatics with many applications. Several machine learning methods are applied to these problems, such as support vector machines (SVM), random forests (RF) and neural networks (NN). All of these methods require that protein sequences first be made machine-readable and comparable, for which different encodings exist. These encodings are typically based on physical or chemical properties of the sequence. However, given the outstanding performance of deep neural networks (DNN) on image recognition, we used frequency matrix chaos game representation (FCGR) to encode protein sequences as images. In this study, we compare the performance of SVMs, RFs and DNNs trained on FCGR-encoded protein sequences. While the original chaos game representation (CGR) has been used mainly for genome sequence encoding and classification, we modified it to also work for protein sequences, resulting in an n-flakes representation, an image with several icosagons.
Results: We show that all applied machine learning techniques (RF, SVM and DNN) give promising results compared to state-of-the-art methods on our benchmark datasets, with DNNs outperforming the other methods, and that FCGR is a promising new encoding method for protein sequences.
Availability and implementation: https://cran.r-project.org/.
Supplementary information: Supplementary data are available at Bioinformatics online.
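To make the encoding idea in the abstract concrete, here is a minimal Python sketch of a frequency matrix built from a chaos game on a regular 20-gon. It is an illustration under stated assumptions, not the authors' implementation: the vertex order of the amino acids, the starting point at the centre, the image resolution, and the use of the standard non-overlapping n-flake contraction ratio are all assumptions, since the abstract does not specify them. The function names (`contraction_ratio`, `cgr_points`, `fcgr`) are hypothetical.

```python
import math

# Assumed vertex order: the 20 standard amino acids, alphabetical by
# one-letter code. The abstract does not state which order is used.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def contraction_ratio(n):
    """Scale factor of a non-overlapping n-flake (assumed choice of ratio)."""
    cos_sum = sum(math.cos(2.0 * math.pi * k / n) for k in range(1, n // 4 + 1))
    return 1.0 / (2.0 * (1.0 + cos_sum))


def cgr_points(sequence, alphabet=AMINO_ACIDS):
    """Play the chaos game on a regular polygon with one vertex per symbol."""
    n = len(alphabet)
    r = contraction_ratio(n)
    vertices = {
        symbol: (math.cos(2.0 * math.pi * i / n), math.sin(2.0 * math.pi * i / n))
        for i, symbol in enumerate(alphabet)
    }
    x, y = 0.0, 0.0  # assumed starting point: centre of the polygon
    points = []
    for symbol in sequence.upper():
        if symbol not in vertices:
            continue  # skip ambiguous or non-standard residues
        vx, vy = vertices[symbol]
        # Move towards the vertex, keeping a fraction r of the old offset.
        x = vx + r * (x - vx)
        y = vy + r * (y - vy)
        points.append((x, y))
    return points


def fcgr(sequence, resolution=32):
    """Count chaos game points per cell of a resolution x resolution grid."""
    grid = [[0] * resolution for _ in range(resolution)]
    for x, y in cgr_points(sequence):
        # Map [-1, 1] to grid indices and clamp the upper boundary.
        col = min(max(int((x + 1.0) / 2.0 * resolution), 0), resolution - 1)
        row = min(max(int((1.0 - y) / 2.0 * resolution), 0), resolution - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    matrix = fcgr("MKTAYIAKQR")  # arbitrary example sequence
    print(sum(map(sum, matrix)), "points binned into a 32 x 32 matrix")
```

The resulting count matrix can be treated as a grayscale image and passed to an image classifier, or flattened into a feature vector for an SVM or random forest. For a four-letter alphabet the same ratio formula gives 0.5, which is the midpoint rule of the original CGR for nucleotide sequences.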
Pages: 272-279
Number of pages: 8