Photosynthetic protein classification using genome neighborhood-based machine learning feature

Citations: 6
Authors
Sangphukieo, Apiwat [1, 3]
Laomettachit, Teeraphan [1]
Ruengjitchatchawalya, Marasri [1, 2, 4]
Affiliations
[1] King Mongkut's Univ Technol Thonburi (KMUTT), Sch Bioresources & Technol, Bioinformat & Syst Biol Program, Bangkok 10150, Thailand
[2] KMUTT, Sch Bioresources & Technol, Biotechnol Program, Bangkok 10150, Thailand
[3] KMUTT, Sch Informat Technol, Bangkok 10140, Thailand
[4] KMUTT, Pilot Plant Dev & Training Inst PDTI, Algal Biotechnol Res Grp, Bangkok 10150, Thailand
Keywords
GENE CLUSTERS; PHOTOSYSTEM-II; SEQUENCE; CONSERVATION; ACCLIMATION; CHALLENGES; PREDICTION; NETWORK; LIGHT
DOI
10.1038/s41598-020-64053-w
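Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09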
Abstract
Identification of novel photosynthetic proteins is important for understanding and improving photosynthetic efficiency. Genome neighborhood can provide additional useful information for identifying photosynthetic proteins. We therefore expected that a computational approach, particularly machine learning (ML) with a genome neighborhood-based feature, would facilitate photosynthetic function assignment. Our results revealed a functional relationship between photosynthetic genes and their conserved neighboring genes, as observed by the 'Phylo score', indicating that their functions could be inferred from the genome neighborhood profile. We therefore created a new method for extracting patterns based on the genome neighborhood network (GNN) and applied these patterns to photosynthetic protein classification using ML algorithms. A random forest (RF) classifier using genome neighborhood-based features achieved the highest accuracy, up to 87%, in the classification of photosynthetic proteins and also showed better performance (Matthews correlation coefficient = 0.718) than other available tools, including a sequence similarity search (0.447) and an ML-based method (0.361). Furthermore, we demonstrated the ability of our model to identify novel photosynthetic proteins compared to the other methods. Our classifier is available at http://bicep2.kmutt.ac.th/photomod_standalone, https://bit.ly/2S0I2Ox and DockerHub: https://hub.docker.com/r/asangphukieo/photomod.
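The abstract reports both accuracy and the Matthews correlation coefficient (MCC). As a minimal sketch of how the two metrics are computed for a binary classifier, the Python below derives them from confusion-matrix counts. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper or from the authors' tool.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention
    for the case where a row or column of the matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, NOT results from the paper.
counts = {"tp": 80, "tn": 90, "fp": 10, "fn": 20}
print("accuracy:", round(accuracy(**counts), 3))
print("MCC:", round(matthews_corrcoef(**counts), 3))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are unbalanced, which is one reason it is often reported alongside accuracy.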
Pages: 11
Journal
Scientific Reports, 10