Cross-species prediction of essential genes in insects

Cited by: 6
Authors
de Castro, Giovanni Marques [1]
Hastenreiter, Zandora [1]
Silva Monteiro, Thiago Augusto [1]
Martins da Silva, Thieres Tayroni [1]
Lobo, Francisco Pereira [1]
Affiliations
[1] Univ Fed Minas Gerais, Inst Ciencias Biol, Dept Genet Ecol & Evolucao, Belo Horizonte, MG, Brazil
Keywords
DATABASE; SEQUENCE; UPDATE
DOI
10.1093/bioinformatics/btac009
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Insects display vast phenotypic diversity and play key ecological roles. Several insect species are also of medical, agricultural and veterinary importance as parasites and disease vectors. Strategies to identify potential essential genes in insects may therefore reduce the resources needed to find molecular players in central processes of insect biology. However, most machine learning predictors of essential genes in multicellular eukaryotes rely on gene features derived from expensive and laborious experiments, such as gene expression profiles or protein-protein interactions, even though some of this information may not be available for most insect species with sequenced genomes.
Results: Here, we present and validate a machine learning strategy to predict essential genes in insects using sequence-based intrinsic attributes (statistical and physicochemical data) together with predictions of subcellular location and transcriptomic data, if available. We gathered information from public databases describing essential and non-essential genes for Drosophila melanogaster (fruit fly, Diptera) and Tribolium castaneum (red flour beetle, Coleoptera). We then computed intrinsic and extrinsic attributes, trained statistical models on one species and tested their ability to predict essential genes in the other. Even models trained using only intrinsic attributes can predict essential genes in the other insect species, including lineage-specific essential genes. Furthermore, including RNA-Seq data is a major factor in increasing classifier performance.
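The train-on-one-species, test-on-the-other design described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the toy protein sequences and labels are invented, the features (amino-acid frequencies plus the fraction of charged residues) are a crude stand-in for the paper's intrinsic attributes, and a nearest-centroid rule replaces whatever statistical models the authors actually trained.

```python
from collections import Counter
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CHARGED = set("DEKR")  # toy physicochemical attribute (assumption)

def intrinsic_features(protein_seq):
    """Return 20 amino-acid frequencies plus the charged-residue fraction."""
    seq = protein_seq.upper()
    n = max(len(seq), 1)  # avoid division by zero on empty input
    counts = Counter(seq)
    freqs = [counts[aa] / n for aa in AMINO_ACIDS]
    charged = sum(counts[aa] for aa in CHARGED) / n
    return freqs + [charged]

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def train(sequences, labels):
    """Fit a nearest-centroid model: one centroid per class (1 = essential)."""
    feats = [intrinsic_features(s) for s in sequences]
    essential = [f for f, y in zip(feats, labels) if y == 1]
    non_essential = [f for f, y in zip(feats, labels) if y == 0]
    return centroid(essential), centroid(non_essential)

def predict(model, sequence):
    """Label a sequence by whichever class centroid is closer."""
    essential_c, non_essential_c = model
    f = intrinsic_features(sequence)
    return 1 if dist(f, essential_c) < dist(f, non_essential_c) else 0

def accuracy(model, sequences, labels):
    """Fraction of sequences whose predicted label matches the given label."""
    hits = sum(predict(model, s) == y for s, y in zip(sequences, labels))
    return hits / len(labels)

# Hypothetical toy data: "species A" is used for training only,
# "species B" for testing only, mirroring the cross-species design.
SPECIES_A_SEQS = ["MKKKKRKKAK", "MKKRKKKKGK", "MLLLLALLLV", "MLLVLLLLIL"]
SPECIES_A_LABELS = [1, 1, 0, 0]
SPECIES_B_SEQS = ["MKRKKKKAKK", "MLLVLLLALL"]
SPECIES_B_LABELS = [1, 0]

if __name__ == "__main__":
    model = train(SPECIES_A_SEQS, SPECIES_A_LABELS)
    print("cross-species accuracy:", accuracy(model, SPECIES_B_SEQS, SPECIES_B_LABELS))
```

The point of the sketch is only the evaluation layout: no sequence from the test species is seen during training, so the score reflects how well sequence-derived features transfer between species.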
Pages: 1504-1513
Page count: 10