Cross-species prediction of essential genes in insects

Cited by: 6
Authors
de Castro, Giovanni Marques [1]
Hastenreiter, Zandora [1]
Silva Monteiro, Thiago Augusto [1]
Martins da Silva, Thieres Tayroni [1]
Lobo, Francisco Pereira [1]
Affiliations
[1] Univ Fed Minas Gerais, Inst Ciencias Biol, Dept Genet Ecol & Evolucao, Belo Horizonte, MG, Brazil
Keywords
DATABASE; SEQUENCE; UPDATE;
DOI
10.1093/bioinformatics/btac009
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Insects possess a vast phenotypic diversity and key ecological roles. Several insect species also have medical, agricultural and veterinary importance as parasites and disease vectors. Therefore, strategies to identify potential essential genes in insects may reduce the resources needed to find molecular players in central processes of insect biology. However, most predictors of essential genes in multicellular eukaryotes using machine learning rely on expensive and laborious experimental data to be used as gene features, such as gene expression profiles or protein-protein interactions, even though some of this information may not be available for the majority of insect species with genomic sequences available.
Results: Here, we present and validate a machine learning strategy to predict essential genes in insects using sequence-based intrinsic attributes (statistical and physicochemical data) together with the predictions of subcellular location and transcriptomic data, if available. We gathered information available in public databases describing essential and non-essential genes for Drosophila melanogaster (fruit fly, Diptera) and Tribolium castaneum (red flour beetle, Coleoptera). We proceeded by computing intrinsic and extrinsic attributes that were used to train statistical models in one species and tested by their capability of predicting essential genes in the other. Even models trained using only intrinsic attributes are capable of predicting genes in the other insect species, including the prediction of lineage-specific essential genes. Furthermore, the inclusion of RNA-Seq data is a major factor to increase classifier performance.
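The cross-species evaluation described in the abstract (compute sequence-based attributes, train on one species, test on the other) can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the feature set (amino-acid composition plus a hydrophobic fraction), the nearest-centroid classifier, the function names, and the toy sequences and labels are all assumptions made up for the example and carry no biological meaning.

```python
from collections import Counter
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")  # assumption: one simple choice of hydrophobic residues


def intrinsic_features(protein):
    """Return sequence-only attributes: 20 composition values plus hydrophobic fraction."""
    counts = Counter(protein)
    n = len(protein)
    composition = [counts[a] / n for a in AMINO_ACIDS]
    hydrophobic = sum(counts[a] for a in HYDROPHOBIC) / n
    return composition + [hydrophobic]


def centroid(vectors):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train(labelled):
    """Fit a nearest-centroid model from (sequence, is_essential) pairs."""
    by_label = {True: [], False: []}
    for seq, label in labelled:
        by_label[label].append(intrinsic_features(seq))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, protein):
    """Assign the label whose centroid is closest in Euclidean distance."""
    x = intrinsic_features(protein)
    return min(model, key=lambda label: dist(x, model[label]))


def accuracy(model, labelled):
    """Fraction of (sequence, label) pairs predicted correctly."""
    hits = sum(predict(model, seq) == label for seq, label in labelled)
    return hits / len(labelled)


# Hypothetical toy data: invented sequences and labels standing in for two species.
species_a = [
    ("MKKEEKKEEKRK", True),
    ("MEEKKRKEEKKE", True),
    ("MLLVVAILLFVA", False),
    ("MAVILLFVVAIL", False),
]
species_b = [
    ("MKEKEKRKEEKK", True),
    ("MVLLAIVFLLAV", False),
]

model = train(species_a)                               # train on one species
cross_species_accuracy = accuracy(model, species_b)    # test on the other
```

Swapping `species_a` and `species_b` gives the reverse direction of the evaluation. A real study would use a much richer attribute set and a stronger classifier than this sketch does.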
Pages: 1504-1513
Page count: 10