Unsupervised AI reveals insect species-specific genome signatures

Cited by: 0
Authors
Sawada, Yui [1]
Minei, Ryuhei [1]
Tabata, Hiromasa [1]
Ikemura, Toshimichi [1]
Wada, Kennosuke [1]
Wada, Yoshiko [1]
Nagata, Hiroshi [1]
Iwasaki, Yuki [1]
Affiliation
[1] Nagahama Inst Biosci & Technol, Dept Biosci, Tamura, Japan
Source
PEERJ | 2024 / Vol. 12
Keywords
Chromatin; Genome signature; Insect genome; Oligonucleotide usage; Transcription factor binding motifs; Unsupervised machine learning; SYNONYMOUS CODON USAGE; FACTOR-BINDING MOTIFS; ORGANIZING MAP SOM; HETEROCHROMATIN; CHROMOSOME; LANDSCAPE; SEQUENCES; DOMAINS; REGIONS; RANGE;
DOI
10.7717/peerj.17025
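Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code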
07 ; 0710 ; 09 ;
Abstract
Insects are a highly diverse group and possess a wide variety of traits, including the presence or absence of wings and of metamorphosis. These diverse traits are of great interest for studying genome evolution, and numerous comparative genomic studies have examined a wide phylogenetic range of insects. Here, we analyzed 22 insects belonging to a wide phylogenetic range (Endopterygota, Paraneoptera, Polyneoptera, Palaeoptera, and other insects) by applying a batch-learning self-organizing map (BLSOM) to the oligonucleotide compositions of their genomic fragments (100-kb or 1-Mb sequences). BLSOM is an unsupervised machine learning algorithm that can extract species-specific characteristics of oligonucleotide compositions (genome signatures). The genome signature is of particular interest in terms of the mechanisms and biological significance underlying the species-specific differences; it can also serve as a powerful probe for exploring the roles of genome sequences other than protein coding and for uncovering features hidden in the genome sequence. Because BLSOM is an unsupervised clustering method, sequences were clustered based on oligonucleotide composition alone, without providing information about the species from which each fragment was derived. Therefore, not only interspecies separation but also intraspecies separation can be achieved. Here, we have revealed specific genomic regions whose oligonucleotide compositions are distinct from the usual sequences of each insect genome, e.g., Mb-level structures found in the grasshopper Schistocerca americana. One aim of this study was to compare the genome characteristics of insects with those of vertebrates, especially humans, which are phylogenetically distant from insects. In recent years, humans have effectively become a "model organism" for which a large amount of information has been accumulated using a variety of cutting-edge, high-throughput technologies. Therefore, it is reasonable to use the abundant information from humans to study insect lineages. Specific Mb-length regions with distinct oligonucleotide compositions have also been observed previously in the human genome. These regions were enriched in transcription factor binding motifs (TFBSs) and were hypothesized to be involved in the three-dimensional arrangement of chromosomal DNA in interphase nuclei. The present study characterized the species-specific oligonucleotide compositions (i.e., genome signatures) in insect genomes and identified specific genomic regions with distinct oligonucleotide compositions.
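As a rough illustration of the input described in the abstract, the sketch below cuts a sequence into fixed-length fragments and computes a normalized oligonucleotide frequency vector for each. It is not the authors' pipeline: the function names `split_fragments` and `oligo_composition`, the default oligonucleotide length `k=4`, and the choice to skip k-mers containing ambiguous bases are all assumptions made for illustration, since the abstract does not state these details. Vectors of this kind are what an unsupervised method such as a self-organizing map would cluster without species labels.

```python
from collections import Counter
from itertools import product


def split_fragments(seq, size):
    """Cut seq into consecutive non-overlapping fragments of exactly `size` bases.

    A trailing piece shorter than `size` is dropped (an assumption of this sketch).
    """
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def oligo_composition(fragment, k=4):
    """Return the normalized k-mer frequency vector of one fragment.

    The vector has 4**k entries, ordered lexicographically over A, C, G, T.
    k-mers containing any other character (e.g. N) are ignored.
    """
    fragment = fragment.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


if __name__ == "__main__":
    # Toy sequence; real fragments in the study are 100 kb or 1 Mb long.
    toy = "ACGTACGTACGGTTAACCGG"
    for frag in split_fragments(toy, 10):
        vec = oligo_composition(frag, k=2)
        print(frag, len(vec), round(sum(vec), 6))
```

Each fragment yields one fixed-length vector regardless of its species of origin, which is why clustering on these vectors alone can separate both species and atypical regions within a single genome.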
Pages: 25
Related papers
50 in total
  • [1] Species-Specific Chemical Signatures in Scale Insect Honeydew
    Manpreet K. Dhami
    Robin Gardner-Gee
    Jeremy Van Houtte
    Silas G. Villas-Bôas
    Jacqueline R. Beggs
    [J]. Journal of Chemical Ecology, 2011, 37 : 1231 - 1241
  • [2] Species-Specific Chemical Signatures in Scale Insect Honeydew
    Dhami, Manpreet K.
    Gardner-Gee, Robin
    Van Houtte, Jeremy
    Villas-Boas, Silas G.
    Beggs, Jacqueline R.
    [J]. JOURNAL OF CHEMICAL ECOLOGY, 2011, 37 (11) : 1231 - 1241
  • [3] The Genome of the Great Gerbil Reveals Species-Specific Duplication of an MHCII Gene
    Nilsson, Pernille
    Solbakken, Monica H.
    Schmid, Boris, V
    Orr, Russell J. S.
    Lv, Ruichen
    Cui, Yujun
    Song, Yajun
    Zhang, Yujiang
    Baalsrud, Helle T.
    Torresen, Ole K.
    Stenseth, Nils Chr
    Yang, Ruifu
    Jakobsen, Kjetill S.
    Easterday, William Ryan
    Jentoft, Sissel
    [J]. GENOME BIOLOGY AND EVOLUTION, 2020, 12 (02) : 3832 - 3849
  • [4] Quantitative proteomics reveals tissue-specific, infection-induced and species-specific neutrophil protein signatures
    Sollberger, Gabriel
    Brenes, Alejandro J.
    Warner, Jordan
    Arthur, J. Simon C.
    Howden, Andrew J. M.
    [J]. SCIENTIFIC REPORTS, 2024, 14 (01)
  • [5] Mitochondrial genome anatomy and species-specific lifespan
    Lehmann, Gilad
    Budovsky, Arie
    Muradian, Khachik K.
    Fraifeld, Vadim E.
    [J]. REJUVENATION RESEARCH, 2006, 9 (02) : 223 - 226
  • [6] Insect repellents mediate species-specific olfactory behaviours in mosquitoes
    Ali Afify
    Christopher J. Potter
    [J]. Malaria Journal, 19
  • [7] Insect repellents mediate species-specific olfactory behaviours in mosquitoes
    Afify, Ali
    Potter, Christopher J.
    [J]. MALARIA JOURNAL, 2020, 19 (01)
  • [8] CG dinucleotide clustering is a species-specific property of the genome
    Glass, Jacob L.
    Thompson, Reid F.
    Khulan, Batbayar
    Figueroa, Maria E.
    Olivier, Emmanuel N.
    Oakley, Erin J.
    Van Zant, Gary
    Bouhassira, Eric E.
    Melnick, Ari
    Golden, Aaron
    Fazzari, Melissa J.
    Greally, John M.
    [J]. NUCLEIC ACIDS RESEARCH, 2007, 35 (20) : 6798 - 6807
  • [9] Genetic basis of species-specific genitalia reveals role in species diversification
    Fujisawa, Tomochika
    Sasabe, Masataka
    Nagata, Nobuaki
    Takami, Yasuoki
    Sota, Teiji
    [J]. SCIENCE ADVANCES, 2019, 5 (06)
  • [10] Single-cell RNA sequencing of Plasmodium vivax sporozoites reveals stage- and species-specific transcriptomic signatures
    Ruberto, Anthony A.
    Bourke, Caitlin
    Vantaux, Amelie
    Maher, Steven P.
    Jex, Aaron
    Witkowski, Benoit
    Snounou, Georges
    Mueller, Ivo
    [J]. PLOS NEGLECTED TROPICAL DISEASES, 2022, 16 (08) : e0010633