Overlapping codes within protein-coding sequences

Cited by: 54
Authors
Itzkovitz, Shalev [1 ]
Hodis, Eran [1 ]
Segal, Eran [1 ,2 ]
Affiliations
[1] Weizmann Inst Sci, Dept Comp Sci & Appl Math, IL-76100 Rehovot, Israel
[2] Weizmann Inst Sci, Dept Mol Cell Biol, IL-76100 Rehovot, Israel
Funding
European Research Council
Keywords
RNA SECONDARY STRUCTURE; RESTRICTION ENZYMES; ESCHERICHIA-COLI; GENE-EXPRESSION; DNA-SEQUENCES; BACTERIAL; SELECTION; TARGETS; REGIONS; MICRORNAS;
DOI
10.1101/gr.105072.110
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Genomes encode multiple signals, raising the question of how these different codes are organized along the linear genome sequence. Within protein-coding regions, the redundancy of the genetic code can, in principle, allow for the overlapping encoding of signals in addition to the amino acid sequence, but it is not known to what extent genomes exploit this potential and, if so, for what purpose. Here, we systematically explore whether protein-coding regions accommodate overlapping codes, by comparing the number of occurrences of each possible short sequence within the protein-coding regions of over 700 species from viruses to plants, to the same number in randomizations that preserve amino acid sequence and codon bias. We find that coding regions across all phyla encode additional information, with bacteria carrying more information than eukaryotes. The detailed signals consist of both known and potentially novel codes, including position-dependent secondary RNA structure, bacteria-specific depletion of transcription and translation initiation signals, and eukaryote-specific enrichment of microRNA target sites. Our results suggest that genomes may have evolved to encode extensive overlapping information within protein-coding regions. [Supplemental material is available online at http://www.genome.org.]
Pages: 1582-1589
Page count: 8
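
The comparison described in the abstract (counting each short sequence in a coding region and comparing it with randomizations that preserve amino acid sequence and codon bias) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the standard genetic code, uppercase A/C/G/T input, and a simple null model that permutes synonymous codons within a single gene, which keeps that gene's protein sequence and codon counts fixed. The function names `shuffle_synonymous`, `count_kmers`, and `kmer_zscore` are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Standard genetic code in the canonical TCAG ordering (NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split a coding sequence into codons, dropping any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def translate(cds):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def shuffle_synonymous(cds, rng):
    """Permute codons among positions that encode the same amino acid.

    Assumed null model: the protein sequence and the gene's codon counts
    are unchanged, but the nucleotide context across codon boundaries is
    randomized.
    """
    codons = split_codons(cds)
    positions = defaultdict(list)
    for idx, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(idx)
    shuffled = list(codons)
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)


def count_kmers(seq, k):
    """Count all overlapping k-mers in seq, in every reading frame."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_zscore(cds, kmer, n_shuffles=200, seed=0):
    """Z-score of the observed k-mer count against the shuffled null.

    Negative values suggest depletion, positive values enrichment.
    Returns 0.0 when the null distribution has no variance.
    """
    rng = random.Random(seed)
    k = len(kmer)
    observed = count_kmers(cds, k)[kmer]
    null = [
        count_kmers(shuffle_synonymous(cds, rng), k)[kmer]
        for _ in range(n_shuffles)
    ]
    mean = sum(null) / len(null)
    var = sum((x - mean) ** 2 for x in null) / len(null)
    return (observed - mean) / var ** 0.5 if var > 0 else 0.0
```

Calling `kmer_zscore(cds, "GAATTC")` on a coding sequence would, under these assumptions, indicate whether that hexamer occurs more or less often than synonymous shuffling predicts. A genome-wide analysis would pool counts over many genes and correct for multiple testing, which this sketch omits.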