Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes

Cited by: 72
Authors
Lin, Michael F. [1, 2]
Kheradpour, Pouya [1, 2]
Washietl, Stefan [2]
Parker, Brian J. [3]
Pedersen, Jakob S. [3]
Kellis, Manolis [1, 2, 4]
Affiliations
[1] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[3] Univ Copenhagen, Dept Biol, DK-2200 Copenhagen, Denmark
[4] Broad Inst, Cambridge, MA 02139 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
AMINO-ACID SITES; CODON-SUBSTITUTION MODELS; EXONIC SPLICING ENHANCERS; RNA SECONDARY STRUCTURE; FALSE DISCOVERY RATE; PURIFYING SELECTION; POSITIVE SELECTION; READING FRAMES; HUMAN GENES; SYNONYMOUS MUTATIONS
DOI
10.1101/gr.108753.110
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes, especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain approximately 2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape.
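The window-scanning idea in the abstract can be illustrated with a toy sketch. This is not the authors' method: the paper estimates synonymous substitution rates across a 29-species alignment, whereas the code below only counts naive synonymous codon differences against a reference sequence. The function names (`synonymous_diffs`, `low_synonymous_windows`) and the `max_diffs` threshold are hypothetical, introduced purely for illustration; only the nine-codon window size comes from the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame nucleotide string into codons, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def synonymous_diffs(ref, others):
    """For each reference codon, count aligned sequences whose codon differs
    from the reference but encodes the same amino acid (naive count, not a rate)."""
    ref_codons = codons(ref)
    counts = [0] * len(ref_codons)
    for seq in others:
        for i, (r, o) in enumerate(zip(ref_codons, codons(seq))):
            if r != o and r in CODON_TABLE and CODON_TABLE.get(o) == CODON_TABLE[r]:
                counts[i] += 1
    return counts


def low_synonymous_windows(ref, others, window=9, max_diffs=0):
    """Return start indices (in codons) of windows whose total synonymous
    difference count is at most max_diffs (hypothetical threshold)."""
    counts = synonymous_diffs(ref, others)
    return [
        start
        for start in range(len(counts) - window + 1)
        if sum(counts[start:start + window]) <= max_diffs
    ]


# Toy alignment: 12 alanine codons; the second sequence has synonymous
# changes (GCT -> GCC) only in the last three codons.
reference = "GCT" * 12
aligned = ["GCT" * 9 + "GCC" * 3]
hits = low_synonymous_windows(reference, aligned)
```

A simple count like this cannot distinguish real constraint from a lack of opportunity (for example, windows rich in methionine or tryptophan codons, which have no synonymous alternatives), which is one reason a model-based rate estimate is needed in practice.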
Pages: 1916-1928
Page count: 13