Tandem repeat copy-number variation in protein-coding regions of human genes

Cited by: 41
Authors
O'Dushlaine, CT
Edwards, RJ
Park, SD
Shields, DC
Affiliations
[1] Royal Coll Surgeons Ireland, Dept Clin Pharmacol, Bioinformat Core, Dublin 2, Ireland
[2] Royal Coll Surgeons Ireland, Inst Biopharmaceut Sci, Dublin 2, Ireland
DOI
10.1186/gb-2005-6-8-r69
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Tandem repeat variation in protein-coding regions will alter protein length and may introduce frameshifts. Tandem repeat variants are associated with variation in pathogenicity in bacteria and with human disease. We characterized tandem repeat polymorphism in human proteins, using the UniGene database, and tested whether these were associated with host defense roles.
Results: Protein-coding tandem repeat copy-number polymorphisms were detected in 249 tandem repeats found in 218 UniGene clusters; observed length differences ranged from 2 to 144 nucleotides, with unit copy lengths ranging from 2 to 57. This corresponded to 1.59% (218/13,749) of proteins investigated carrying detectable polymorphisms in the copy-number of protein-coding tandem repeats. We found no evidence that tandem repeat copy-number polymorphism was significantly elevated in defense-response proteins (p = 0.882). An association with the Gene Ontology term 'protein-binding' remained significant after covariate adjustment and correction for multiple testing. Combining this analysis with previous experimental evaluations of tandem repeat polymorphism, we estimate the approximate mean frequency of tandem repeat polymorphisms in human proteins to be 6%. Because 13.9% of the polymorphisms were not a multiple of three nucleotides, up to 1% of proteins may contain frameshifting tandem repeat polymorphisms.
Conclusion: Around 1 in 20 human proteins are likely to contain tandem repeat copy-number polymorphisms within coding regions. Such polymorphisms are not more frequent among defense-response proteins; their prevalence among protein-binding proteins may reflect lower selective constraints on their structural modification. The impact of frameshifting and longer copy-number variants on protein function and disease merits further investigation.
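The percentages quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and multiplying the 6% estimate by the 13.9% share is one plausible reading of how the "up to 1%" frameshift figure arises, not a calculation the abstract spells out.

```python
# Figures taken directly from the abstract.
polymorphic_clusters = 218        # UniGene clusters with a detected polymorphism
proteins_investigated = 13_749    # proteins examined
estimated_polymorphic_fraction = 0.06    # combined estimate: about 6% of proteins
non_multiple_of_three_fraction = 0.139   # 13.9% of polymorphisms


def is_frame_preserving(length_difference_nt: int) -> bool:
    """A length change keeps the reading frame only if it is a multiple of 3."""
    return length_difference_nt % 3 == 0


# Share of investigated proteins with a detected polymorphism, in percent.
detected_pct = 100 * polymorphic_clusters / proteins_investigated

# Assumption: the 13.9% share applies uniformly to the 6% estimate.
frameshift_fraction = estimated_polymorphic_fraction * non_multiple_of_three_fraction

print(f"Detected polymorphic proteins: {detected_pct:.2f}%")
print(f"Estimated frameshifting proteins: {100 * frameshift_fraction:.2f}%")
print("144 nt difference preserves frame:", is_frame_preserving(144))
print("2 nt difference preserves frame:", is_frame_preserving(2))
```

Under that assumption the product comes out a little below 1%, consistent with the abstract's "up to 1%", and the 6% estimate is roughly the "1 in 20" of the conclusion.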
Pages: 12
Journal: Genome Biology, 6