Capacity and Expressiveness of Genomic Tandem Duplication

Cited by: 0
Authors
Jain, Siddharth [1]
Farnoud, Farzad [1]
Bruck, Jehoshua [1]
Affiliations
[1] CALTECH, Elect Engn, Pasadena, CA 91125 USA
Keywords
Expressiveness; tandem repeats; finite automata; square-free strings; REPEATS;
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The majority of the human genome consists of repeated sequences. An important type of repeat common in the human genome is the tandem repeat, where identical copies appear next to each other. For example, in the sequence AGTCTGTGC, TGTG is a tandem repeat, which may be generated from AGTCTGC by a tandem duplication of length 2. In this work, we investigate the possibility of generating a large number of sequences from a small initial string (called the seed) by tandem duplications of bounded length. Our results include exact capacity values for certain tandem duplication string systems with alphabet sizes 2, 3, and 4. In addition, motivated by the role of DNA sequences in expressing proteins via RNA and the genetic code, we define the expressiveness of a tandem duplication system as the feasibility of expressing arbitrary substrings. We then completely characterize the expressiveness of tandem duplication systems for general alphabet sizes and duplication lengths. Noting that a system with capacity equal to 1 is expressive, we prove that for an alphabet size of at least 4, the capacity is strictly smaller than 1, independent of the seed and the duplication lengths. The proof of this limit on the capacity (note that the genomic alphabet size is 4) is related to an interesting result by Axel Thue from 1906, which states that there exist arbitrarily long sequences with no tandem repeats (square-free sequences) for alphabet sizes of at least 3. Finally, our results illustrate that duplication lengths play a more significant role than the seed in generating a large number of sequences for these systems.
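The tandem duplication operation described in the abstract can be illustrated with a short Python sketch. This is not the authors' code: the function names `tandem_duplicate`, `descendants`, and `is_square_free` are hypothetical, indices are assumed to be 0-based, and the brute-force enumeration is practical only for short strings.

```python
def tandem_duplicate(s: str, i: int, k: int) -> str:
    """Copy the length-k substring starting at index i and insert the copy right after it."""
    if k <= 0 or i < 0 or i + k > len(s):
        raise ValueError("substring out of range")
    return s[: i + k] + s[i : i + k] + s[i + k :]


def descendants(seed: str, max_dup_len: int, max_len: int) -> set[str]:
    """All strings of length <= max_len reachable from the seed (the seed included)
    by tandem duplications of length at most max_dup_len (illustrative brute force)."""
    seen = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for s in frontier:
            for k in range(1, max_dup_len + 1):
                if len(s) + k > max_len:
                    break  # longer duplications would exceed the length cap
                for i in range(len(s) - k + 1):
                    t = tandem_duplicate(s, i, k)
                    if t not in seen:
                        seen.add(t)
                        next_frontier.append(t)
        frontier = next_frontier
    return seen


def is_square_free(s: str) -> bool:
    """True if s contains no tandem repeat, i.e. no substring of the form xx."""
    n = len(s)
    for k in range(1, n // 2 + 1):
        for i in range(n - 2 * k + 1):
            if s[i : i + k] == s[i + k : i + 2 * k]:
                return False
    return True


# The abstract's example: duplicating "TG" (length 2) in AGTCTGC.
print(tandem_duplicate("AGTCTGC", 4, 2))
```

Counting the strings that `descendants` returns for growing values of `max_len` gives an informal feel for how fast a system grows; the capacity results in the paper are defined and proved formally, which this sketch does not attempt to reproduce.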
Pages: 1946-1950
Number of pages: 5