Affiliation:
Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol & Med Sci, Chiba 2778562, Japan
Morishita, Shinichi [1]
Ichikawa, Kazuki [1]
Affiliation:
Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol & Med Sci, Chiba 2778562, Japan
Myers, Eugene W. [2,3]
Affiliations:
Max Planck Inst Mol Cell Biol & Genet, D-01307 Dresden, Saxony, Germany
Ctr Syst Biol Dresden, D-01307 Dresden, Saxony, Germany
Affiliations:
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol & Med Sci, Chiba 2778562, Japan
Motivation: Long tandem repeat expansions of more than 1000 nt have been suggested to be associated with diseases, but they remain largely unexplored in individual human genomes because read lengths have been too short. New long-read sequencing technologies can produce single reads of 10 000 nt or more that span such repeat expansions, but these reads have high error rates of 10-20%, which complicates the detection of repetitive elements. Moreover, most traditional algorithms for finding tandem repeats are designed for short tandem repeats (< 1000 nt) and cannot handle the high error rate of long reads in a reasonable amount of time.
Results: Here, we report an efficient algorithm for this problem that takes advantage of the length of the repeat. A long tandem repeat has hundreds or thousands of approximate copies of the repeated unit, so despite the error rate, many short k-mers will be error-free in many copies of the unit. We exploited this characteristic in two steps. First, we estimate regions that could contain a tandem repeat by analyzing the k-mer frequency distributions of fixed-size windows across the target read. Second, we assemble the k-mers of each putative region into the consensus repeat unit by greedily traversing a de Bruijn graph. Experimental results indicated that the proposed algorithm largely outperformed Tandem Repeats Finder, a widely used program for finding tandem repeats, in terms of sensitivity.
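The two steps described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (kmer_counts, repeat_windows, greedy_unit), the default parameters, and the window-flagging criterion (the fraction of k-mer occurrences that recur at least min_count times within a window) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def repeat_windows(read, k=5, window=200, min_count=3, min_fraction=0.5):
    """Return start offsets of non-overlapping windows that look repetitive.

    Assumed criterion (not from the abstract): a window is flagged when at
    least min_fraction of its k-mer occurrences belong to k-mers that occur
    min_count or more times inside that window.
    """
    if window < k:
        raise ValueError("window must be at least k")
    hits = []
    for start in range(0, len(read) - window + 1, window):
        counts = kmer_counts(read[start:start + window], k)
        total = sum(counts.values())
        repeated = sum(c for c in counts.values() if c >= min_count)
        if repeated / total >= min_fraction:
            hits.append(start)
    return hits


def greedy_unit(region, k=5):
    """Greedily walk the de Bruijn graph of region to spell a repeat unit.

    Nodes are k-mers; from each node the walk moves to the most frequent
    successor k-mer. When the walk returns to its starting k-mer, the bases
    spelled along the cycle are returned as the consensus unit.
    """
    counts = kmer_counts(region, k)
    if not counts:
        return ""
    start = max(counts, key=counts.get)  # begin at the most frequent k-mer
    unit, current, seen = start, start, {start}
    while True:
        candidates = [current[1:] + base for base in "ACGT"]
        nxt = max(candidates, key=lambda kmer: counts.get(kmer, 0))
        if counts.get(nxt, 0) == 0:
            return unit  # dead end: no observed successor
        if nxt == start:
            # Cycle closed: drop the k-1 bases that overlap the start k-mer.
            return unit[:len(unit) - (k - 1)]
        if nxt in seen:
            return unit  # loop that does not close on the start k-mer
        seen.add(nxt)
        unit += nxt[-1]
        current = nxt
```

On an error-free array such as "ACGGTCATTG" repeated 30 times, greedy_unit returns a rotation of the 10-nt unit; a handful of substitutions leaves the result unchanged because erroneous k-mers have much lower counts than the true ones. A practical tool would need a larger k, handling of insertions and deletions, and a principled window statistic.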
Affiliations:
Grand Biosci, Beijing 102206, Peoples R China
Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Automat Sci & Engn, Xian, Peoples R China
Hu, Jiang
Wang, Zhuo
Affiliation:
Grand Biosci, Beijing 102206, Peoples R China
Sun, Zongyi
Affiliation:
Grand Biosci, Beijing 102206, Peoples R China
Hu, Benxia
Affiliation:
Chinese Acad Sci, Kunming Inst Zool, Key Lab Genet Evolut & Anim Models, Kunming 650223, Peoples R China
Ayoola, Adeola Oluwakemi
Affiliation:
Chinese Acad Sci, Kunming Inst Zool, Key Lab Genet Evolut & Anim Models, Kunming 650223, Peoples R China
Liang, Fan
Affiliation:
Grand Biosci, Beijing 102206, Peoples R China
Li, Jingjing
Affiliation:
Grand Biosci, Beijing 102206, Peoples R China
Sandoval, Jose R.
Affiliation:
Univ San Martin Porres, Fac Med, Ctr Invest Genet & Biol Mol CIGBM, Inst Invest, Lima 15102, Peru
Cooper, David N.
Affiliation:
Cardiff Univ, Inst Med Genet, Heath Pk, Cardiff CF14 4XN, England
Ye, Kai
Affiliation:
Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Automat Sci & Engn, Xian, Peoples R China
Ruan, Jue
Affiliation:
Chinese Acad Agr Sci, Guangdong Lab Lingnan Modern Agr, Genome Anal Lab, Minist Agr & Rural Affairs, Shenzhen Branch, Agr Genom Inst Shenzhen, Shenzhen 518120, Peoples R China
Xiao, Chuan-Le
Affiliation:
Sun Yat Sen Univ, Zhongshan Ophthalm Ctr, State Key Lab Ophthalmol, 7 Jinsui Rd, Guangzhou, Peoples R China
Wang, Depeng
Affiliation:
Grand Biosci, Beijing 102206, Peoples R China
Wu, Dong-Dong
Affiliations:
Chinese Acad Sci, Kunming Inst Zool, Key Lab Genet Evolut & Anim Models, Kunming 650223, Peoples R China
Chinese Acad Sci, Kunming Primate Res Ctr, Kunming 650107, Peoples R China
Chinese Acad Sci, Kunming Inst Zool, Natl Resource Ctr Nonhuman Primates, Natl Res Facil Phenotyp & Genet Anal Model Anim Pr, Kunming 650107, Peoples R China
Chinese Acad Sci, Kunming Inst Zool, Kunming Nat Hist Museum Zool, Kunming, Peoples R China
Wang, Sheng
Affiliations:
Chinese Acad Sci, Kunming Inst Zool, Key Lab Genet Evolut & Anim Models, Kunming 650223, Peoples R China
Chinese Acad Sci, Kunming Inst Zool, Yunnan Key Lab Biodivers Informat, Kunming, Peoples R China
Affiliation:
Yokohama City Univ, Grad Sch Med, Dept Human Genet, Kanazawa Ku, Fukuura 3-9, Yokohama, Kanagawa 2360004, Japan
Mitsuhashi, Satomi
Frith, Martin C.
Affiliations:
Natl Inst Adv Ind Sci & Technol, Artificial Intelligence Res Ctr, Koto Ku, 2-3-26 Aomi, Tokyo 1350064, Japan
Univ Tokyo, Grad Sch Frontier Sci, Kashiwa, Chiba, Japan
AIST, CBBD OIL, Shinjuku Ku, Tokyo, Japan
Mizuguchi, Takeshi
Affiliation:
Yokohama City Univ, Grad Sch Med, Dept Human Genet, Kanazawa Ku, Fukuura 3-9, Yokohama, Kanagawa 2360004, Japan
Miyatake, Satoko
Affiliation:
Yokohama City Univ, Grad Sch Med, Dept Human Genet, Kanazawa Ku, Fukuura 3-9, Yokohama, Kanagawa 2360004, Japan
Toyota, Tomoko
Affiliation:
Univ Occupat & Environm Hlth, Sch Med, Dept Neurol, Kitakyushu, Fukuoka, Japan
Adachi, Hiroaki
Affiliation:
Univ Occupat & Environm Hlth, Sch Med, Dept Neurol, Kitakyushu, Fukuoka, Japan
Oma, Yoko
Affiliation:
Saitama Med Univ, Fac Med, Dept Liberal Arts, Iruma, Saitama, Japan
Kino, Yoshihiro
Affiliation:
Meiji Pharmaceut Univ, Dept Bioinformat & Mol Neuropathol, Tokyo, Japan
Mitsuhashi, Hiroaki
Matsumoto, Naomichi
Affiliation:
Yokohama City Univ, Grad Sch Med, Dept Human Genet, Kanazawa Ku, Fukuura 3-9, Yokohama, Kanagawa 2360004, Japan
Affiliation:
RIKEN, Res Program Computat Sci, Res & Dev Grp Next Generat Integrated Living Matt, Fus Data & Anal Res & Dev Team, Yokohama, Kanagawa 2300045, Japan