Practical software for aligning ESTs to human genome

Authors
Ogasawara, J [1]
Morishita, S
Affiliations
[1] Univ Tokyo, Dept Comp Sci, Tokyo, Japan
[2] Univ Tokyo, Dept Complex Sci & Engn, Tokyo, Japan
DOI
Not available
Chinese Library Classification
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
There is a pressing need to align the growing set of expressed sequence tags (ESTs) to the newly sequenced human genome, which is still frequently revised, in order to provide biologists and medical scientists with fresh information. The problem is, however, complicated by the exon/intron structure of eukaryotic genes, misread nucleotides in ESTs, and millions of repetitive sequences in genomic sequences. To solve it, algorithms that use dynamic programming (DP) have been proposed, with space complexity O(N) and time complexity O(MN) for a genomic sequence of length M and an EST of length N, but in practice these algorithms require an enormous amount of processing time. In an effort to improve the computational efficiency of these classical DP algorithms, we develop software that fully utilizes a lookup table storing the positions at which each short subsequence occurs in the genomic sequence, allowing the efficient detection of the start and end points of an EST within a given DNA sequence and, subsequently, the prompt identification of exons and introns. In addition, high sensitivity and accuracy must be achieved by calculating the locations of all splice sites correctly for more ESTs while retaining high computational efficiency. This goal is hard to accomplish in practice, owing to misread nucleotides in ESTs and repetitive sequences in the genome, but we present a couple of heuristics that are effective in settling this issue. Experimental results have confirmed that our technique improves the overall computation time by orders of magnitude compared with common tools such as sim4 and BLAT, while attaining high sensitivity and accuracy against datasets of clean and documented genes. Consequently, our software is able to align about three million ESTs to a draft genome in less than one day, and all the information is available through the WWW at http://grl.gi.k.u-tokyo.ac.jp/.
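The lookup-table idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_lookup_table` and `locate_est`, the subsequence length `k`, and the rule of picking the shortest genomic span that can hold the whole EST are all assumptions made here for illustration. The sketch indexes every length-`k` subsequence of a genomic sequence, then uses the first and last `k` bases of an EST to estimate where it starts and ends in the genome.

```python
from collections import defaultdict


def build_lookup_table(genome, k):
    """Map each length-k subsequence to the sorted list of positions where it occurs."""
    table = defaultdict(list)
    for i in range(len(genome) - k + 1):
        table[genome[i:i + k]].append(i)
    return dict(table)


def locate_est(table, est, k):
    """Estimate the genomic span (start, end) covered by an EST.

    Hypothetical heuristic: look up the first and last k bases of the EST
    and return the shortest span that is at least as long as the EST
    (introns make the genomic span longer than the EST itself).
    Returns None when either end has no exact match in the table.
    """
    head_hits = table.get(est[:k], [])
    tail_hits = table.get(est[-k:], [])
    best = None
    for start in head_hits:
        for tail in tail_hits:
            end = tail + k  # end is exclusive
            if end - start < len(est):
                continue  # span too short to contain the EST
            if best is None or end - start < best[1] - best[0]:
                best = (start, end)
    return best


# Toy data: two exons separated by an intron, with flanking sequence.
exon1, intron, exon2 = "ACGTCA", "GTAAAAAG", "CCGGTT"
genome = "TTTT" + exon1 + intron + exon2 + "AAAA"
est = exon1 + exon2

table = build_lookup_table(genome, 4)
span = locate_est(table, est, 4)
```

Each table lookup takes constant expected time, so candidate start and end points are found without scanning the genome, and a more expensive alignment step would only need to run inside the returned span. Exact matching at both ends is a simplification: misread nucleotides or repetitive sequences at the ends of an EST would call for additional heuristics such as those the abstract mentions.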
Pages: 1 - 16
Number of pages: 16