Practical software for aligning ESTs to human genome

Authors
Ogasawara, J [1]
Morishita, S
Affiliations
[1] Univ Tokyo, Dept Comp Sci, Tokyo, Japan
[2] Univ Tokyo, Dept Complex Sci & Engn, Tokyo, Japan
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
There is a pressing need to align the growing set of expressed sequence tags (ESTs) to the newly sequenced human genome, which is still frequently revised, in order to provide biologists and medical scientists with fresh information. The problem is, however, complicated by the exon/intron structure of eukaryotic genes, misread nucleotides in ESTs, and millions of repetitive sequences in genomic sequences. To solve it, algorithms based on dynamic programming (DP) have been proposed, with space complexity O(N) and time complexity O(MN) for a genomic sequence of length M and an EST of length N, but in practice these algorithms require an enormous amount of processing time. In an effort to improve the computational efficiency of these classical DP algorithms, we develop software that fully utilizes a lookup table storing the positions at which each short subsequence occurs in the genomic sequence, allowing efficient detection of the start and end points of an EST within a given DNA sequence and, subsequently, prompt identification of exons and introns. In addition, high sensitivity and accuracy must be achieved by correctly calculating the locations of all splice sites for more ESTs while retaining high computational efficiency. This goal is hard to accomplish in practice, owing to misread nucleotides in ESTs and repetitive sequences in the genome, but we present a couple of heuristics that are effective in settling this issue. Experimental results confirm that our technique improves the overall computation time by orders of magnitude compared with common tools such as sim4 and BLAT, while attaining high sensitivity and accuracy on datasets of clean, documented genes. Consequently, our software is able to align about three million ESTs to a draft genome in less than one day, and all the information is available through the WWW at http://grl.gi.k.u-tokyo.ac.jp/.
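The lookup-table idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the subsequence length `k`, and the `max_span` cutoff are all assumptions made for illustration, and the sketch ignores misread nucleotides and repeats, which the abstract says require additional heuristics. It only shows how a table of k-mer positions lets one propose candidate start and end points for an EST without running dynamic programming over the whole genomic sequence.

```python
from collections import defaultdict


def build_kmer_index(genome: str, k: int) -> dict[str, list[int]]:
    """Map every length-k substring of the genome to its start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def candidate_spans(index, est: str, k: int, max_span: int) -> list[tuple[int, int]]:
    """Pair occurrences of the EST's first and last k-mer into (start, end) spans.

    A span is kept only if it is at least as long as the EST (introns can
    only lengthen the genomic footprint) and no longer than max_span,
    a hypothetical cutoff on total gene length.
    """
    starts = index.get(est[:k], [])
    ends = index.get(est[-k:], [])
    spans = []
    for s in starts:
        for e in ends:
            end = e + k  # exclusive end of the last k-mer in the genome
            if len(est) <= end - s <= max_span:
                spans.append((s, end))
    return spans


# Toy example: two exons separated by an intron, with short flanking sequence.
exon1, intron, exon2 = "ACGTTGCA", "GTAAGTCCCTTTAG", "GGATCCTA"
genome = "TT" + exon1 + intron + exon2 + "CC"
est = exon1 + exon2  # the spliced transcript

index = build_kmer_index(genome, k=4)
print(candidate_spans(index, est, k=4, max_span=1000))  # [(2, 32)]
```

In this toy case the single candidate span covers exactly the region from the start of the first exon to the end of the second. A fuller tool would then locate the exon boundaries inside each candidate span, for example by chaining further k-mer hits and running DP only on the small regions around the putative splice sites.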
Pages: 1 / 16
Number of pages: 16