An effective approach for analyzing "prefinished" genomic sequence data

Authors
Kuehl, PM
Weisemann, JM
Touchman, JW
Green, ED
Boguski, MS [1]
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
[2] NIH, Natl Human Genome Res Inst, Genome Technol Branch, Bethesda, MD 20892 USA
[3] Univ Maryland, Dept Mol & Cell Biol, Baltimore, MD 21201 USA
DOI
Not available
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Ongoing efforts to sequence the human genome are already generating large amounts of data, with substantial increases anticipated over the next few years. In most cases, a shotgun sequencing strategy is being used, which rapidly yields most of the primary sequence in incompletely assembled sequence contigs ("prefinished" sequence) and more slowly produces the final, completely assembled sequence ("finished" sequence). Thus, in general, prefinished sequence is produced in excess of finished sequence, and this trend is certain to continue and even accelerate over the next few years. Even at a prefinished stage, genomic sequence represents a rich source of important biological information that is of great interest to many investigators. However, analyzing such data is a challenging and daunting task, both because of its sheer volume and because it can change on a day-by-day basis. To facilitate the discovery and characterization of genes and other important elements within prefinished sequence, we have developed an analytical strategy and system that uses readily available software tools in new combinations. Implementation of this strategy for the analysis of prefinished sequence data from human chromosome 7 has demonstrated that this is a convenient, inexpensive, and extensible solution to the problem of analyzing the large amounts of preliminary data being produced by large-scale sequencing efforts. Our approach is accessible to any investigator who wishes to assimilate additional information about particular sequence data en route to developing richer annotations of a finished sequence.
Pages: 189-194
Number of pages: 6