Aspects of coverage in medical DNA sequencing

Cited by: 22
Authors
Wendl, Michael C. [1]
Wilson, Richard K.
Affiliations
[1] Washington Univ, Genome Sequencing Ctr, St Louis, MO 63108 USA
[2] Washington Univ, Dept Genet, St Louis, MO 63108 USA
DOI
10.1186/1471-2105-9-239
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations.

Results: We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy, are derived, and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e., for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant.

Conclusion: Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study.
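
The redundancy figures quoted in the abstract can be sanity-checked with a rough calculation. The sketch below is not the paper's derivation: it assumes a simplified Poisson model in which reads fall independently on each allele, each allele receives an equal share of the total redundancy, and edge effects from finite read length are ignored. The function names `poisson_at_least` and `allele_coverage` are hypothetical, introduced only for this illustration.

```python
import math


def poisson_at_least(k: int, mean: float) -> float:
    """Return P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean**j / math.factorial(j) for j in range(k))
    return 1.0 - below


def allele_coverage(redundancy: float, min_hits: int, ploidy: int = 2) -> float:
    """Probability that every allele at a position is read at least min_hits times.

    Assumption (not from the source): total redundancy splits evenly across
    alleles, and the alleles are covered independently.
    """
    per_allele = redundancy / ploidy
    return poisson_at_least(min_hits, per_allele) ** ploidy


# Haploid reference point: 10x redundancy, each position seen at least once.
haploid_10x = allele_coverage(10.0, min_hits=1, ploidy=1)

# Normal sample: both alleles seen at least once at 21x.
normal_21x = allele_coverage(21.0, min_hits=1)

# Tumor sample: both alleles seen at least twice at 26x.
tumor_26x = allele_coverage(26.0, min_hits=2)

print(f"haploid 10x: {haploid_10x:.6f}")
print(f"normal  21x: {normal_21x:.6f}")
print(f"tumor   26x: {tumor_26x:.6f}")
```

Under these assumptions all three probabilities land close to one another, which is in line with the abstract's calibration of diploid requirements against haploid performance. The model also suggests one reason the tumor and normal values need not differ by a factor of 2: moving from "at least once" to "at least twice" multiplies the small per-allele miss probability by roughly (1 + mean), so a modest increase in redundancy is enough to compensate. The paper's own explanation may rest on a more detailed argument.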
Pages: 12
Source: BMC Bioinformatics, 9