An optimal DNA segmentation based on the MDL principle

Cited by: 6
Authors
Szpankowski, W [1]
Ren, WH [1]
Szpankowski, L [1]
Affiliations
[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
DOI
10.1109/CSB.2003.1227402
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The biological world is highly stochastic as well as inhomogeneous in its behavior. The transitions between homogeneous and inhomogeneous regions of DNA, also known as change points, carry important biological information. Our goal is to employ rigorous methods of information theory to quantify structural properties of DNA sequences. In particular, we adopt the Stein-Ziv lemma to find an asymptotically optimal discriminant function that determines whether two DNA segments are generated by the same source while assuring exponentially small false positives. Then we apply the Minimum Description Length (MDL) principle to select the parameters of our segmentation algorithm. Finally, we perform extensive experimental work on human chromosome 9. After grouping A and G (purines) and T and C (pyrimidines), we discover change points between coding and noncoding regions as well as the beginning of a CpG island.
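The abstract does not spell out the discriminant function, so the following is only a minimal sketch of the general idea of change-point detection on a purine/pyrimidine sequence. It swaps in the weighted Jensen-Shannon divergence as a stand-in score; it is not the Stein-Ziv discriminant of the paper and performs no MDL parameter selection. All names (`to_binary`, `js_divergence`, `best_change_point`, `min_len`) are hypothetical.

```python
import math
from collections import Counter


def to_binary(seq):
    """Group A/G (purines) as 'R' and T/C (pyrimidines) as 'Y'; drop other symbols."""
    return "".join("R" if b in "AG" else "Y" for b in seq.upper() if b in "ACGT")


def entropy(counts, n):
    """Shannon entropy in bits of the empirical distribution given by counts."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c > 0)


def js_divergence(left, right):
    """Weighted Jensen-Shannon divergence (bits) between two non-empty segments.

    Assumption: this is an illustrative score, not the paper's discriminant.
    """
    n1, n2 = len(left), len(right)
    n = n1 + n2
    c1, c2 = Counter(left), Counter(right)
    pooled = entropy(c1 + c2, n)
    return pooled - (n1 / n) * entropy(c1, n1) - (n2 / n) * entropy(c2, n2)


def best_change_point(seq, min_len=1):
    """Return the split index (in the binary sequence) that maximizes the score.

    Raises ValueError if the sequence is shorter than 2 * min_len.
    """
    s = to_binary(seq)
    candidates = range(min_len, len(s) - min_len + 1)
    return max(candidates, key=lambda i: js_divergence(s[:i], s[i:]))


if __name__ == "__main__":
    toy = "AGAGAGAGAG" + "TCTCTCTCTC"  # purine-only run followed by pyrimidine-only run
    print(best_change_point(toy))
```

On the toy input the two halves have disjoint symbol sets, so the score peaks at the boundary between them. A real segmentation would also need a significance threshold and a rule for choosing window sizes, which is where the paper applies the MDL principle.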
Pages: 541-546
Number of pages: 6