Efficient change-points detection for genomic sequences via cumulative segmented regression

Cited by: 1
Authors
Jia, Shengji [1, 2]
Shi, Lei [3]
Affiliations
[1] Shanghai Lixin Univ Accounting & Finance, Sch Stat & Math, Shanghai 201209, Peoples R China
[2] Shanghai Lixin Univ Accounting & Finance, Interdisciplinary Res Inst Data Sci, Shanghai 201209, Peoples R China
[3] Yunnan Univ Finance & Econ, Stat & Math Sch, Kunming 650221, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
NUMBER; SELECTION; MODELS; JUMP
DOI
10.1093/bioinformatics/btab685
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Knowing the number and the exact locations of multiple change points in genomic sequences serves several biological needs. The cumulative-segmented algorithm (cumSeg) was recently proposed as a computationally efficient approach to multiple change-point detection. It is based on a simple transformation of the data and gives results that are quite robust to model mis-specification. However, the errors are also accumulated in the transformed model, so heteroscedasticity and serial correlation appear. As a result, the variances of the estimated change points can differ considerably, even though all change-point locations should be equally important in the original genomic sequence.

Results: In this study, we develop two new change-point detection procedures within the framework of cumulative segmented regression. Simulations show that the proposed methods not only improve the efficiency of each change-point estimator substantially but also yield estimators with similar variability across all change points. By applying the proposed algorithms to Coriell and SNP genotyping data, we illustrate their performance in detecting copy number variations.
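The effect described under Motivation can be illustrated with a small simulation. The sketch below is not the authors' code and does not implement cumSeg or the proposed procedures; it only assumes a piecewise-constant mean with independent unit-variance Gaussian noise, and every name in it (cumulative_transform, piecewise_constant_mean, the chosen change points and levels) is hypothetical. It shows that cumulative sums turn level shifts into slope changes, and that the summed errors have a variance that grows along the sequence and are strongly correlated between neighbouring positions.

```python
import numpy as np

def cumulative_transform(y):
    """Cumulative sums of a sequence (the kind of transformation cumSeg builds on)."""
    return np.cumsum(np.asarray(y, dtype=float))

def piecewise_constant_mean(n, change_points, levels):
    """Mean vector equal to levels[k] between consecutive change points."""
    mean = np.empty(n)
    bounds = [0, *change_points, n]
    for k, level in enumerate(levels):
        mean[bounds[k]:bounds[k + 1]] = level
    return mean

# Illustrative settings only; these values are assumptions, not from the paper.
rng = np.random.default_rng(0)
n, change_points, levels = 200, [80, 140], [0.0, 1.0, -0.5]
mu = piecewise_constant_mean(n, change_points, levels)

# A jump in the mean becomes a change of slope in the cumulative mean:
# consecutive differences of the cumulative mean recover the original levels.
cum_mu = cumulative_transform(mu)
slopes = np.diff(cum_mu)

# Accumulated errors: simulate many noise sequences and sum along each one.
noise = rng.normal(size=(5000, n))
cum_noise = np.cumsum(noise, axis=1)

# Heteroscedasticity: the variance of the summed error grows with position.
var_by_position = cum_noise.var(axis=0)

# Serial correlation: neighbouring summed errors share almost all their terms.
adjacent_corr = np.corrcoef(cum_noise[:, -2], cum_noise[:, -1])[0, 1]

print("variance at first / last position:", var_by_position[0], var_by_position[-1])
print("correlation of last two cumulative errors:", adjacent_corr)
```

Under these assumptions, a change point late in the sequence is estimated against far noisier transformed data than an early one, which is the imbalance the abstract attributes to the transformed model.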
Pages: 311-317
Page count: 7