COSINE: non-seeding method for mapping long noisy sequences

Cited by: 3
Authors
Afshar, Pegah Tootoonchi [1]
Wong, Wing Hung [2, 3]
Affiliations
[1] Stanford Univ, Sch Engn, Dept Elect Engn, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Biomed Data Sci, Stanford, CA 94305 USA
Funding
National Institutes of Health (USA)
Keywords
FAST FOURIER-TRANSFORM; GENERATION;
DOI
10.1093/nar/gkx511
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Third-generation sequencing (TGS) technologies are highly promising, but their long, noisy reads are difficult to align using existing algorithms. Here, we present COSINE, a conceptually new method designed specifically for aligning long reads contaminated by a high level of errors. COSINE computes the context similarity of two stretches of nucleobases from the similarity of the distributions of their short k-mers (k = 3-4) along the sequences. The results on simulated and real data show that COSINE achieves high sensitivity and specificity under a wide range of read accuracies. When the error rate is high, COSINE can offer substantial advantages over existing alignment methods.
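To illustrate the general idea of comparing two sequence stretches by their short k-mer distributions, here is a minimal sketch. It is not the COSINE algorithm itself: the abstract does not specify the similarity measure or how position along the sequence is handled, so the choice of cosine similarity over whole-sequence k-mer counts is an assumption, and the function names `kmer_counts` and `kmer_cosine` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count all overlapping k-mers in seq (empty if len(seq) < k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_cosine(a: str, b: str, k: int = 3) -> float:
    """Cosine similarity between the k-mer count vectors of a and b.

    Assumption: cosine similarity stands in for the paper's
    'context similarity'; the published method may differ.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Only k-mers present in both sequences contribute to the dot product.
    dot = sum(ca[m] * cb[m] for m in ca.keys() & cb.keys())
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one sequence is shorter than k
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    read = "ACGTACGTTGCA"
    noisy = "ACGTACCTTGCA"  # one substitution relative to `read`
    print(kmer_cosine(read, read), kmer_cosine(read, noisy))
```

Because a single substitution changes at most k of the overlapping k-mers, a count-based comparison of this kind degrades gradually with errors instead of failing outright, which is the intuition behind avoiding exact seed matches for noisy reads.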
Pages: 13