A probabilistic framework for aligning paired-end RNA-seq data

Cited by: 14
Authors
Hu, Yin [1]
Wang, Kai [1]
He, Xiaping [2]
Chiang, Derek Y. [2]
Prins, Jan F. [3]
Liu, Jinze [1]
Affiliations
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
[2] Univ N Carolina, Dept Genet, Chapel Hill, NC USA
[3] Univ N Carolina, Dept Comp Sci, Chapel Hill, NC USA
Funding
US National Science Foundation; US National Institutes of Health;
DOI
10.1093/bioinformatics/btq336
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: The RNA-seq paired-end read (PER) protocol samples transcript fragments longer than the sequencing capability of today's technology by sequencing just the two ends of each fragment. Deep sampling of the transcriptome using the PER protocol presents the opportunity to reconstruct the unsequenced portion of each transcript fragment using end reads from overlapping PERs, guided by the expected length of the fragment.
Methods: A probabilistic framework is described to predict the alignment to the genome of all PER transcript fragments in a PER dataset. Starting from possible exonic and spliced alignments of all end reads, our method constructs potential splicing paths connecting paired ends. An expectation maximization (EM) method assigns likelihood values to all splice junctions and assigns the most probable alignment for each transcript fragment.
Results: The method was applied to 2×35 bp PER datasets from cancer cell lines MCF-7 and SUM-102. PER fragment alignment increased the coverage 3-fold compared to the alignment of the end reads alone, and increased the accuracy of splice detection. The accuracy of the EM algorithm in the presence of alternative paths in the splice graph was validated by qRT-PCR experiments on eight exon-skipping alternative splicing events. PER fragment alignment with long-range splicing confirmed 8 out of 10 fusion events identified in the MCF-7 cell line in an earlier study by Maher et al. (2009).
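The EM step described in the abstract can be illustrated with a toy sketch. This is not the authors' likelihood model: the Gaussian fragment-length prior, its mean and standard deviation, the max-scaled junction weights, and the names `length_likelihood` and `em_assign_paths` are all assumptions made for illustration. Each fragment is given as a list of candidate splicing paths, each path being a tuple of junction identifiers plus the fragment length that path would imply.

```python
import math


def length_likelihood(length, mean=200.0, sd=30.0):
    """Gaussian density of an inferred fragment length (mean and sd are assumed values)."""
    z = (length - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_assign_paths(fragments, n_iter=50, floor=1e-6):
    """Toy EM-style loop over candidate splicing paths (hypothetical helper).

    fragments: one non-empty list per fragment; each candidate path is a
    tuple (junction_ids, inferred_fragment_length).
    Returns (junction weights, per-fragment responsibilities, best path index).
    """
    junctions = {j for paths in fragments for juncs, _ in paths for j in juncs}
    weight = {j: 1.0 for j in junctions}  # start with every junction equally plausible
    resp = []
    for _ in range(n_iter):
        # E-step: score each candidate path by its length likelihood times
        # the current weights of the junctions it crosses, then normalize.
        resp = []
        for paths in fragments:
            scores = []
            for juncs, length in paths:
                score = length_likelihood(length)
                for j in juncs:
                    score *= weight[j]
                scores.append(score)
            total = sum(scores)
            if total > 0.0:
                resp.append([s / total for s in scores])
            else:
                # Every candidate underflowed to zero; fall back to a uniform split.
                resp.append([1.0 / len(paths)] * len(paths))
        # M-step: a junction's weight is its expected usage, scaled so the
        # best-supported junction has weight 1 (floored to avoid exact zeros).
        expected = {j: 0.0 for j in junctions}
        for paths, r in zip(fragments, resp):
            for (juncs, _), p in zip(paths, r):
                for j in juncs:
                    expected[j] += p
        top = max(expected.values(), default=0.0)
        if top > 0.0:
            weight = {j: max(c / top, floor) for j, c in expected.items()}
    # Report the most probable candidate path for each fragment.
    best = [max(range(len(r)), key=r.__getitem__) for r in resp]
    return weight, resp, best


# Fragment 1 is ambiguous between junctions "A" and "B"; fragments 0 and 2
# can only be explained by "A", so the loop shifts support toward "A".
fragments = [
    [(("A",), 200)],
    [(("A",), 200), (("B",), 200)],
    [(("A",), 200)],
]
weight, resp, best = em_assign_paths(fragments)
```

In this sketch, evidence from unambiguous fragments raises the weight of the junctions they use, which in turn resolves ambiguous fragments toward the better-supported path. A real implementation would need a properly normalized probabilistic model and a convergence check instead of a fixed iteration count.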
Pages: 1950-1957
Number of pages: 8