Accurate Estimation of Expression Levels of Homologous Genes in RNA-seq Experiments

Cited by: 18
Authors
Pasaniuc, Bogdan [1, 2]
Zaitlen, Noah [1, 2]
Halperin, Eran [3, 4, 5]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[3] Int Comp Sci Inst, Berkeley, CA 94704 USA
[4] Tel Aviv Univ, Mol Microbiol & Biotechnol Dept, IL-69978 Tel Aviv, Israel
[5] Tel Aviv Univ, Blavatnik Sch Comp Sci, IL-69978 Tel Aviv, Israel
Funding
Israel Science Foundation; US National Science Foundation
Keywords
algorithms; gene searching; genetic mapping; genetic variation; TRANSCRIPTOMES; REVEALS; GENOME; MOUSE;
DOI
10.1089/cmb.2010.0259
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Next-generation high-throughput sequencing (NGS) is poised to replace array-based technologies as the experiment of choice for measuring RNA expression levels. Several groups have demonstrated the power of this new approach (RNA-seq), making significant and novel contributions and simultaneously proposing methodologies for the analysis of RNA-seq data. In a typical experiment, millions of short sequences (reads) are sampled from RNA extracts and mapped back to a reference genome. The number of reads mapping to each gene is used as a proxy for its corresponding RNA concentration. A significant challenge in analyzing RNA expression of homologous genes is the large fraction of the reads that map to multiple locations in the reference genome. Currently, these reads are either dropped from the analysis, or a naive algorithm is used to estimate their underlying distribution. In this work, we present a rigorous alternative for handling the reads generated in an RNA-seq experiment within a probabilistic model for RNA-seq data; we develop maximum likelihood-based methods for estimating the model parameters. In contrast to previous methods, our model takes into account the fact that the DNA of the sequenced individual is not a perfect copy of the reference sequence. We show with both simulated and real RNA-seq data that our new method improves the accuracy and power of RNA-seq experiments.
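To make the multi-mapping problem concrete, the following is a minimal sketch of a generic expectation-maximization (EM) estimator that shares ambiguous reads among candidate genes in proportion to the current abundance estimates. It is not the authors' model: in particular it ignores sequence differences between the sequenced individual and the reference, which the abstract names as the paper's distinguishing feature. The function name `estimate_expression`, its parameters, and the toy data are assumptions made for illustration only.

```python
def estimate_expression(read_hits, gene_lengths, n_iter=200, tol=1e-10):
    """Generic EM sketch for reads that may map to several genes.

    read_hits: list of lists; each inner list names the genes a read maps to.
    gene_lengths: dict mapping gene name -> length (hypothetical input format).
    Returns a dict gene -> relative expression (length-normalized, sums to 1).
    """
    genes = sorted(gene_lengths)
    reads = [hits for hits in read_hits if hits]  # ignore unmapped reads
    if not reads:
        raise ValueError("no mapped reads")

    # alpha[g] = estimated fraction of reads that originate from gene g
    alpha = {g: 1.0 / len(genes) for g in genes}

    for _ in range(n_iter):
        expected = dict.fromkeys(genes, 0.0)
        # E-step: split each read among its candidate genes.
        # A read from gene g lands on a given position with probability 1/length.
        for hits in reads:
            weights = {g: alpha[g] / gene_lengths[g] for g in hits}
            total = sum(weights.values())
            if total == 0.0:
                continue
            for g, w in weights.items():
                expected[g] += w / total
        # M-step: new read fractions are the normalized expected counts.
        new_alpha = {g: expected[g] / len(reads) for g in genes}
        change = max(abs(new_alpha[g] - alpha[g]) for g in genes)
        alpha = new_alpha
        if change < tol:
            break

    # Convert read fractions into length-normalized relative expression.
    density = {g: alpha[g] / gene_lengths[g] for g in genes}
    norm = sum(density.values())
    return {g: density[g] / norm for g in genes}


if __name__ == "__main__":
    # Toy data: 6 reads unique to A, 2 unique to B, 4 mapping to both.
    toy_reads = [["A"]] * 6 + [["B"]] * 2 + [["A", "B"]] * 4
    print(estimate_expression(toy_reads, {"A": 100, "B": 100}))
```

On the toy data, the two genes have equal length, so the estimates converge toward 0.75 for A and 0.25 for B: the four ambiguous reads are shared in the same 3:1 ratio as the uniquely mapped ones. Dropping the ambiguous reads would give the same ratio here, but the two approaches diverge once gene lengths or the share of unique sequence differ between homologs.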
Pages: 459-468
Page count: 10
Related papers
50 in total
  • [1] Accurate Estimation of Expression Levels of Homologous Genes in RNA-seq Experiments
    Pasaniuc, Bogdan
    Zaitlen, Noah
    Halperin, Eran
    RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, PROCEEDINGS, 2010, 6044 : 397 - +
  • [2] ISOFORM ABUNDANCE INFERENCE PROVIDES A MORE ACCURATE ESTIMATION OF GENE EXPRESSION LEVELS IN RNA-SEQ
    Wang, Xi
    Wu, Zhengpeng
    Zhang, Xuegong
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2010, 8 : 177 - 192
  • [3] Prediction of alternative isoforms from exon expression levels in RNA-Seq experiments
    Richard, Hugues
    Schulz, Marcel H.
    Sultan, Marc
    Nuernberger, Asja
    Schrinner, Sabine
    Balzereit, Daniela
    Dagand, Emilie
    Rasche, Axel
    Lehrach, Hans
    Vingron, Martin
    Haas, Stefan A.
    Yaspo, Marie-Laure
    NUCLEIC ACIDS RESEARCH, 2010, 38 (10) : e112
  • [4] Analysis and Differential Expression of Primo Genes Using RNA-Seq and qRT-PCR Experiments
    Shin, Jun-Young
    Ji, Jong-Ok
    Choi, Sang-Heon
    Choi, Da-Woon
    An, Ye-Jin
    Seo, Jae-Hyeok
    Choi, Jong-Gu
    Rho, Min-Suk
    Lee, Ji Yoon
    Yeo, Sujung
    Lee, Sang-Suk
    OXYGEN TRANSPORT TO TISSUE XLI, 2020, 1232 : 393 - 399
  • [5] The Utility of Shallow RNA-Seq for Documenting Differential Gene Expression in Genes with High and Low Levels of Expression
    Atallah, Joel
    Plachetzki, David C.
    Jasper, W. Cameron
    Johnson, Brian R.
    PLOS ONE, 2013, 8 (12):
  • [6] Statistical methods for identifying differentially expressed genes in RNA-Seq experiments
    Zhide Fang
    Jeffrey Martin
    Zhong Wang
    Cell & Bioscience, 2
  • [7] The RNA-Seq approach to studying the expression of mosquito mitochondrial genes
    Neira-Oviedo, M.
    Tsyganov-Bodounov, A.
    Lycett, G. J.
    Kokoza, V.
    Raikhel, A. S.
    Krzywinski, J.
    INSECT MOLECULAR BIOLOGY, 2011, 20 (02) : 141 - 152
  • [8] TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads
    Nariai, Naoki
    Kojima, Kaname
    Mimori, Takahiro
    Sato, Yukuto
    Kawai, Yosuke
    Yamaguchi-Kabata, Yumi
    Nagasaki, Masao
    BMC GENOMICS, 2014, 15
  • [9] Evaluation of Normalization Methods for RNA-Seq Gene Expression Estimation
    Wu, Po-Yen
    Phan, John H.
    Zhou, Fengfeng
    Wang, May D.
    2011 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS, 2011, : 50 - 57
  • [10] TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads
    Naoki Nariai
    Kaname Kojima
    Takahiro Mimori
    Yukuto Sato
    Yosuke Kawai
    Yumi Yamaguchi-Kabata
    Masao Nagasaki
    BMC Genomics, 15