Modeling and analysis of RNA-seq data: a review from a statistical perspective

Cited by: 39
Authors
Li, Wei Vivian [1]
Li, Jingyi Jessica [1, 2]
Affiliations
[1] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
Funding
National Science Foundation (USA)
Keywords
RNA-seq; statistical modeling; differentially expressed genes; alternatively spliced exons; isoform reconstruction and quantification
DOI
10.1007/s40484-018-0144-7
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Since their invention, next-generation RNA sequencing (RNA-seq) technologies have become a powerful tool for studying the presence and quantity of RNA molecules in biological samples and have revolutionized transcriptomic studies. The analysis of RNA-seq data at four different levels (samples, genes, transcripts, and exons) involves multiple statistical and computational questions, some of which remain challenging to date.
Results: We review RNA-seq analysis tools at the sample, gene, transcript, and exon levels from a statistical perspective. We also highlight the biological and statistical questions of greatest practical importance.
Conclusions: The development of statistical and computational methods for analyzing RNA-seq data has advanced significantly in the past decade. However, methods developed to answer the same biological question often rely on diverse statistical models and perform differently under different scenarios. This review discusses and compares multiple commonly used statistical models with regard to their assumptions, in the hope of helping users select appropriate methods as needed and of assisting developers in future method development.
Pages: 195-209
Page count: 15
Related papers
50 items in total
  • [21] Parametric analysis of RNA-seq expression data
    Konishi, Tomokazu
    [J]. GENES TO CELLS, 2016, 21 (06) : 639 - 647
  • [22] RNA-Seq UD: A bioinformatics plattform for RNA-Seq analysis
    Ramirez, Miguel
    Alejandro Rojas-Quintero, Cristian
    Enrique Vera-Parra, Nelson
    [J]. 2015 10TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI), 2015,
  • [23] Identification of CNAs from RNA-Seq data
    Iwamoto, Eisuke
    Sanada, Masashi
    Yasuda, Takahiko
    [J]. CANCER SCIENCE, 2022, 113 : 1446 - 1446
  • [24] Modeling Alternative Splicing Variants from RNA-Seq Data with Isoform Graphs
    Beretta, Stefano
    Bonizzoni, Paola
    Della Vedova, Gianluca
    Pirola, Yuri
    Rizzi, Raffaella
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2014, 21 (01) : 16 - 40
  • [25] A COMPARISON OF STATISTICAL METHODS FOR DETECTING DIFFERENTIALLY EXPRESSED GENES FROM RNA-SEQ DATA
    Kvam, Vanessa M.
    Lu, Peng
    Si, Yaqing
    [J]. AMERICAN JOURNAL OF BOTANY, 2012, 99 (02) : 248 - 256
  • [26] IQML: A Robust Statistical Approach for Isoform Level Quantification from RNA-Seq Data
    Mondal, Pronoy Kanti
    Chatterjee, Raghunath
    Mukhopadhyay, Indranil
    [J]. GENETIC EPIDEMIOLOGY, 2016, 40 (07) : 653 - 654
  • [27] RISQ: A novel robust statistical approach for isoform quantification from RNA-seq data
    Mondal, Pronoy Kanti
    Chatterjee, Raghunath
    Mukhopadhyay, Indranil
    [J]. HUMAN GENOMICS, 2018, 12
  • [28] Systematically evaluating interfaces for RNA-seq analysis from a life scientist perspective
    Poplawski, Alicia
    Marini, Federico
    Hess, Moritz
    Zeller, Tanja
    Mazur, Johanna
    Binder, Harald
    [J]. BRIEFINGS IN BIOINFORMATICS, 2016, 17 (02) : 213 - 223
  • [29] Modeling RNA degradation for RNA-Seq with applications
    Wan, Lin
    Yan, Xiting
    Chen, Ting
    Sun, Fengzhu
    [J]. BIOSTATISTICS, 2012, 13 (04) : 734 - 747
  • [30] Statistical methods on detecting differentially expressed genes for RNA-seq data
    Chen, Zhongxue
    Liu, Jianzhong
    Ng, Hon Keung Tony
    Nadarajah, Saralees
    Kaufman, Howard L.
    Yang, Jack Y.
    Deng, Youping
    [J]. BMC SYSTEMS BIOLOGY, 2011, 5