Flexible analysis of RNA-seq data using mixed effects models

Cited by: 47
Authors
Turro, Ernest [1, 2]
Astle, William J. [3 ]
Tavare, Simon [1 ]
Affiliations
[1] Univ Cambridge, Canc Res UK Cambridge Inst, Cambridge CB2 0RE, England
[2] Univ Cambridge, NHS Blood & Transplant, Dept Haematol, Cambridge CB2 0PT, England
[3] McGill Univ, Dept Epidemiol Biostat & Occupat Hlth, Montreal, PQ H3A 1A2, Canada
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
Keywords
DIFFERENTIAL EXPRESSION ANALYSIS; GENE;
DOI
10.1093/bioinformatics/btt624
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Most methods for estimating differential expression from RNA-seq are based on statistics that compare normalized read counts between treatment classes. Unfortunately, reads are in general too short to be mapped unambiguously to features of interest, such as genes, isoforms or haplotype-specific isoforms. There are methods for estimating expression levels that account for this source of ambiguity. However, the uncertainty is not generally accounted for in downstream analysis of gene expression experiments. Moreover, at the individual transcript level, it can sometimes be too large to allow useful comparisons between treatment groups.
Results: In this article we make two proposals that improve the power, specificity and versatility of expression analysis using RNA-seq data. First, we present a Bayesian method for model selection that accounts for read mapping ambiguities using random effects. This polytomous model selection approach can be used to identify many interesting patterns of gene expression and is not confined to detecting differential expression between two groups. For illustration, we use our method to detect imprinting, different types of regulatory divergence in cis and in trans and differential isoform usage, but many other applications are possible. Second, we present a novel collapsing algorithm for grouping transcripts into inferential units that exploits the posterior correlation between transcript expression levels. The aggregate expression levels of these units can be estimated with useful levels of uncertainty. Our algorithm can improve the precision of expression estimates when uncertainty is large with only a small reduction in biological resolution.
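Illustrative note on the second proposal: when reads cannot be assigned unambiguously between two transcripts, posterior draws of their expression levels tend to be strongly negatively correlated, so their sum is estimated far more precisely than either one alone. The sketch below shows that general idea on simulated draws. It is not the authors' algorithm: the function names (`pearson`, `collapse_once`), the correlation threshold of -0.5, the single-merge strategy and the simulated data are all assumptions made for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def collapse_once(samples, threshold=-0.5):
    """Merge the most negatively correlated pair of units, if any pair
    has correlation below `threshold` (an assumed, illustrative cutoff).

    samples: dict mapping unit name -> list of posterior draws.
    Returns a new dict, or the input unchanged if no pair qualifies.
    """
    names = sorted(samples)
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(samples[a], samples[b])
            if r < threshold and (best is None or r < best[0]):
                best = (r, a, b)
    if best is None:
        return samples
    _, a, b = best
    merged = {k: v for k, v in samples.items() if k not in (a, b)}
    # The aggregate unit's draws are the per-draw sums of its members.
    merged[a + "+" + b] = [x + y for x, y in zip(samples[a], samples[b])]
    return merged


# Simulated posterior draws (hypothetical): t1 and t2 share reads, so the
# total is well determined but the split between them is not; t3 is
# unambiguous and independent of the other two.
rng = random.Random(0)
n_draws = 2000
total = [100 + rng.gauss(0, 2) for _ in range(n_draws)]
share = [rng.uniform(0.2, 0.8) for _ in range(n_draws)]
draws = {
    "t1": [t * s for t, s in zip(total, share)],
    "t2": [t * (1 - s) for t, s in zip(total, share)],
    "t3": [50 + rng.gauss(0, 3) for _ in range(n_draws)],
}

collapsed = collapse_once(draws)
for name in sorted(collapsed):
    print(name, round(statistics.stdev(collapsed[name]), 2))
```

In this toy setting the merged unit `t1+t2` has a much smaller posterior standard deviation than `t1` or `t2` individually, at the cost of no longer distinguishing the two transcripts.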
Pages: 180-188
Page count: 9
Related papers
50 in total
  • [1] Kimma: flexible linear mixed effects modeling with kinship covariance for RNA-seq data
    Dill-McFarland, Kimberly A.
    Mitchell, Kiana
    Batchu, Sashank
    Segnitz, Richard Max
    Benson, Basilin
    Janczyk, Tomasz
    Cox, Madison S.
    Mayanja-Kizza, Harriet
    Boom, William Henry
    Benchek, Penelope
    Stein, Catherine M.
    Hawn, Thomas R.
    Altman, Matthew C.
    [J]. BIOINFORMATICS, 2023, 39 (05)
  • [2] RNA-Seq Data Analysis using Nonparametric Gaussian Process Models
    Thanh Nguyen
    Nahavandi, Saeid
    Creighton, Douglas
    Khosravi, Abbas
    [J]. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016 : 5087 - 5093
  • [3] Poisson-Tweedie mixed-effects model: A flexible approach for the analysis of longitudinal RNA-seq data
    Signorelli, Mirko
    Spitali, Pietro
    Tsonaka, Roula
    [J]. STATISTICAL MODELLING, 2021, 21 (06) : 520 - 545
  • [4] Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models
    Yu, Lianbo
    Fernandez, Soledad
    Brock, Guy N.
    [J]. BMC BIOINFORMATICS, 2020, 21 (01)
  • [5] Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models
    Lianbo Yu
    Soledad Fernandez
    Guy Brock
    [J]. BMC Bioinformatics, 21
  • [6] Bayesian Analysis of RNA-Seq Data Using a Family of Negative Binomial Models
    Zhao, Lili
    Wu, Weisheng
    Feng, Dai
    Jiang, Hui
    Nguyen, XuanLong
    [J]. BAYESIAN ANALYSIS, 2018, 13 (02) : 411 - 436
  • [7] lmerSeq: an R package for analyzing transformed RNA-Seq data with linear mixed effects models
    Brian E. Vestal
    Elizabeth Wynn
    Camille M. Moore
    [J]. BMC Bioinformatics, 23
  • [8] lmerSeq: an R package for analyzing transformed RNA-Seq data with linear mixed effects models
    Vestal, Brian E.
    Wynn, Elizabeth
    Moore, Camille M.
    [J]. BMC BIOINFORMATICS, 2022, 23 (01)
  • [9] An Efficient and Flexible Method for Deconvoluting Bulk RNA-Seq Data with Single-Cell RNA-Seq Data
    Sun, Xifang
    Sun, Shiquan
    Yang, Sheng
    [J]. CELLS, 2019, 8 (10)
  • [10] Analysis of clustered RNA-seq data
    Park, Hyunjin
    Lee, Seungyeoun
    Kim, Ye Jin
    Choi, Myung-Sook
    Park, Taesung
    [J]. INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2017, 19 (01) : 19 - 31