mRIN for direct assessment of genome-wide and gene-specific mRNA integrity from large-scale RNA-sequencing data

Cited by: 42
Authors
Feng, Huijuan [1 ,2 ,3 ]
Zhang, Xuegong [1 ,2 ]
Zhang, Chaolin [3 ]
Affiliations
[1] Tsinghua Univ, MOE Key Lab Bioinformat, Beijing 100084, Peoples R China
[2] Tsinghua Univ, TNLIST, Bioinformat Div, Dept Automat, Beijing 100084, Peoples R China
[3] Columbia Univ, Dept Syst Biol, Dept Biochem & Mol Biophys, Ctr Motor Neuron Biol & Dis, New York, NY 10032 USA
Source
NATURE COMMUNICATIONS | 2015 / Volume 6
Funding
National Institutes of Health (USA);
Keywords
QUALITY-CONTROL; HUMAN BRAIN; SEQ DATA; ISOFORM EXPRESSION; DEGRADATION; DECAY; TRANSCRIPTOME; QUANTIFICATION; SITES; MOUSE;
DOI
10.1038/ncomms8816
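Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract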
The volume of RNA-Seq data sets in public repositories has been expanding exponentially, providing unprecedented opportunities to study gene expression regulation. Because degraded RNA samples, such as those collected from post-mortem tissues, can result in distinct expression profiles with potential biases, a particularly important step in mining these data is quality control. Here we develop a method named mRIN to directly assess mRNA integrity from RNA-Seq data at the sample and individual gene level. We systematically analyse large-scale RNA-Seq data sets of the human brain transcriptome generated by different consortia. Our analysis demonstrates that 3' bias resulting from partial RNA fragmentation in post-mortem tissues has a marked impact on global expression profiles, and that mRIN effectively identifies samples with different levels of mRNA degradation. Unexpectedly, this process has a reproducible and gene-specific component, and transcripts with different stabilities are associated with distinct functions and structural features reminiscent of mRNA decay in living cells.
Pages: 10