RNA-seq data science: From raw data to effective interpretation

Cited by: 40
Authors
Deshpande, Dhrithi [1]
Chhugani, Karishma [1]
Chang, Yutong [1]
Karlsberg, Aaron [2]
Loeffler, Caitlin [3]
Zhang, Jinyang [4]
Muszynska, Agata [5, 6]
Munteanu, Viorel [7]
Yang, Harry [8]
Rotman, Jeremy [2]
Tao, Laura [9]
Balliu, Brunilda [9]
Tseng, Elizabeth [10]
Eskin, Eleazar [3, 9, 11]
Zhao, Fangqing [4, 12]
Mohammadi, Pejman [13]
Labaj, Pawel P. [5, 14]
Mangul, Serghei [2, 15]
Affiliations
[1] USC Alfred E Mann Sch Pharm & Pharmaceut Sci, Dept Pharmacol & Pharmaceut Sci, Los Angeles, CA USA
[2] USC Alfred E Mann Sch Pharm & Pharmaceut Sci, Dept Clin Pharm, Los Angeles, CA 90089 USA
[3] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA USA
[4] Chinese Acad Sci, Beijing Inst Life Sci, Beijing, Peoples R China
[5] Jagiellonian Univ, Malopolska Ctr Biotechnol, Krakow, Poland
[6] Silesian Tech Univ, Inst Automat Control Elect & Comp Sci, Gliwice, Poland
[7] Tech Univ Moldova, Dept Comp Informat & Microelect, Kishinev, Moldova
[8] Univ Calif Los Angeles, Dept Microbiol Immunol & Mol Genet, Los Angeles, CA USA
[9] UCLA, Dept Computat Med, CHS, David Geffen Sch Med, Los Angeles, CA USA
[10] Pacific Biosci, Menlo Pk, CA USA
[11] UCLA, Dept Human Genet, David Geffen Sch Med, Los Angeles, CA USA
[12] Univ Chinese Acad Sci, Hangzhou Inst Adv Study, Key Lab Syst Biol, Hangzhou, Peoples R China
[13] Scripps Res Inst, Dept Integrat Struct & Computat Biol, La Jolla, CA USA
[14] Boku Univ Vienna, Dept Biotechnol, Vienna, Austria
[15] USC Dornsife Coll Letters Arts & Sci, Dept Quantitat & Computat Biol, Los Angeles, CA 90089 USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
RNA sequencing; transcriptome quantification; differential gene expression; high throughput sequencing; read alignment; bioinformatics; circular RNAs; expression analysis; differential gene; sequencing data; quantification; alignment; reads; landscape; framework; transcriptome
DOI
10.3389/fgene.2023.997383
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Pages: 12