A survey on identification and quantification of alternative polyadenylation sites from RNA-seq data

Cited by: 25
Authors
Chen, Moliang [1]
Ji, Guoli [1, 2]
Fu, Hongjuan [1]
Lin, Qianmin [3]
Ye, Congting [4]
Ye, Wenbin [1]
Su, Yaru [5]
Wu, Xiaohui [1, 6]
Affiliations
[1] Xiamen Univ, Dept Automat, Xiamen 361005, Fujian, Peoples R China
[2] Xiamen Res Inst, Xiamen, Peoples R China
[3] Xiamen Univ, Xiangan Hosp, Xiamen, Peoples R China
[4] Xiamen Univ, Coll Environm & Ecol, Xiamen, Peoples R China
[5] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou, Peoples R China
[6] Xiamen Res Inst, Natl Ctr Healthcare Big Data, Xiamen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
alternative polyadenylation; RNA-seq; 3' untranslated region; benchmark; predictive modeling; 3' UNTRANSLATED REGIONS; CHANGE-POINT MODEL; GENE-EXPRESSION; MESSENGER-RNAS; POLY(A) SITES; CLEAVAGE; REVEALS; WIDESPREAD; MECHANISMS; DYNAMICS
DOI
10.1093/bib/bbz068
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Alternative polyadenylation (APA) has been implicated in post-transcriptional regulation through its effects on mRNA abundance, stability, localization and translation, and it contributes considerably to transcriptome diversity and the regulation of gene expression. RNA-seq has become a routine approach for transcriptome profiling, generating an unprecedented amount of data that can be used to identify APA sites and quantify their usage. A number of computational approaches for identifying APA sites and/or dynamic APA events from RNA-seq data have emerged in the literature; they provide valuable yet preliminary results that should be refined to yield credible guidelines for the scientific community. In this review, we provide a comprehensive overview of the currently available computational approaches. We also conducted an objective benchmarking analysis, using RNA-seq data sets from different species (human, mouse and Arabidopsis) and simulated data sets, to systematically evaluate 11 representative methods. Our benchmarking study showed that the overall performance of all tools investigated is moderate, indicating that there is still considerable scope to improve the prediction of APA sites or dynamic APA events from RNA-seq data. In particular, prediction results from individual tools differ considerably, and only a limited number of predicted APA sites or genes are common among different tools. Accordingly, we offer some advice on how to assess the reliability of the obtained results. We also propose practical recommendations on which method suits which scenario, and we discuss implications and future directions for profiling APA from RNA-seq data.
Pages: 1261-1276
Number of pages: 16