SNP calling from RNA-seq data without a reference genome: identification, quantification, differential analysis and impact on the protein sequence

Cited by: 54
Authors
Lopez-Maestre, Helene [1, 2, 3, 4]
Brinza, Lilia [5]
Marchet, Camille [6, 7]
Kielbassa, Janice [8]
Bastien, Sylvere [1, 2, 3, 4]
Boutigny, Mathilde [1, 2, 3, 4]
Monnin, David [1, 2, 3]
El Filali, Adil [1, 2, 3]
Carareto, Claudia Marcia [9]
Vieira, Cristina [1, 2, 3, 4]
Picard, Franck [1, 2, 3]
Kremer, Natacha [1, 2, 3]
Vavre, Fabrice [1, 2, 3, 4]
Sagot, Marie-France [1, 2, 3, 4]
Lacroix, Vincent [1, 2, 3, 4]
Affiliations
[1] Univ Lyon, F-69000 Lyon, France
[2] Univ Lyon 1, F-69622 Villeurbanne, France
[3] CNRS, UMR5558, Lab Biometrie & Biol Evolut, F-69622 Villeurbanne, France
[4] EPI ERABLE Inria Grenoble, Rhone Alpes, France
[5] BIOASTER, PT Genom & Transcriptom, Lyon, France
[6] Univ Rennes, F-35000 Rennes, France
[7] IRISA, Equipe GenScale, Rennes, France
[8] Univ Lyon 1, Ctr Leon Berard, Synergie Lyon Canc, Lyon, France
[9] UNESP Sao Paulo State Univ, Dept Biol, Sao Paulo, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil; European Research Council (ERC)
Keywords
ALIGNMENT; TRANSCRIPTOME; POLYMORPHISM; OOGENESIS; EFFICIENT; VARIANTS; UNCOVERS; PROGRAM;
DOI
10.1093/nar/gkw655
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010 ; 081704 ;
Abstract
SNPs (Single Nucleotide Polymorphisms) are genetic markers whose precise identification is a prerequisite for association studies. Methods to identify them are currently well developed for model species, but they rely on the availability of a (good) reference genome and therefore cannot be applied to non-model species. They are also mostly tailored for whole genome (re-)sequencing experiments, whereas in many cases transcriptome sequencing can be used as a cheaper alternative that already makes it possible to identify SNPs located in transcribed regions. In this paper, we propose a method that identifies, quantifies and annotates SNPs without any reference genome, using RNA-seq data only. Individuals can be pooled prior to sequencing if not enough material is available from one individual. Using pooled human RNA-seq data, we clarify the precision and recall of our method and discuss them with respect to other methods that use a reference genome or an assembled transcriptome. We then experimentally validate the predictions of our method using RNA-seq data from two non-model species. The method can be used for any species to annotate SNPs and predict their impact on the protein sequence. We further make it possible to test for the association of the identified SNPs with a phenotype of interest.
Pages: 13