A systematic comparison of normalization methods for eQTL analysis

Cited by: 7
Authors
Yang, Jiajun [1]
Wang, Dongyang [1]
Yang, Yanbo [1]
Yang, Wenqian [1]
Jin, Weiwei [1]
Niu, Xiaohui [1]
Gong, Jing [1]
Affiliations
[1] Huazhong Agr Univ, Coll Informat, Wuhan 430070, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
normalization; expression quantitative trait loci; eQTL; RNA-Seq data; gene expression; GENOME-WIDE ASSOCIATION; TRANS-EQTLS; IDENTIFICATION; DRIVERS; LOCI;
DOI
10.1093/bib/bbab193
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Expression quantitative trait loci (eQTL) analysis is widely used to interpret disease-associated loci by correlating genetic variants with the expression of specific genes. RNA sequencing (RNA-seq), which can quantify gene expression at the genome-wide level, is often used in eQTL identification. Since the choice of normalization method for gene expression has a substantial impact on downstream RNA-seq analysis, it is necessary to systematically compare the effects of these methods on eQTL identification. Here, by using RNA-seq and genotype data of four different cancers in The Cancer Genome Atlas (TCGA) database, we comprehensively evaluated the effect of eight commonly used normalization methods on eQTL identification. Our results showed that applying different methods could cause 20-30% differences in the final results of eQTL identification. Among these methods, COUNT, Median of Ratio (MED) and Trimmed Mean of M-values (TMM) generated similar results for identifying eQTLs, while Fragments Per Kilobase Million (FPKM) and RANK produced results that diverged more from those of the other methods. Based on accuracy and the receiver operating characteristic (ROC) curve, the TMM method was found to be the optimal method for normalizing gene expression data in eQTL analysis. In addition, we evaluated the performance of different pairwise combinations of these methods. Compared with single normalization methods, a combination of methods not only identified more cis-eQTLs but also improved ROC performance. Overall, this study provides a comprehensive comparison of normalization methods for identifying eQTLs from RNA-seq data and proposes practical recommendations for diverse scenarios.
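To make two of the normalization methods named in the abstract concrete, the following is a minimal NumPy sketch of FPKM and of median-of-ratios size factors. It is an illustration under stated assumptions, not the authors' pipeline: the function names `fpkm` and `median_of_ratios_size_factors`, the toy count matrix and the gene lengths are all hypothetical, and the median-of-ratios variant shown here uses only genes with nonzero counts in every sample.

```python
import numpy as np


def fpkm(counts, gene_lengths_bp):
    """FPKM for a genes x samples matrix of raw fragment counts."""
    counts = np.asarray(counts, dtype=float)
    # Gene length in kilobases, as a column so it broadcasts across samples.
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3
    # Library size of each sample in millions of fragments.
    per_million = counts.sum(axis=0, keepdims=True) / 1e6
    return counts / lengths_kb / per_million


def median_of_ratios_size_factors(counts):
    """Per-sample size factors from the median ratio to a geometric-mean reference."""
    counts = np.asarray(counts, dtype=float)
    # Assumption: keep only genes observed in every sample so the log is finite.
    expressed = (counts > 0).all(axis=1)
    log_counts = np.log(counts[expressed])
    # Log of the per-gene geometric mean across samples (the pseudo-reference).
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Median log ratio per sample, mapped back to the original scale.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))


# Toy data (hypothetical): sample 2 is sequenced twice as deeply as sample 1.
counts = np.array([[10, 20], [100, 200], [50, 100], [0, 5]])
gene_lengths_bp = np.array([1000, 2000, 500, 1000])

fpkm_values = fpkm(counts, gene_lengths_bp)
size_factors = median_of_ratios_size_factors(counts)
normalized = counts / size_factors  # depth-corrected counts
```

On this toy matrix the second size factor is twice the first, reflecting the doubled sequencing depth, whereas FPKM additionally rescales each gene by its length. That extra rescaling is one way the two methods can produce different expression values for the same samples.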
Pages: 13