Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity

Cited by: 65
Authors
Kadota, Koji [1]
Nakai, Yuji [1]
Shimizu, Kentaro [1]
Affiliations
[1] Univ Tokyo, Grad Sch Agr & Life Sci, Bunkyo Ku, Tokyo 1138657, Japan
Keywords
PROBE LEVEL DATA; OLIGONUCLEOTIDE ARRAYS; MICROARRAY EXPERIMENTS; HOOK-CALIBRATION; MODEL; BIOCONDUCTOR; ACCURACY
DOI
10.1186/1748-7188-4-7
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility.
Results: We compared eight conventional methods for ranking genes: weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT), each combined with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets were evaluated on the basis of the area under the receiver operating characteristic curve (AUC) as a measure for both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: the percentages of overlapping genes (POGs) across different sites for the FC-based methods were higher overall than those for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods irrespective of the choice of preprocessing algorithm.
Conclusion: Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
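The two evaluation measures named in the abstract, AUC and POG, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `pog` and `auc`, the gene identifiers, and the toy scores are hypothetical, and the sketch assumes that POG is the share of genes common to the top-N entries of two ranked lists and that AUC is computed from pairwise comparisons of scores between known DEGs and non-DEGs. The exact definitions used in the paper may differ, for example in how the direction of regulation is handled.

```python
from typing import Sequence


def pog(ranked_a: Sequence[str], ranked_b: Sequence[str], top_n: int) -> float:
    """Percentage of genes shared by the top_n entries of two ranked gene lists.

    Assumed definition: 100 * |top_n(A) & top_n(B)| / top_n.
    """
    if top_n <= 0:
        raise ValueError("top_n must be positive")
    shared = set(ranked_a[:top_n]) & set(ranked_b[:top_n])
    return 100.0 * len(shared) / top_n


def auc(scores: Sequence[float], is_deg: Sequence[bool]) -> float:
    """AUC as the fraction of (DEG, non-DEG) pairs in which the DEG scores higher.

    Ties count as half a win. Higher scores are assumed to mean
    stronger evidence of differential expression.
    """
    pos = [s for s, d in zip(scores, is_deg) if d]
    neg = [s for s, d in zip(scores, is_deg) if not d]
    if not pos or not neg:
        raise ValueError("need at least one DEG and one non-DEG")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ranked gene lists from two test sites (best candidate first).
site1 = ["g1", "g2", "g3", "g4"]
site2 = ["g2", "g1", "g5", "g3"]
print(pog(site1, site2, top_n=2))  # both top-2 lists contain g1 and g2

# Hypothetical ranking scores with known DEG labels.
print(auc([3.0, 2.0, 1.0, 0.5], [True, False, True, False]))
```

Under these assumptions, a ranking method is more reproducible when `pog` stays high across sites, and more sensitive and specific when `auc` is close to 1.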
Pages: 7
Related papers
50 in total
  • [31] Bootstrapping of gene-expression data improves and controls the false discovery rate of differentially expressed genes
    Meuwissen, THE
    Goddard, ME
    GENETICS SELECTION EVOLUTION, 2004, 36 (02) : 191 - 205
  • [32] Prioritization of differentially expressed genes through integrating public expression data
    Xu, W.
    Li, S.
    Zhang, Z.
    Hu, J.
    Zhao, Y.
    ANIMAL GENETICS, 2019, 50 (06) : 726 - 732
  • [33] A Comparision Between Methods for Generating Differentially Expressed Genes from Microarray Data for Prediction of Disease
    Dasgupta, Srirupa
    Saha, Goutam
    Mondal, Ritwik
    Pal, Rajat Kumar
    Chanda, Amitabha
    2015 THIRD INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION, CONTROL AND INFORMATION TECHNOLOGY (C3IT), 2015,
  • [34] Comparison of methods for identifying differentially expressed genes across multiple conditions from microarray data
    Tan, Yuande
    Liu, Yin
    BIOINFORMATION, 2011, 7 (08) : 400 - 404
  • [35] A COMPARISON OF STATISTICAL METHODS FOR DETECTING DIFFERENTIALLY EXPRESSED GENES FROM RNA-SEQ DATA
    Kvam, Vanessa M.
    Lu, Peng
    Si, Yaqing
    AMERICAN JOURNAL OF BOTANY, 2012, 99 (02) : 248 - 256
  • [36] Statistical methods on detecting differentially expressed genes for RNA-seq data
    Chen, Zhongxue
    Liu, Jianzhong
    Ng, Hon Keung Tony
    Nadarajah, Saralees
    Kaufman, Howard L.
    Yang, Jack Y.
    Deng, Youping
    BMC SYSTEMS BIOLOGY, 2011, 5
  • [37] Gene Expression Profiling Identifies Differentially Expressed Genes in Cardiac Allograft Vasculopathy
    Colvin-Adams, Monica
    Harcourt, Nonyelum
    Zhang, Ying
    Mitchell, Adam
    Liao, Kenneth
    Beckman, Kenneth
    CIRCULATION, 2014, 130
  • [38] Multivariate Method for Inferential Identification of Differentially Expressed Genes in Gene Expression Experiments
    Pablo Acosta, Juan
    Restrepo, Silvia
    Henao, Juan David
    Lopez-Kleine, Liliana
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2019, 26 (08) : 866 - 874
  • [39] Specific gene expression in pancreatic β-cells: Cloning and characterization of differentially expressed genes
    Arava, Y
    Adamsky, K
    Ezerzer, C
    Ablamunits, V
    Walker, MD
    DIABETES, 1999, 48 (03) : 552 - 556
  • [40] Uncovering differentially expressed pathways with protein interaction and gene expression data
    Qiu, Yu-Qing
    Zhang, Shihua
    Zhang, Xiang-Sun
    OPTIMIZATION AND SYSTEMS BIOLOGY, PROCEEDINGS, 2008, 9 : 74 - 82