Biological assessment of robust noise models in microarray data analysis

Cited by: 23
Authors
Posekany, A. [1 ]
Felsenstein, K. [2 ]
Sykacek, P. [1 ]
Affiliations
[1] Univ Nat Resources & Life Sci, Dept Biotechnol, Chair Bioinformat, A-1180 Vienna, Austria
[2] Vienna Univ Technol, Dept Stat, A-1040 Vienna, Austria
Keywords
DIFFERENTIALLY EXPRESSED GENES; NONPARAMETRIC METHODS; NORMALIZATION; TRANSCRIPTOME; MECHANISMS; ONTOLOGY; OBESITY; MICE; TOOL;
DOI
10.1093/bioinformatics/btr018
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Although several recently proposed analysis packages for microarray data can cope with heavy-tailed noise, many applications rely on Gaussian assumptions. Gaussian noise models foster computational efficiency. This comes, however, at the expense of increased sensitivity to outlying observations. Assessing potential insufficiencies of Gaussian noise in microarray data analysis is thus important and of general interest. Results: To this end, we propose assessing different noise models on a large number of microarray experiments. The goodness of fit of noise models is quantified by a hierarchical Bayesian analysis of variance model, which predicts normalized expression values as a mixture of a Gaussian density and t-distributions with adjustable degrees of freedom. Inference of differentially expressed genes is taken into consideration at a second mixing level. To attain far-reaching validity, our investigations cover a wide range of analysis platforms and experimental settings. As the most striking result, we find that, irrespective of the chosen preprocessing and normalization method, a heavy-tailed noise model is a better fit than a simple Gaussian in all experiments. Further investigations revealed that an appropriate choice of noise model has a considerable influence on biological interpretations drawn at the level of inferred genes and gene ontology terms. We conclude from our investigation that neglecting the overdispersed noise in microarray data can mislead scientific discovery and suggest that the convenience of Gaussian-based modelling should be replaced by non-parametric approaches or other methods that account for heavy-tailed noise.
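The contrast between Gaussian and heavy-tailed noise described in the abstract can be illustrated with a small sketch. This is not the authors' hierarchical Bayesian model; it only compares the log-likelihood of a fixed set of residuals under a Gaussian density and under a Student-t density. The residual values, the fixed location and scale (0 and 1), and the choice of 3 degrees of freedom are all assumptions made for illustration.

```python
import math


def gaussian_logpdf(x, mu, sigma):
    """Log density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(x, mu, sigma, nu):
    """Log density of a location-scale Student-t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1.0) / 2.0 * math.log1p(z * z / nu)
    )


def log_likelihood(logpdf, data, **params):
    """Sum of pointwise log densities over the data."""
    return sum(logpdf(x, **params) for x in data)


# Hypothetical residuals (illustrative only): seven well-behaved values
# plus a single outlying observation.
clean = [-0.3, 0.1, 0.2, -0.1, 0.0, 0.4, -0.2]
with_outlier = clean + [6.0]

# Location, scale and degrees of freedom are fixed by assumption, not fitted.
ll_gauss_clean = log_likelihood(gaussian_logpdf, clean, mu=0.0, sigma=1.0)
ll_t_clean = log_likelihood(student_t_logpdf, clean, mu=0.0, sigma=1.0, nu=3.0)
ll_gauss_outlier = log_likelihood(gaussian_logpdf, with_outlier, mu=0.0, sigma=1.0)
ll_t_outlier = log_likelihood(student_t_logpdf, with_outlier, mu=0.0, sigma=1.0, nu=3.0)

print("clean data:   Gaussian", round(ll_gauss_clean, 2), " t", round(ll_t_clean, 2))
print("with outlier: Gaussian", round(ll_gauss_outlier, 2), " t", round(ll_t_outlier, 2))
```

With these illustrative numbers the Gaussian scores slightly higher on the clean residuals, while the single outlier lowers the Gaussian log-likelihood far more than the Student-t one. That is the sensitivity to outlying observations the abstract refers to.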
Pages: 807-814
Number of pages: 8
Related papers
50 in total
  • [41] Robust DNA microarray image analysis
    Brändle, N
    Bischof, H
    Lapp, H
    MACHINE VISION AND APPLICATIONS, 2003, 15 (01) : 11 - 28
  • [42] Robust DNA microarray image analysis
    Norbert Brändle
    Horst Bischof
    Hilmar Lapp
    Machine Vision and Applications, 2003, 15 : 11 - 28
  • [43] Robust gene signatures from microarray data using genetic algorithms enriched with biological pathway keywords
    Luque-Baena, R. M.
    Urda, D.
    Gonzalo Claros, M.
    Franco, L.
    Jerez, J. M.
    JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 49 : 32 - 44
  • [44] Biclustering models for structured microarray data
    Turner, HL
    Bailey, TC
    Krzanowski, WJ
    Hemingway, CA
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2005, 2 (04) : 316 - 329
  • [45] A benchmark for statistical microarray data analysis that preserves actual biological and technical variance
    Benoît De Hertogh
    Bertrand De Meulder
    Fabrice Berger
    Michael Pierre
    Eric Bareke
    Anthoula Gaigneaux
    Eric Depiereux
    BMC Bioinformatics, 11 (1)
  • [46] GARBAN: genomic analysis and rapid biological annotation of cDNA microarray and proteomic data
    Martínez-Cruz, LA
    Rubio, A
    Martínez-Chantar, ML
    Labarga, A
    Barrio, I
    Podhorski, A
    Segura, V
    Campo, JLS
    Avila, MA
    Mato, JM
    BIOINFORMATICS, 2003, 19 (16) : 2158 - 2160
  • [47] BIOLOGICAL ANALYSIS OF MICROARRAY DATA USING ORTHOGONAL FORWARD SELECTION WITH A CLUSTERING APPROACH
    Kah, Wong Sou
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    Kasim, Shahreen
    Deris, Safaai
    Omatu, Sigeru
    Yoshioka, Michifumi
    JOURNAL OF BIOLOGICAL SYSTEMS, 2015, 23 (02) : 275 - 288
  • [48] A benchmark for statistical microarray data analysis that preserves actual biological and technical variance
    De Hertogh, Benoit
    De Meulder, Bertrand
    Berger, Fabrice
    Pierre, Michael
    Bareke, Eric
    Gaigneaux, Anthoula
    Depiereux, Eric
    BMC BIOINFORMATICS, 2010, 11
  • [49] BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis
    Durinck, S
    Moreau, Y
    Kasprzyk, A
    Davis, S
    De Moor, B
    Brazma, A
    Huber, W
    BIOINFORMATICS, 2005, 21 (16) : 3439 - 3440
  • [50] Hidden Markov models for microarray time course data in multiple biological conditions - Discussion - Rejoinder
    Yuan, Ming
    Kendziorski, Christina
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (476) : 1338 - 1340