Missing value imputation for epistatic MAPs

Cited by: 14
Authors
Ryan, Colm [1]
Greene, Derek [1]
Cagney, Gerard [2]
Cunningham, Padraig [1]
Affiliations
[1] Univ Coll Dublin, Sch Informat & Comp Sci, Dublin 2, Ireland
[2] Univ Coll Dublin, Conway Inst Biomol & Biomed Res, Dublin 2, Ireland
Source
BMC BIOINFORMATICS | 2010 / Volume 11
Funding
Science Foundation Ireland;
Keywords
QUANTITATIVE GENETIC INTERACTIONS; PROTEIN COMPLEXES; YEAST; NETWORK; GENOME; ORGANIZATION; MICROARRAYS; PATTERNS; STRATEGY; PATHWAY;
DOI
10.1186/1471-2105-11-197
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Epistatic miniarray profiling (E-MAP) is a high-throughput approach capable of quantifying aggravating or alleviating genetic interactions between gene pairs. The datasets resulting from E-MAP experiments typically take the form of a symmetric pairwise matrix of interaction scores. These datasets have a significant number of missing values (up to 35%) that can reduce the effectiveness of some data analysis techniques and prevent the use of others. An effective method for imputing interactions would therefore increase the types of possible analysis, as well as increase the potential to identify novel functional interactions between gene pairs. Several methods have been developed to handle missing values in microarray data, but it is unclear how applicable these methods are to E-MAP data because of their pairwise nature and the significantly larger number of missing values. Here we evaluate four alternative imputation strategies, three local (nearest neighbor-based) and one global (PCA-based), that have been modified to work with symmetric pairwise data.
Results: We identify different categories for the missing data based on their underlying cause, and show that values from the largest category can be imputed effectively. We compare local and global imputation approaches across a variety of distinct E-MAP datasets, showing that both are competitive and preferable to filling in with zeros. In addition, we show that these methods are effective in an E-MAP from a different species, suggesting that pairwise imputation techniques will be increasingly useful as analogous epistasis mapping techniques are developed in different species. We show that strongly alleviating interactions are significantly more difficult to predict than strongly aggravating interactions. Finally, we show that imputed interactions, generated using nearest neighbor methods, are enriched for annotations in the same manner as measured interactions. Therefore, our method potentially expands the number of mapped epistatic interactions. In addition, we make implementations of our algorithms available for use by other researchers.
Conclusions: We address the problem of missing value imputation for E-MAPs, and suggest the use of symmetric nearest neighbor-based approaches, as they offer consistently accurate imputations across multiple datasets in a tractable manner.
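The abstract recommends symmetric nearest neighbor-based imputation but does not spell out the procedure. The following is a minimal illustrative sketch of one way such a scheme could work, not the authors' published implementation. It assumes missing scores are stored as None, estimates each missing pair (i, j) once from the neighbors of gene i and once from the neighbors of gene j, and averages the two estimates so the matrix stays symmetric. The function names, the inverse-distance weighting, the distance measure and the zero fallback are all assumptions made for illustration.

```python
import math


def profile_distance(a, b):
    """Root-mean-square difference over positions observed in both profiles."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no overlap: the profiles cannot be compared
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_estimate(matrix, row, col, k):
    """Estimate matrix[row][col] from the k rows most similar to `row`.

    Only rows that have an observed score in column `col` can contribute.
    Returns None when no usable neighbor exists.
    """
    candidates = []
    for other, profile in enumerate(matrix):
        if other in (row, col) or profile[col] is None:
            continue
        d = profile_distance(matrix[row], profile)
        if math.isfinite(d):
            candidates.append((d, profile[col]))
    if not candidates:
        return None
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    # Inverse-distance weights (an assumption); the small constant avoids
    # division by zero for identical profiles.
    weights = [1.0 / (d + 1e-6) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


def impute_symmetric(matrix, k=3):
    """Fill missing off-diagonal scores, keeping the matrix symmetric."""
    n = len(matrix)
    result = [list(r) for r in matrix]
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] is not None:
                continue
            # One estimate from each gene's point of view, then average.
            estimates = [
                e
                for e in (knn_estimate(matrix, i, j, k),
                          knn_estimate(matrix, j, i, k))
                if e is not None
            ]
            # Fall back to zero filling when no neighbor is available.
            value = sum(estimates) / len(estimates) if estimates else 0.0
            result[i][j] = result[j][i] = value
    return result


# Toy 4-gene score matrix (made-up numbers); the (0, 3) pair is missing.
scores = [
    [None, -2.0, 1.0, None],
    [-2.0, None, 0.5, -1.5],
    [1.0, 0.5, None, 0.8],
    [None, -1.5, 0.8, None],
]
filled = impute_symmetric(scores, k=2)
```

Estimates are always computed from the original matrix rather than from previously imputed values, so the result does not depend on the order in which missing pairs are visited. The diagonal is left untouched because a gene's interaction with itself is not measured.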
Pages: 14