Model, properties and imputation method of missing SNP genotype data utilizing mutual information

Cited by: 3
Authors
Wang, Ying [1]
Wan, Weiming [1]
Wang, Rui-Sheng [2]
Feng, Enmin [3]
Affiliations
[1] Dalian Jiaotong Univ, Sch Sci, Dalian 116028, Peoples R China
[2] Renmin Univ China, Sch Informat, Beijing 100872, Peoples R China
[3] Dalian Univ Technol, Dept Appl Math, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Mutual information; Imputation method; Missing genotype data; Missing SNP site; Extension method;
DOI
10.1016/j.cam.2008.10.020
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
Mutual information can be used as a measure for the association of a genetic marker or a combination of markers with the phenotype. In this paper, we study the imputation of missing genotype data. We first utilize joint mutual information to compute the dependence between SNP sites, then construct a mathematical model in order to find the two SNP sites having maximal dependence with missing SNP sites, and further study the properties of this model. Finally, an extension method to haplotype-based imputation is proposed to impute the missing values in genotype data. To verify our method, extensive experiments have been performed, and numerical results show that our method is superior to haplotype-based imputation methods. At the same time, numerical results also prove joint mutual information can better measure the dependence between SNP sites. According to experimental results, we also conclude that the dependence between the adjacent SNP sites is not necessarily strongest. (C) 2008 Elsevier B.V. All rights reserved.
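The abstract describes choosing, for a SNP site with missing values, the two other sites on which it depends most strongly, with dependence measured by joint mutual information. The following is a minimal sketch of that selection step only, assuming genotypes coded as 0/1/2, a hypothetical missing-value code of -1, and plain empirical frequencies as probability estimates. The function names are illustrative, and the paper's exact model and its extension of haplotype-based imputation are not reproduced here.

```python
from collections import Counter
from itertools import combinations
from math import log2

MISSING = -1  # hypothetical code for a missing genotype (assumption)


def joint_mutual_information(pair_cols, target_col):
    """Empirical I((A, B); T) in bits, using only fully observed rows."""
    rows = [((a, b), t)
            for a, b, t in zip(*pair_cols, target_col)
            if MISSING not in (a, b, t)]
    n = len(rows)
    if n == 0:
        return 0.0
    joint = Counter(rows)                 # counts of ((a, b), t)
    left = Counter(x for x, _ in rows)    # counts of (a, b)
    right = Counter(t for _, t in rows)   # counts of t
    # sum over observed cells of p(x, t) * log2(p(x, t) / (p(x) * p(t)))
    return sum(c / n * log2(c * n / (left[x] * right[t]))
               for (x, t), c in joint.items())


def best_pair(genotypes, target):
    """Return the two site indices (excluding target) with maximal joint MI."""
    n_sites = len(genotypes[0])
    cols = [[row[j] for row in genotypes] for j in range(n_sites)]
    candidates = [j for j in range(n_sites) if j != target]
    return max(
        combinations(candidates, 2),
        key=lambda p: joint_mutual_information((cols[p[0]], cols[p[1]]),
                                               cols[target]),
    )


# Toy data: site 2 is fully determined by sites 0 and 3 taken together,
# although neither of them is informative about it alone.
toy = [[a, 0, (a + b) % 3, b] for a in range(3) for b in range(3)]
print(best_pair(toy, target=2))
```

In the toy data the selected pair is not made up of the target's immediate neighbours, which mirrors the abstract's remark that adjacent SNP sites do not necessarily have the strongest dependence. The exhaustive search over pairs is quadratic in the number of sites, so it is only meant for small illustrations.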
Pages: 168 - 174
Number of pages: 7