Handling missing values in kernel methods with application to microbiology data

Cited by: 14
Authors
Belanche, Lluis A. [1]
Kobayashi, Vladimer [2]
Aluja, Tomas [3]
Affiliations
[1] Tech Univ Catalonia, Dept Software, Sch Comp Sci, Barcelona 08034, Spain
[2] CNRS, UMR 5516, Lab Hubert Curien, F-42000 St Etienne, France
[3] Tech Univ Catalonia, Dept Stat & Operat Res, Sch Comp Sci, Barcelona 08034, Spain
Keywords
Missing values; Support vector machines; Binary variables; Fully conditional specification; Multiple imputation
DOI
10.1016/j.neucom.2014.01.047
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We discuss several approaches that make it possible for kernel methods to deal with missing values in binary variables. The first two are extended kernels that handle missing values without any data preprocessing. Another two methods are derived from a sophisticated multiple imputation technique that uses logistic regression as the local model learner. The performance of these approaches is compared using a binary data set of a kind that typically arises in microbiology (the microbial source tracking problem). We also address approaches to the largely neglected problem of prediction with missing values. Our results show that the kernel extensions achieve predictive accuracy competitive with multiple imputation. Moreover, these results are achieved with a simpler, deterministic methodology and entail a much lower computational effort. (C) 2014 Elsevier B.V. All rights reserved.
Pages: 110-116
Number of pages: 7