MIDA: a Web Tool for MIssing DAta Imputation based on a Boosted and Incremental Learning Algorithm

Authors
Acampora, Giovanni [1]
Vitiello, Autilia [1]
Siciliano, Roberta [2]
Affiliations
[1] Univ Naples Federico II, Dept Phys Ettore Pancini, Naples, Italy
[2] Univ Naples Federico II, Dept Ind Engn, Naples, Italy
Funding
EU Horizon 2020
Keywords
SOFTWARE TOOL; KEEL
DOI
10.1109/fuzz48607.2020.9177644
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A central issue in machine learning is the quality of the data used to train statistical models for classification and regression tasks. In particular, missing values in data sets are especially likely to degrade the accuracy of learning methods. As a consequence, there is a strong need for software tools that help machine learning users fill in their data sets before passing them to training algorithms. This paper addresses this need by introducing a web-based tool for MIssing DAta imputation (MIDA) based on a novel supervised learning method, the Generalized Boosted Incremental Non Parametric Imputation algorithm (G-BINPI), which handles missing values in scenarios where a "missing at random" assumption holds. The proposed tool enables machine learning users to impute their data sets remotely through an intuitive graphical user interface. As highlighted in the experimental section, the proposed approach yields better performance than conventional missing data imputation approaches on different benchmark data sets.
Pages: 6