A Hybrid Model Focusing on Data Pre-Processing in Diabetes Diagnosis

Cited by: 0
Authors
Zeidi, Farnaz [1]
Azar, Lalah [1]
Arslan, Vasfiye [1]
Erol, Cigdem [2, 3]
Affiliations
[1] Istanbul Univ, Inst Sci, Div Informat, Istanbul, Turkey
[2] Istanbul Univ, Informat Dept, Istanbul, Turkey
[3] Istanbul Univ, Fac Sci, Dept Biol, Div Bot, Istanbul, Turkey
Keywords
Classification algorithms; diabetes diagnosis; hybrid model; K-means algorithm; normalization; outliers detection
DOI
10.1080/01969722.2022.2080338
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Diabetes mellitus is a common and serious disease that has been studied by many researchers, and the Pima Indians Diabetes Dataset is one of the best-known datasets in this field. This study aims to increase the accuracy of machine learning algorithms in diagnosing the disease, and to reveal patterns that enable early diagnosis, by focusing on the pre-processing stages. In the pre-processing stage, the proposed hybrid model includes "filling in missing values with KNN", "examining six different normalization methods" and "removing outliers with K-means". In the data classification stage, four algorithms (C4.5, SVM, Naive Bayes and KNN) were examined and the best hybrid model was identified. The performance evaluation of these models is based on accuracy. Compared with previous studies, the results showed higher accuracy: 98.3% for (KNN + n5 + K-means + SVM) and 99.1% for (KNN + n4/n3 + K-means + KNN). Finally, we offer concluding notes and some suggestions for further study.
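The pre-processing chain named in the abstract (KNN imputation, normalization, K-means outlier removal, then a classifier) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The function names, the toy two-feature data, and all parameter values are hypothetical. The abstract does not define the normalization methods n1 to n6, so min-max scaling is assumed here. It also does not say how K-means flags outliers, so the sketch assumes one common rule: drop samples whose label disagrees with the majority label of their cluster. Only the KNN classifier is shown; C4.5, SVM and Naive Bayes are omitted.

```python
import math

def dist(a, b, cols=None):
    """Euclidean distance over the given column indices (all columns by default)."""
    cols = range(len(a)) if cols is None else cols
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in cols))

def knn_impute(rows, k=3):
    """Replace None with the mean of that feature over the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    filled = []
    for r in rows:
        known = [i for i, v in enumerate(r) if v is not None]
        nearest = sorted(complete, key=lambda c: dist(r, c, known))[:k]
        filled.append([v if v is not None else sum(n[i] for n in nearest) / len(nearest)
                       for i, v in enumerate(r)])
    return filled

def fit_min_max(rows):
    """Per-feature minimum and maximum (assumed normalization: min-max)."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def scale(row, lo, hi):
    """Map each feature to [0, 1] using the fitted bounds."""
    return [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]

def kmeans_labels(X, k=2, iters=10):
    """Plain K-means with a deterministic farthest-point initialization."""
    centers = [X[0]]
    while len(centers) < k:
        centers.append(max(X, key=lambda x: min(dist(x, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(x, centers[j])) for x in X]
        for j in range(k):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def kmeans_filter(X, y, k=2):
    """Assumed outlier rule: drop samples whose class differs from their cluster's majority class."""
    clusters = kmeans_labels(X, k)
    majority = {}
    for j in set(clusters):
        votes = [lab for lab, c in zip(y, clusters) if c == j]
        majority[j] = max(set(votes), key=votes.count)
    keep = [i for i, c in enumerate(clusters) if y[i] == majority[c]]
    return [X[i] for i in keep], [y[i] for i in keep]

def knn_predict(X, y, query, k=3):
    """Majority vote among the k nearest training samples."""
    nearest = sorted(range(len(X)), key=lambda i: dist(X[i], query))[:k]
    votes = [y[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy data (made up for illustration): two features, None marks a missing value.
X_raw = [[85.0, 22.0], [90.0, 24.0], [88.0, None], [92.0, 23.0],
         [160.0, 35.0], [170.0, 38.0], [165.0, 36.0], [None, 37.0],
         [168.0, 34.0]]           # last row is labelled 0 but sits in the class-1 cluster
y = [0, 0, 0, 0, 1, 1, 1, 1, 0]

X_filled = knn_impute(X_raw)                       # step 1: fill missing values
lo, hi = fit_min_max(X_filled)                     # step 2: normalize
X_norm = [scale(r, lo, hi) for r in X_filled]
X_clean, y_clean = kmeans_filter(X_norm, y)        # step 3: remove outliers
prediction = knn_predict(X_clean, y_clean, scale([95.0, 25.0], lo, hi))  # step 4: classify
print(len(X_clean), prediction)
```

New samples must be scaled with the bounds fitted on the training data, which is why `fit_min_max` and `scale` are kept separate. In practice a library such as scikit-learn would replace the hand-written helpers; the plain-Python version is used here only to keep each step visible.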
Pages: 1199-1211
Page count: 13