Dynamic selection of normalization techniques using data complexity measures

Cited by: 134
Authors
Jain, Sukirty [1]
Shukla, Sanyam [1]
Wadhvani, Rajesh [1]
Affiliations
[1] Maulana Azad Natl Inst Technol, Bhopal 462007, Madhya Pradesh, India
Keywords
Data complexity; Data preprocessing; Min-max normalization; z-score normalization; Gaussian Kernel ELM; EXTREME LEARNING-MACHINE; CLASSIFIERS; SET
DOI
10.1016/j.eswa.2018.04.008
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data preprocessing is an important step in designing a classification model. Normalization is one of the preprocessing techniques used to handle out-of-bounds attributes. This work develops 14 classification models, using different learning algorithms, for dynamic selection of the normalization technique. It extracts 12 data complexity measures for 48 datasets drawn from the KEEL dataset repository. Each of these datasets is normalized using the min-max and z-score normalization techniques. The G-mean index is estimated for these normalized datasets using the Gaussian Kernel Extreme Learning Machine (KELM) in order to determine the best-suited normalization technique. The data complexity measures, along with the best-suited normalization technique, are used as input for developing the aforementioned dynamic models. These models predict the most suitable normalization technique based on the estimated data complexity measures of the dataset. The results show that the models developed using Gaussian Kernel ELM (KELM) and Support Vector Machine (SVM) give promising results for most of the evaluated classification problems. (C) 2018 Elsevier Ltd. All rights reserved.
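The labelling step described in the abstract (normalize each dataset both ways, score each version with G-mean, keep the better technique as the target label) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the handling of constant columns, the use of the population standard deviation, the tie-breaking rule, and the definition of G-mean as the geometric mean of per-class recalls are all assumptions made for illustration. The KELM classifier and the 12 data complexity measures are not reproduced here.

```python
from statistics import mean, pstdev


def min_max_normalize(column):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0 (assumption)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def z_score_normalize(column):
    """Center on the mean and divide by the population standard deviation (assumption)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition of the G-mean index)."""
    classes = sorted(set(y_true))
    product = 1.0
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        product *= hits / total
    return product ** (1.0 / len(classes))


def best_normalization(g_mean_min_max, g_mean_z_score):
    """Pick the label for a dataset; ties go to min-max (hypothetical rule)."""
    return "min-max" if g_mean_min_max >= g_mean_z_score else "z-score"


if __name__ == "__main__":
    feature = [2.0, 4.0, 6.0]
    print(min_max_normalize(feature))
    print(z_score_normalize(feature))
    # Scores below are made-up placeholders, not results from the paper.
    print(best_normalization(0.81, 0.77))
```

In such a setup, the label returned by `best_normalization` would be paired with the dataset's complexity measures to form one training example for the selection model.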
Pages: 252 - 262
Number of pages: 11
Related papers
50 in total
  • [41] Using classification techniques to improve replica selection in data grid
    Jin, Hai
    Huang, Jin
    Xie, Xia
    Zhang, Qin
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2006: COOPIS, DOA, GADA, AND ODBASE PT 2, PROCEEDINGS, 2006, 4276 : 1376 - 1387
  • [42] Revisiting Feature Selection with Data Complexity
    Ngan Thi Dong
    Khosla, Megha
    2020 IEEE 20TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2020), 2020, : 211 - 216
  • [43] Normalization techniques for PARAFAC modeling of urine metabolomic data
    Gardlo, Alzbeta
    Smilde, Age K.
    Hron, Karel
    Hrda, Marcela
    Karlikova, Radana
    Friedecky, David
    Adam, Tomas
    METABOLOMICS, 2016, 12 (07)
  • [44] Normalization techniques for PARAFAC modeling of urine metabolomic data
    Alžběta Gardlo
    Age K. Smilde
    Karel Hron
    Marcela Hrdá
    Radana Karlíková
    David Friedecký
    Tomáš Adam
    Metabolomics, 2016, 12
  • [45] ASSESSMENT OF NORMALIZATION TECHNIQUES ON THE ACCURACY OF HYPERSPECTRAL DATA CLUSTERING
    Naeini, A. Alizadeh
    Babadi, M.
    Homayouni, S.
    ISPRS INTERNATIONAL JOINT CONFERENCES OF THE 2ND GEOSPATIAL INFORMATION RESEARCH (GI RESEARCH 2017); THE 4TH SENSORS AND MODELS IN PHOTOGRAMMETRY AND REMOTE SENSING (SMPR 2017); THE 6TH EARTH OBSERVATION OF ENVIRONMENTAL CHANGES (EOEC 2017), 2017, 42-4 (W4) : 27 - 30
  • [46] System and Architecture Evaluation Framework Using Cross-domain Dynamic Complexity Measures
    Fischi, Jonathan
    Nilchiani, Roshanak
    Wade, Jon
    2016 ANNUAL IEEE SYSTEMS CONFERENCE (SYSCON), 2016, : 42 - 48
  • [47] POLARIMETRIC SAR DATA FEATURE SELECTION USING MEASURES OF MUTUAL INFORMATION
    Tanase, R.
    Radoi, A.
    Datcu, M.
    Raducanu, D.
    2015 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2015, : 1140 - 1143
  • [48] Fuzzy Information Measures Feature Selection Using Descriptive Statistics Data
    Salem, Omar A. M.
    Liu, Haowen
    Liu, Feng
    Chen, Yi-Ping Phoebe
    Chen, Xi
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2022, PT III, 2022, 13370 : 77 - 90
  • [49] Evaluation of Code and Data Spatial Complexity Measures
    Chhabra, Jitender Kumar
    Gupta, Varun
    CONTEMPORARY COMPUTING, PROCEEDINGS, 2009, 40 : 604 - 614
  • [50] Data Complexity Measures for Imbalanced Classification Tasks
    Barella, Victor H.
    Garcia, Luis P. F.
    de Souto, Marcilio P.
    Lorena, Ana C.
    de Carvalho, Andre
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,