Instance reduction for supervised learning using input-output clustering method

Cited by: 0
Authors
Anusorn Yodjaiphet
Nipon Theera-Umpon
Sansanee Auephanwiriyakul
Affiliations
[1] Chiang Mai University, Electrical Engineering Department, Faculty of Engineering
[2] Chiang Mai University, Biomedical Engineering Center
[3] Chiang Mai University, Computer Engineering Department, Faculty of Engineering
Keywords
instance reduction; input-output clustering; fuzzy c-means clustering; support vector regression; supervised learning
DOI
Not available
Abstract
A method is proposed that uses input-output clustering to reduce the number of samples in large data sets. The method first clusters the output data into groups and then clusters the input data within each output group. A set of prototypes is selected from the clustered input data, and the remaining inessential data can be discarded from the data set. Because only the prototypes are used, the method can also reduce the effect of outliers. The method is applied to data reduction in regression problems. Two standard synthetic data sets and three standard real-world data sets are used for evaluation. The root-mean-square errors of support vector regression models trained on the original data sets are compared with those of models trained on the corresponding instance-reduced data sets. In the experiments, the numbers of instances in the synthetic data sets are decreased by 25%-69%. The reduction rates for the real-world automobile miles-per-gallon and 1990 California census data sets are 46% and 57%, respectively. The reduction rate for the electrocardiogram (ECG) data set reaches 96%, owing to the redundant and periodic nature of ECG signals. For all of the data sets, the regression results obtained from the reduced data are similar to those from the corresponding original data sets. The proposed method therefore maintains regression performance while requiring only a fraction of the data for training.
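The pipeline described in the abstract (cluster the outputs, cluster the inputs within each output group, keep prototypes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain k-means is swapped in for the fuzzy c-means clustering named in the keywords, a prototype is assumed to be the real instance nearest to each input-cluster centre, and the names `kmeans`, `reduce_instances`, `n_output_groups`, and `n_prototypes_per_group` are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on a list of equal-length tuples; returns (centroids, labels).

    Assumption: stands in for fuzzy c-means, which the record names in its keywords.
    """
    rng = random.Random(seed)
    k = min(k, len(points))
    centroids = rng.sample(points, k)

    def assign(cs):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        return [min(range(k), key=lambda j: math.dist(p, cs[j])) for p in points]

    for _ in range(iters):
        labels = assign(centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Update step: component-wise mean of the cluster members.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the previous centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(centroids)


def reduce_instances(X, y, n_output_groups=3, n_prototypes_per_group=5, seed=0):
    """Return the sorted indices of the instances kept as prototypes."""
    # Step 1: cluster the (scalar) outputs into groups.
    _, y_labels = kmeans([(v,) for v in y], n_output_groups, seed=seed)
    kept = set()
    for group in set(y_labels):
        idx = [i for i, lab in enumerate(y_labels) if lab == group]
        # Step 2: cluster the inputs that belong to this output group.
        centroids, _ = kmeans([X[i] for i in idx], n_prototypes_per_group, seed=seed)
        # Step 3 (assumed rule): keep the real instance nearest to each centre.
        for c in centroids:
            kept.add(min(idx, key=lambda i: math.dist(X[i], c)))
    return sorted(kept)


# Toy regression data: y = sin(x) sampled at 200 random points.
data_rng = random.Random(1)
X = [(data_rng.uniform(-3.0, 3.0),) for _ in range(200)]
y = [math.sin(x[0]) for x in X]

kept = reduce_instances(X, y)
X_small = [X[i] for i in kept]
y_small = [y[i] for i in kept]
print(len(X), "->", len(X_small), "instances")
```

In an evaluation like the one the abstract describes, a regression model such as support vector regression would then be trained once on `(X, y)` and once on `(X_small, y_small)`, and the two root-mean-square errors compared. The group and prototype counts here are arbitrary illustrative values, not settings taken from the paper.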
Pages: 4740-4748
Number of pages: 8