Synthetic Data Generator for Classification Rules Learning

Authors
Liu, Runzong [1]
Fang, Bin [1]
Tang, Yuan Yan [2]
Chan, Patrick P. K. [3]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
[2] Univ Macau, Fac Sci & Technol, Macau, Peoples R China
[3] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou, Guangdong, Peoples R China
Keywords
Synthetic data; Automatic decision support; Data mining; Decision tree;
DOI
10.1109/CCBD.2016.78
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A standard data set is useful for empirically evaluating classification rule learning algorithms. However, there is still no standard data set general enough to cover various situations. Real-world data sets are limited to specific applications, and their numbers of attributes, rules, and samples are fixed. A data generator is proposed here to produce synthetic data sets that can be as large as the experiments demand. The numbers of attributes, rules, and samples in the synthetic data sets can be easily changed to meet the demands of evaluating different learning algorithms. In the generator, related attributes are created first. Rules are then created based on the attributes, and samples are produced following the rules. Three decision tree algorithms are evaluated using synthetic data sets produced by the proposed data generator.
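The three-stage order described in the abstract (attributes, then rules, then samples) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of categorical attributes, the conjunctive rule form, and all parameters are assumptions made only for illustration.

```python
import random


def make_attributes(n_attributes, n_values):
    # Assumption: every attribute is categorical with the same number of values.
    return {f"a{i}": list(range(n_values)) for i in range(n_attributes)}


def make_rules(attributes, n_rules, n_conditions, n_classes, rng):
    # Assumption: a rule is a conjunction "attribute == value" over a few
    # attributes, paired with a class label.
    names = sorted(attributes)
    rules = []
    for _ in range(n_rules):
        chosen = rng.sample(names, n_conditions)
        conditions = {name: rng.choice(attributes[name]) for name in chosen}
        rules.append((conditions, rng.randrange(n_classes)))
    return rules


def make_samples(attributes, rules, n_samples, rng):
    # Each sample is drawn at random, then forced to satisfy one chosen rule
    # and labeled with that rule's class.
    samples = []
    for _ in range(n_samples):
        conditions, label = rng.choice(rules)
        row = {name: rng.choice(values) for name, values in attributes.items()}
        row.update(conditions)
        samples.append((row, label))
    return samples


def generate(n_attributes=5, n_values=3, n_rules=4, n_conditions=2,
             n_classes=2, n_samples=100, seed=0):
    # The sizes of all three stages are plain parameters, so the data set
    # can be scaled to whatever an experiment needs.
    rng = random.Random(seed)
    attributes = make_attributes(n_attributes, n_values)
    rules = make_rules(attributes, n_rules, n_conditions, n_classes, rng)
    samples = make_samples(attributes, rules, n_samples, rng)
    return attributes, rules, samples
```

Calling `generate(n_samples=1000)` would return the attribute definitions, the hidden rules, and labeled samples that a learner could be scored against. This sketch does not resolve conflicts between rules, so a sample may also match a second rule with a different label.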
Pages: 357-361
Number of pages: 5