A hybrid evolutionary approach to construct optimal decision trees with large data sets

Cited by: 0
Authors
Patil, D. V. [1]
Bichkar, R. S. [2]
Affiliations
[1] SGGS Inst Engn & Tech, Nanded, MS, India
[2] SGGS Inst Engn & Tech, Dept Elect & Telecommun Engn, Nanded, MS, India
Keywords
large data sets; decision tree; genetic algorithm; genetically evolved decision tree; training set size; classification accuracy; comprehensibility
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Data mining environments produce large volumes of data. The knowledge this data contains can be used to improve an organization's decision-making process. When a large amount of available data is used for decision tree construction, the resulting trees are large and incomprehensible to human experts. Learning on such high-volume data also becomes very slow, because it has to be done serially over the large data sets. Our ultimate goal is to build smaller but equally accurate trees from randomly sampled data. We experimented with techniques based on incremental random sampling combined with genetic algorithms, which use global search to evolve decision trees and obtain a compact representation of a large data set. Experiments on several data sets showed that the proposed random sampling procedures, combined with genetic algorithms for building decision trees, give relatively smaller trees than other methods while remaining equally accurate. The method combines optimization with comprehensibility and scalability. We also explored how the technique can avoid problems such as slow execution and overloading of memory and processor when working with very large databases.
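The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea it names: a genetic algorithm evolves small decision trees while the training sample grows by random increments. Everything here is an assumption made for illustration and is not the authors' procedure: the synthetic data, the tuple-based tree encoding, the fitness function with its size penalty, the graft-based crossover and mutation, and all parameter values (`pop_size`, `generations`, `start`, `step`, `rounds`).

```python
import random

# Illustrative toy only: data, tree encoding, and GA settings are all assumptions.

def make_data(n, rng):
    """Synthetic records: two features in [0, 1), label 1 when x0 + x1 > 1."""
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, int(x[0] + x[1] > 1.0)))
    return rows

def random_tree(depth, rng):
    """A tree is a leaf label (int) or a tuple (feature, threshold, left, right)."""
    if depth <= 0 or rng.random() < 0.3:
        return rng.randint(0, 1)
    return (rng.randint(0, 1), rng.random(),
            random_tree(depth - 1, rng), random_tree(depth - 1, rng))

def predict(tree, x):
    while not isinstance(tree, int):
        feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree

def size(tree):
    return 1 if isinstance(tree, int) else 1 + size(tree[2]) + size(tree[3])

def accuracy(tree, rows):
    return sum(predict(tree, x) == y for x, y in rows) / len(rows)

def fitness(tree, rows, size_penalty=0.002):
    """Reward accuracy and penalize node count so that compact trees are preferred."""
    return accuracy(tree, rows) - size_penalty * size(tree)

def random_subtree(tree, rng):
    while not isinstance(tree, int) and rng.random() < 0.5:
        tree = tree[rng.choice((2, 3))]
    return tree

def graft(tree, donor, rng):
    """Return `tree` with one randomly chosen subtree replaced by `donor`."""
    if isinstance(tree, int) or rng.random() < 0.3:
        return donor
    feat, thr, left, right = tree
    if rng.random() < 0.5:
        return (feat, thr, graft(left, donor, rng), right)
    return (feat, thr, left, graft(right, donor, rng))

def evolve(data, rng, pop_size=40, generations=15, start=50, step=50, rounds=4):
    shuffled = data[:]
    rng.shuffle(shuffled)
    pop = [random_tree(3, rng) for _ in range(pop_size)]
    for n in range(start, start + step * rounds, step):
        sample = shuffled[:n]  # incremental random sample: grows each round
        for _ in range(generations):
            pop.sort(key=lambda t: fitness(t, sample), reverse=True)
            elite = pop[: pop_size // 4]  # elitism: best quarter survives
            children = []
            while len(elite) + len(children) < pop_size:
                if rng.random() < 0.3:  # mutation: graft a fresh random subtree
                    child = graft(rng.choice(elite), random_tree(2, rng), rng)
                else:  # crossover: graft a subtree taken from a second parent
                    donor = random_subtree(rng.choice(elite), rng)
                    child = graft(rng.choice(elite), donor, rng)
                children.append(child)
            pop = elite + children
    return max(pop, key=lambda t: fitness(t, sample))

rng = random.Random(0)
data = make_data(1000, rng)
best = evolve(data, rng)
print("nodes:", size(best), "accuracy on all data:", round(accuracy(best, data), 3))
```

In this sketch the trees are only ever scored on the current sample (at most 200 of the 1000 records), which is the point of sampling: the full data set is touched once, for the final report. The size penalty in `fitness` is one simple way to trade accuracy against comprehensibility; other weightings or a multi-objective scheme would be equally plausible.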
Pages: 603 / +
Page count: 2