A Parallel Algorithm to Induce Decision Trees for Large Datasets

Cited by: 0
Authors
Franco-Arcega, A. [1 ]
Suarez-Cansino, J. [1 ]
Flores-Flores, L. G. [1 ]
Affiliations
[1] Autonomous Univ State Hidalgo, Basic Sci & Engn Inst, Informat & Syst Technol Res Ctr, Mineral De La Reforma 42184, Hidalgo, Mexico
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
This paper introduces a new parallel algorithm called ParDTLT and discusses some of its advantages over a set of well-known sequential and parallel algorithms. Parallel processing takes place at every node of the decision tree, which is built during the supervised training phase. Parallel tasks are distributed according to the attributes of the training objects, and the growth of the tree is governed by two criteria: the maximum number of training objects that each node can hold, and an entropy-based gain ratio criterion. Experiments compare the behavior of ParDTLT with that of the sequential algorithms C4.5, VFDT, YaDT and DTLT, and with the parallel algorithm called Synchronous. The experimental results show that ParDTLT maintains classification quality while reducing execution time.
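The two ideas named in the abstract, distributing work by attribute and choosing a split by gain ratio, can be illustrated with a minimal Python sketch. This is not the ParDTLT implementation, whose details are not given in this record. It assumes categorical attributes, the C4.5-style definition of gain ratio, and a thread pool as the parallel mechanism. The function names (`entropy`, `gain_ratio`, `best_attribute`, `ready_to_expand`) and the reading of the node capacity as a simple expansion threshold are hypothetical.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """C4.5-style gain ratio of the categorical attribute at column index attr."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    total = len(labels)
    # Weighted entropy of the class labels after splitting on attr.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Entropy of the partition itself (split information).
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the node
    return (entropy(labels) - remainder) / split_info


def best_attribute(rows, labels, attrs, workers=4):
    """Score each attribute as a separate task and return (attribute, score).

    Assumption: one task per attribute, run on a thread pool. The paper's
    actual distribution scheme is not described in this record.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda a: gain_ratio(rows, labels, a), attrs))
    best = max(range(len(attrs)), key=scores.__getitem__)
    return attrs[best], scores[best]


def ready_to_expand(n_objects, max_objects):
    """Hypothetical reading of the capacity criterion: expand a full node."""
    return n_objects >= max_objects
```

In this sketch a node would call `best_attribute` only once `ready_to_expand` returns true. In CPython a thread pool gives little speed-up for pure-Python arithmetic, so a process pool or a distributed framework would be the more realistic choice for large datasets.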
Pages: 6
Related papers
50 in total
  • [21] Coevolutive clustering algorithm for large datasets
    Fabris, Fabio
    Luchi, Diego
    Varejao, Flavio Miguel
    [J]. 2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2020,
  • [22] A new clustering algorithm for large datasets
    Qing-feng Li
    Wen-feng Peng
    [J]. Journal of Central South University, 2011, 18 : 823 - 829
  • [23] A parallel approximate SS-ELM algorithm based on MapReduce for large-scale datasets
    Chen, Cen
    Li, Kenli
    Ouyang, Aijia
    Li, Keqin
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2017, 108 : 85 - 94
  • [24] A new clustering algorithm for large datasets
    李清峰
    彭文峰
    [J]. Journal of Central South University, 2011, 18 (03) : 823 - 829
  • [25] A PARALLEL ALGORITHM FOR BISECTION WIDTH IN TREES
    GOLDBERG, M
    MILLER, Z
    [J]. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 1988, 15 (04) : 259 - 266
  • [26] Disentangling error structures of precipitation datasets using decision trees
    Sui, Xinxin
    Li, Zhi
    Tang, Guoqiang
    Yang, Zong-Liang
    Niyogi, Dev
    [J]. REMOTE SENSING OF ENVIRONMENT, 2022, 280
  • [27] An efficient decision tree construction for large datasets
    Van, Uyen Nguyen Thi
    Chung, Tae Choong
    [J]. 2007 INNOVATIONS IN INFORMATION TECHNOLOGIES, VOLS 1 AND 2, 2007, : 502 - 506
  • [28] Decision Tree based Classifiers for Large Datasets
    Franco-Arcega, Anilu
    Ariel Carrasco-Ochoa, Jesus
    Sanchez-Diaz, Guillermo
    Francisco Martinez-Trinidad, Jose
    [J]. COMPUTACION Y SISTEMAS, 2013, 17 (01) : 95 - 102
  • [29] On Algorithm for Building of Optimal α-Decision Trees
    Alkhalid, Abdulaziz
    Chikalov, Igor
    Moshkov, Mikhail
    [J]. ROUGH SETS AND CURRENT TRENDS IN COMPUTING, PROCEEDINGS, 2010, 6086 : 438 - 445
  • [30] An Improved Error-Based Pruning Algorithm of Decision Trees on Large Data Sets
    Peng, Yi
    Lu, Yu-Tong
    Chen, Zhi-Guang
    [J]. 2021 IEEE 6TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (ICBDA 2021), 2021, : 33 - 37