Parallel formulations of decision-tree classification algorithms

Cited by: 70
Authors
Srivastava, A
Han, EH
Kumar, V
Singh, V
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Army HPC Res Ctr, Minneapolis, MN 55455 USA
[2] Hitachi Amer Inc, Informat Technol Lab, Tarrytown, NY 10591 USA
Funding
National Science Foundation (USA);
Keywords
data mining; parallel processing; classification; scalability; decision trees;
DOI
10.1023/A:1009832825273
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classification decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing and fraud detection. Highly parallel algorithms for constructing classification decision trees are desirable for dealing with large data sets in a reasonable amount of time. Algorithms for building classification decision trees have a natural concurrency but are difficult to parallelize because of the inherently dynamic nature of the computation. In this paper, we present parallel formulations of a classification decision tree learning algorithm based on induction. We describe two basic parallel formulations: one based on the Synchronous Tree Construction Approach and the other on the Partitioned Tree Construction Approach. We discuss the advantages and disadvantages of these methods and propose a hybrid method that combines their good features. We also analyze the computation and communication costs of the proposed hybrid method. Moreover, experimental results on an IBM SP-2 demonstrate excellent speedups and scalability.
Pages: 237 - 261
Number of pages: 25
Related papers
50 in total
  • [31] Evolving Decision-Tree Induction Algorithms with a Multi-Objective Hyper-Heuristic
    Basgalupp, Marcio P.
    Barros, Rodrigo C.
    Podgorelec, Vili
    30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 110 - 117
  • [32] Breast Cancer Classification using Decision Tree Algorithms
    Tarawneh, Omar
    Otair, Mohammed
    Husni, Moath
    Abuaddous, Hayfa Y.
    Tarawneh, Monther
    Almomani, Malek A.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (04) : 676 - 680
  • [33] An Assessment of Decision Tree based Classification and Regression Algorithms
    Pathak, Soham
    Mishra, Indivar
    Swetapadma, Aleena
    PROCEEDINGS OF THE 2018 3RD INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT 2018), 2018, : 92 - 95
  • [34] An Energy-efficient TCAM-based Packet Classification with Decision-tree Mapping
    Ruan, Zhao
    Li, Xianfeng
    Li, Wenjun
    2013 IEEE INTERNATIONAL CONFERENCE OF IEEE REGION 10 (TENCON), 2013,
  • [35] Statistical decision-tree based fault classification scheme for protection of power transmission lines
    Upendar, J.
    Gupta, C. P.
    Singh, G. K.
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2012, 36 (01) : 1 - 12
  • [36] A LiDAR-based decision-tree classification of open water surfaces in an Arctic delta
    Crasto, N.
    Hopkinson, C.
    Forbes, D. L.
    Lesack, L.
    Marsh, P.
    Spooner, I.
    van der Sanden, J. J.
    REMOTE SENSING OF ENVIRONMENT, 2015, 164 : 90 - 102
  • [37] Risk classification of medicare HMO enrollee cost levels using a decision-tree approach
    Anderson, RT
    Balkrishnan, R
    Camacho, F
    AMERICAN JOURNAL OF MANAGED CARE, 2004, 10 (02) : 89 - 98
  • [38] DECISION-TREE APPROACH TO EARNINGS PER SHARE
    BIRD, FA
    JONES, PA
    ACCOUNTING REVIEW, 1970, 45 (04) : 779 - 783
  • [39] Improving Splice-Junctions Classification employing a Novel Encoding Schema and Decision-Tree
    Salekdeh, Amin Yazdani
    Wiese, Kay C.
    2011 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2011, : 1302 - 1307
  • [40] Pediatric cochlear reimplantation: Decision-tree efficacy
    Distinguin, L.
    Blanchard, M.
    Rouillon, I
    Parodi, M.
    Loundon, N.
    EUROPEAN ANNALS OF OTORHINOLARYNGOLOGY-HEAD AND NECK DISEASES, 2018, 135 (04) : 243 - 247