An approach of improving decision tree classifier using condensed informative data

Cited by: 1
Authors
Panhalkar, Archana R. [1]
Doye, Dharmpal D. [1]
Affiliations
[1] Shri Guru Gobind Singhji Inst Engn & Technol, Nanded, Maharashtra, India
Keywords
Data mining; Decision tree classifier; K-means clustering; C4.5; Instance reduction
DOI
10.1007/s40622-020-00265-3
Chinese Library Classification (CLC) number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
New technologies now produce vast amounts of data. Storing, analyzing and mining knowledge from such data demands large storage and fast execution, and training classifiers on it takes more time and space. To avoid wasting either, the significant information must be mined from the large collection. The decision tree is one of the promising classifiers for mining knowledge from large data. This paper aims to reduce the data so that an efficient decision tree classifier can be constructed, and presents a method that finds informative data to improve the classifier's performance. Two clustering-based methods are proposed for dimensionality reduction and for utilizing knowledge from outliers. The condensed data are then applied to the decision tree to obtain high prediction accuracy. The first method is distinctive in that it finds representative instances from clusters, using knowledge of their neighboring data. The second method uses supervised clustering to find the number of cluster representatives for data reduction. Along with increasing the prediction accuracy of the tree, these methods decrease the size, building time and space required for decision tree classifiers. The two methods are combined into a single supervised and unsupervised Decision Tree based on Cluster Analysis Pre-processing (DTCAP), which seeks out the informative instances in small, medium and large datasets. Experiments are conducted on standard UCI datasets of different sizes. They show that the method, despite its simplicity, reduces the data by up to 50% and produces a high-quality dataset that enhances the performance of the decision tree classifier.
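The general idea of clustering-based instance reduction can be illustrated with a minimal sketch. This is not the DTCAP algorithm, whose details this record does not give. It assumes a simple scheme: run k-means separately within each class and keep only the fraction of each cluster that lies nearest its centroid. The function names `kmeans` and `condense` and the parameters `k` and `keep` are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # requires k <= len(points)

    def nearest(p):
        return min(range(k), key=lambda c: sq_dist(p, centroids[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an emptied cluster keeps its previous centroid
                n = len(members)
                centroids[c] = tuple(sum(col) / n for col in zip(*members))
    # Final assignment against the final centroids.
    return centroids, [nearest(p) for p in points]


def condense(X, y, k=2, keep=0.5, seed=0):
    """Assumed reduction rule: cluster each class separately and keep the
    `keep` fraction of every cluster that is closest to its centroid."""
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(tuple(xi))

    X_out, y_out = [], []
    for label, pts in by_class.items():
        kk = min(k, len(pts))
        centroids, assign = kmeans(pts, kk, seed=seed)
        for c in range(kk):
            members = sorted(
                (p for p, a in zip(pts, assign) if a == c),
                key=lambda p: sq_dist(p, centroids[c]),
            )
            n_keep = math.ceil(keep * len(members))
            X_out.extend(members[:n_keep])
            y_out.extend([label] * n_keep)
    return X_out, y_out


# Toy data: two Gaussian blobs, one per class (synthetic, not from the paper).
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
X += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(20)]
y = [0] * 20 + [1] * 20

X_small, y_small = condense(X, y, k=2, keep=0.5)
print(len(X), "->", len(X_small), "instances")
```

The condensed pair `X_small`, `y_small` could then be passed to any decision tree learner in place of the full training set. Whether accuracy is preserved depends on the data and on the selection rule, which here is only an assumption.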
Pages: 431-445
Number of pages: 15