An approach of improving decision tree classifier using condensed informative data

Cited by: 1
Authors
Panhalkar, Archana R. [1]
Doye, Dharmpal D. [1]
Affiliations
[1] Shri Guru Gobind Singhji Inst Engn & Technol, Nanded, Maharashtra, India
Keywords
Data mining; Decision tree classifier; K-means clustering; C4.5; Instance reduction
DOI
10.1007/s40622-020-00265-3
Chinese Library Classification number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
New technologies produce vast amounts of data. Storing, analyzing, and mining knowledge from such data requires large storage space and fast execution, and training classifiers on large datasets costs additional time and space. To avoid wasting these resources, significant information must be mined from huge collections of data. The decision tree is a promising classifier for mining knowledge from huge data. This paper aims to reduce the data used to construct an efficient decision tree classifier, and presents a method that finds informative data to improve the classifier's performance. Two clustering-based methods are proposed for dimensionality reduction and for utilizing knowledge from outliers. The condensed data are then used to train the decision tree for high prediction accuracy. The first method is unique in that it finds representative instances from clusters using knowledge of their neighboring data. The second method uses supervised clustering to determine the number of cluster representatives for data reduction. While increasing the prediction accuracy of the tree, these methods decrease the size, building time, and space required for decision tree classifiers. The two methods are combined into a single supervised and unsupervised approach, Decision Tree based on Cluster Analysis Pre-processing (DTCAP), which finds the informative instances in small, medium, and large datasets. Experiments are conducted on standard UCI datasets of different sizes. They show that the method, despite its simplicity, reduces the data by up to 50% and produces a high-quality dataset that enhances the performance of the decision tree classifier.
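The general idea of cluster-based instance reduction can be illustrated with a minimal sketch. It is not the authors' DTCAP algorithm, whose details this record does not give. It assumes a plain k-means run separately inside each class, and a hypothetical selection rule that keeps the fraction of each cluster lying closest to its centroid. The names `assign`, `kmeans`, `condense`, `k`, and `keep_fraction` are illustrative only.

```python
import math
import random


def assign(points, centroids):
    """Index of the nearest centroid (Euclidean) for every point."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign(points, centroids), centroids


def condense(X, y, k=2, keep_fraction=0.5):
    """Return sorted indices of the instances kept as cluster representatives."""
    kept = []
    for cls in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == cls]
        pts = [X[i] for i in idx]
        labels, centroids = kmeans(pts, min(k, len(pts)))
        for c, centre in enumerate(centroids):
            members = [i for i, lab in zip(idx, labels) if lab == c]
            # Assumed rule (not from the paper): keep the members nearest the centroid.
            members.sort(key=lambda i: math.dist(X[i], centre))
            kept.extend(members[:math.ceil(keep_fraction * len(members))])
    return sorted(kept)


# Toy two-class data: 10 points around (0, 0) and 10 points around (5, 5).
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(10)] + \
    [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(10)]
y = [0] * 10 + [1] * 10

kept = condense(X, y)
X_small = [X[i] for i in kept]
y_small = [y[i] for i in kept]
print(len(X), "->", len(X_small))
```

The condensed pair `X_small`, `y_small` would then be passed to a decision tree learner such as C4.5 in place of the full training set. Clustering within each class is one simple way to make the reduction label-aware; the supervised clustering described in the abstract may work differently.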
Pages: 431-445
Number of pages: 15
Related papers
50 in total
  • [1] An approach of improving decision tree classifier using condensed informative data
    Archana R. Panhalkar
    Dharmpal D. Doye
    DECISION, 2020, 47 : 431 - 445
  • [2] A New Approach of Boosting using Decision Tree Classifier for Classifying Noisy Data
    Farid, Dewan Md.
    Maruf, Golam Morshed
    Rahman, Chowdhury Mofizur
    2013 INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV), 2013,
  • [3] Improving the Performance of a Proxy Cache Using Very Fast Decision Tree Classifier
    Benadit, Julian P.
    Francis, Sagayaraj F.
    INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION AND CONVERGENCE (ICCC 2015), 2015, 48 : 304 - 312
  • [4] An Experimental Study on Decision Tree Classifier Using Discrete and Continuous Data
    Jena, Monalisa
    Dehuri, Satchidananda
    COGNITIVE INFORMATICS AND SOFT COMPUTING, 2020, 1040 : 321 - 331
  • [5] Verifying cuts as a tool for improving a classifier based on a decision tree
    Dydo, Lukasz
    Bazan, Jan C.
    Buregwa-Czuma, Sylwia
    Rzasa, Wojciech
    Skowron, Andrzej
    PROCEEDINGS OF THE 2016 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2016, 8 : 17 - 20
  • [6] Packet filtering using a decision tree classifier
    Li, CY
    Lin, W
    Yang, YT
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING, 2003, 2690 : 801 - 805
  • [7] Music rhythm tree based partitioning approach to decision tree classifier
    Guggari, Shankru
    Kadappa, Vijayakumar
    Umadevi, V
    Abraham, Ajith
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (06) : 3040 - 3054
  • [8] An Approach to Classify Eligibility Blood Donors Using Decision Tree and Naive Bayes Classifier
    Zulfikar, W. B.
    Gerhana, Y. A.
    Rahmania, A. F.
    2018 6TH INTERNATIONAL CONFERENCE ON CYBER AND IT SERVICE MANAGEMENT (CITSM), 2018, : 563 - 567
  • [9] Gas Classification Using Binary Decision Tree Classifier
    Hassan, Muhammad
    Bermak, Amine
    2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2014, : 2579 - 2582
  • [10] Decision Tree Classifier using Theme based Partitioning
    Kadappa, Vijayakumar
    Guggari, Shankru
    Negi, Atul
    2015 INTERNATIONAL CONFERENCE ON COMPUTING AND NETWORK COMMUNICATIONS (COCONET), 2015, : 540 - 546