An Improved C4.5 Algorithm in Bagging Integration Model

Cited by: 4
Authors
Song, Yu-Qing [1]
Yao, Xu [1, 2]
Liu, Zhe [1]
Shen, Xianbao [1]
Mao, Jingyi [1]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Telecommun, Zhenjiang 212013, Jiangsu, Peoples R China
[2] Jiangsu Univ Sci & Technol, Sch Comp Sci, Zhenjiang 212013, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Bagging integration; C4.5; algorithm; information entropy; split information; DECISION TREE; CREDAL C4.5; ENSEMBLE; CLASSIFIER;
DOI
10.1109/ACCESS.2020.3032291
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The C4.5 algorithm has three shortcomings: the wide range of candidate segmentation threshold sequences for continuous attributes, the combined influence of different attributes and of local subsets under the same attribute, and inter-attribute redundancy. For continuous attributes, sampling and threshold supplementation are performed near the transition boundaries of the attribute intervals that correspond to adjacent, different categories, which narrows the range of candidate segmentation threshold sequences. The calculation of the C4.5 information gain is optimized by adding the standardized Euclidean distance of the attribute's global and local factors to represent the attribute weight. By averaging the Gini index of the other attributes and adding a correction factor, the influence of redundancy between attributes is greatly decreased. The overall average improvement of the base classifier and of the bagging integration classifier is 0.6% to 2.1% and 0.7% to 2.7%, respectively, which shows that this integration model can improve classification accuracy and validates its feasibility and reliability.
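As a rough illustration of the first idea in the abstract, the sketch below restricts candidate thresholds for a continuous attribute to the points where the class changes between adjacent sorted values, then scores a threshold with the standard C4.5 gain ratio (information gain divided by split information). This is not the authors' procedure: their sampling and threshold supplementation steps, the Euclidean-distance weighting, and the Gini-based correction factor are not reproduced, and the function names `boundary_thresholds` and `gain_ratio` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def boundary_thresholds(values, labels):
    """Midpoints between adjacent distinct values where the class set changes.

    Hypothetical helper: only class-transition boundaries are kept as
    candidate thresholds, instead of every midpoint between sorted values.
    """
    groups = []  # list of (distinct value, set of classes seen at that value)
    for v, y in sorted(zip(values, labels)):
        if groups and groups[-1][0] == v:
            groups[-1][1].add(y)
        else:
            groups.append((v, {y}))
    candidates = []
    for (v1, c1), (v2, c2) in zip(groups, groups[1:]):
        # A boundary exists if either side is mixed or the classes differ.
        if len(c1) > 1 or len(c2) > 1 or c1 != c2:
            candidates.append((v1 + v2) / 2)
    return candidates


def gain_ratio(values, labels, threshold):
    """Standard C4.5 gain ratio for the binary split `value <= threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    conditional = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    gain = entropy(labels) - conditional
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info if split_info > 0 else 0.0


values = [1, 2, 3, 4, 5, 6]
labels = ["a", "a", "a", "b", "b", "b"]
candidates = boundary_thresholds(values, labels)
best = max(candidates, key=lambda t: gain_ratio(values, labels, t))
```

In this toy data the class changes only once, so a single candidate threshold is evaluated rather than the five midpoints a plain scan over sorted values would consider.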
Pages: 206866-206875
Page count: 10
Related papers
50 in total
  • [1] Bagging, boosting, and C4.5
    Quinlan, JR
    PROCEEDINGS OF THE THIRTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE, VOLS 1 AND 2, 1996, : 725 - 730
  • [2] Improved use of continuous attributes in C4.5
    Quinlan, JR
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1996, 4 : 77 - 90
  • [3] Improved use of continuous attributes in C4.5
    Quinlan, J.R.
    Journal of Artificial Intelligence Research, 1996, 4 : 77 - 90
  • [4] Improved C4.5 Algorithm for the Analysis of Sales
    Cao, Rong
    Xu, Lizhen
    2009 SIXTH WEB INFORMATION SYSTEMS AND APPLICATIONS CONFERENCE, PROCEEDINGS, 2009, : 173 - 176
  • [5] A New Robust Classifier on Noise Domains: Bagging of Credal C4.5 Trees
    Abellan, Joaquin
    Castellano, Javier G.
    Mantas, Carlos J.
    COMPLEXITY, 2017,
  • [6] An Improved TANC Classification Algorithm Based on C4.5
    Zhao Xiao-qiang
    Yang Jia-min
    26TH CHINESE CONTROL AND DECISION CONFERENCE (2014 CCDC), 2014, : 4992 - 4996
  • [7] Optimization of PBFT Algorithm Based on Improved C4.5
    Zheng, Xiandong
    Feng, Wenlong
    Huang, Mengxing
    Feng, Siling
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2021, 2021
  • [8] Improved C4.5 Algorithm for Rule Based Classification
    Mazid, Mohammed M.
    Ali, A. B. M. Shawkat
    Tickle, Kevin S.
    PROCEEDINGS OF THE 9TH WSEAS INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, KNOWLEDGE ENGINEERING AND DATA BASES, 2010, : 296 - +
  • [9] A Comparative Analysis of Pruning Methods for C4.5 and Fuzzy C4.5
    Naseer, Tayyeba
    Asghar, Sohail
    Zhuang, Yan
    Fong, Simon
    ADVANCES IN DIGITAL TECHNOLOGIES, 2015, 275 : 304 - 312
  • [10] Fast C4.5
    He, Ping
    Chen, Ling
    Xu, Xiao-Hua
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 2841 - +