Review of Random Forest Classification Techniques to Resolve Data Imbalance

Authors
More, A. S. [1 ]
Rana, Dipti P. [1 ]
Affiliations
[1] Sardar Vallabhbhai Natl Inst Technol, Dept Comp Engn, Surat, India
Keywords
Random Forest Classification; Balanced Random Forest; Weighted Random Forest; Sampling; Dynamic Integration Technique;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Today, real-world applications with imbalanced datasets are one of the foremost focal points of researchers' attention. Data generation is increasing enormously, and so is the imbalance within datasets. Processing and extracting knowledge from huge amounts of imbalanced data is a challenge in terms of space and time requirements. Many real-world applications deal with data samples that are divided unequally among a number of classes. As a result of this division, one class becomes the majority and another the minority, with a comparatively small sample count. This outnumbering of samples in one class calls for dedicated handling of the minority class, with the aim of a marked reduction in the error rate. Standard learning methods do not directly focus on this type of class. Random Forest Classification (RFC) is an ensemble approach in which a number of classifiers work together to identify the class label of unlabeled instances. This approach has shown high accuracy and superiority on imbalanced datasets, and it offers various techniques to resolve the class imbalance problem. This paper summarizes a literature survey, covering 2000 to 2016, of techniques related to RFC for resolving class imbalance. Specifically, Weighted Random Forest (WRF), Balanced Random Forest (BRF), sampling (Under Sampling (US) and Down Sampling (DS)), and cost-sensitive methods have been adopted most to date. The limitation found in this literature suggests that researchers can focus on dynamic integration techniques to resolve class imbalance and to increase the robustness and versatility of classification.
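As a rough illustration of the ideas the abstract names, the sketch below shows three building blocks in plain Python: inverse-frequency class weights (the kind of weighting a Weighted Random Forest relies on), under-sampling of the majority class, and a class-balanced bootstrap draw of the sort a Balanced Random Forest might use for each tree. It is a minimal sketch under stated assumptions, not the method of the surveyed papers: the function names, the weighting formula n_samples / (n_classes * class_count), and the 90/10 toy labels are all illustrative choices, and no trees are actually trained.

```python
import random
from collections import Counter


def indices_by_class(labels):
    """Group sample indices by class label (hypothetical helper)."""
    groups = {}
    for i, c in enumerate(labels):
        groups.setdefault(c, []).append(i)
    return groups


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def undersample(labels, rng):
    """Keep only minority-size many samples of every class (no replacement)."""
    groups = indices_by_class(labels)
    n_min = min(len(idx) for idx in groups.values())
    kept = []
    for idx in groups.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


def balanced_bootstrap(labels, rng):
    """Draw minority-size many samples per class, with replacement.

    One such draw would feed one tree in a balanced forest.
    """
    groups = indices_by_class(labels)
    n_min = min(len(idx) for idx in groups.values())
    sample = []
    for idx in groups.values():
        sample.extend(rng.choices(idx, k=n_min))
    return sample


# Toy imbalanced label vector (assumed): 90 majority vs 10 minority samples.
labels = [0] * 90 + [1] * 10
rng = random.Random(0)

weights = balanced_class_weights(labels)
kept = undersample(labels, rng)
boot = balanced_bootstrap(labels, rng)

print(weights)
print(Counter(labels[i] for i in kept))
print(Counter(labels[i] for i in boot))
```

With these toy labels, the minority class receives a weight nine times that of the majority class, and both sampling helpers return index sets in which the two classes are equally represented. The weighting approach keeps all data but changes the cost of errors, while the sampling approaches change what each classifier sees.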
Pages: 72-78
Page count: 7