Review of Random Forest Classification Techniques to Resolve Data Imbalance

Cited by: 0
Authors
More, A. S. [1]
Rana, Dipti P. [1]
Affiliation
[1] Sardar Vallabhbhai Natl Inst Technol, Dept Comp Engn, Surat, India
Keywords
Random Forest Classification; Balanced Random Forest; Weighted Random Forest; Sampling; Dynamic Integration Technique;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the current age, the wide range of real-world applications with imbalanced datasets is one of the foremost focal points of researchers' attention. Data generation is growing enormously, and so is imbalance within datasets. Processing and extracting knowledge from huge amounts of imbalanced data is a challenge in terms of space and time requirements. Many real-world applications deal with data samples that are divided unequally among a number of classes. As a result of this division, a class becomes either a majority class or a minority class with a comparatively small sample count. When one class outnumbers another in this way, the focus shifts to handling the minority class and to achieving a marked reduction in error rate. Standard learning methods do not directly focus on such classes. Random Forest Classification (RFC) is an ensemble approach in which a number of classifiers work together to identify the class label of unlabeled instances. This approach has shown high accuracy and superiority on imbalanced datasets, and it offers various techniques to resolve the class imbalance problem. This paper summarizes a literature survey, covering 2000 to 2016, of RFC-related techniques for resolving class imbalance. Specifically, Weighted Random Forest (WRF), Balanced Random Forest (BRF), Sampling (Under Sampling (US) and Down Sampling (DS)), and Cost Sensitive Methods have been adopted most to date. Given the limitations of this extensive literature, researchers can focus on dynamic integration techniques to resolve class imbalance and increase the robustness and versatility of classification.
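As a rough illustration of two ideas named in the abstract, the sketch below shows a balanced per-tree bootstrap in the spirit of BRF and class weights in the spirit of WRF. It is not taken from the paper: the function names `balanced_bootstrap` and `inverse_frequency_weights` are hypothetical, the inverse-frequency weighting is an assumed (though common) heuristic, and the 95/5 label split is made-up toy data.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return sample indices for one tree: draw, with replacement,
    as many items from every class as the smallest class contains."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    sample = []
    for idx in by_class.values():
        sample.extend(rng.choices(idx, k=n_min))
    return sample


def inverse_frequency_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count),
    so rarer classes receive larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


# Toy imbalanced labels (hypothetical): 95 majority, 5 minority.
labels = [0] * 95 + [1] * 5
rng = random.Random(0)

sample = balanced_bootstrap(labels, rng)
sample_counts = Counter(labels[i] for i in sample)
weights = inverse_frequency_weights(labels)
```

In this sketch each tree would be trained on an equal number of majority and minority samples, while the weights dictionary could instead be passed to a learner that supports per-class weights.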
Pages: 72-78
Number of pages: 7