Review of Random Forest Classification Techniques to Resolve Data Imbalance

Cited by: 0
Authors
More, A. S. [1]
Rana, Dipti P. [1]
Affiliations
[1] Sardar Vallabhbhai Natl Inst Technol, Dept Comp Engn, Surat, India
Keywords
Random Forest Classification; Balanced Random Forest; Weighted Random Forest; Sampling; Dynamic Integration Technique;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the current era, real-world applications with imbalanced datasets are one of the foremost focal points of researchers' attention. Data generation has grown enormously, and so has the imbalance within datasets. Processing and extracting knowledge from huge amounts of imbalanced data is a challenge in terms of space and time requirements. Many real-world applications deal with data samples that are unequally divided among classes. Because of this division, a class becomes either a majority class or a minority class, the latter with a comparably small sample count. This outnumbering of samples in one class calls for dedicated handling of the minority class, with the aim of a remarkable reduction in error rate. Standard learning methods do not directly focus on such classes. Random Forest Classification (RFC) is an ensemble approach in which a number of classifiers work together to identify the class label of unlabeled instances. This approach has demonstrated high accuracy and superiority on imbalanced datasets, and it offers various techniques to resolve the class imbalance problem. This paper summarizes a literature survey, covering 2000 to 2016, of RFC-related techniques for resolving class imbalance. Specifically, Weighted Random Forest (WRF), Balanced Random Forest (BRF), sampling (Under Sampling (US) and Down Sampling (DS)), and cost-sensitive methods have been the most widely adopted to date. Given the limitations of this literature, researchers can focus on dynamic integration techniques to resolve class imbalance and to increase the robustness and versatility of classification.
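The abstract names BRF and WRF without describing their mechanics. As a minimal sketch, assuming the commonly described formulations (BRF draws an equal-sized bootstrap sample from every class for each tree; WRF gives the minority class a larger weight), the two ideas could look like the following. The function names `balanced_bootstrap` and `inverse_frequency_weights`, the 90/10 toy labels, and the inverse-frequency weighting formula are illustrative assumptions, not details taken from the surveyed paper.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Indices for one tree: the same number of draws, with replacement, per class.

    Hypothetical helper illustrating the BRF sampling step only; it does not
    grow any trees.
    """
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    # Every class contributes as many draws as the smallest class has samples.
    n_min = min(len(idx) for idx in by_class.values())
    sample = []
    for idx in by_class.values():
        sample.extend(rng.choices(idx, k=n_min))
    return sample


def inverse_frequency_weights(labels):
    """Class weights inversely proportional to class frequency (assumed WRF-style heuristic)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


# Toy imbalanced label vector: 90 majority (0) and 10 minority (1) samples.
labels = [0] * 90 + [1] * 10
rng = random.Random(0)

sample = balanced_bootstrap(labels, rng)
sample_counts = Counter(labels[i] for i in sample)
weights = inverse_frequency_weights(labels)
```

In this sketch each tree would see 10 majority and 10 minority draws, while the weighting variant leaves the data untouched and instead weights the minority class roughly nine times more heavily than the majority class.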
Pages: 72 - 78
Number of pages: 7
Related papers
50 in total
  • [1] Random Forest Applied to Mass Imbalance Classification in Wind Turbines
    da Silva, Eduardo G.
    da Silva, Emerson C.
    Franchi, Claiton M.
    Schaf, Frederico M.
    Pinheiro, Humberto
    Tello Gamarra, Daniel Fernando
    2023 15TH SEMINAR ON POWER ELECTRONICS AND CONTROL, SEPOC, 2023
  • [2] Random forest algorithm for classification of multiwavelength data
    Gao, Dan
    RESEARCH IN ASTRONOMY AND ASTROPHYSICS, 2009, 9 (02) : 220 - 226
  • [3] Random forest algorithm for classification of multiwavelength data
    Gao, Dan
    Zhang, Yan-Xia
    Zhao, Yong-Heng
    RESEARCH IN ASTRONOMY AND ASTROPHYSICS, 2009, 9 (02) : 220 - 226
  • [4] Random Forest Pruning Techniques: A Recent Review
    Manzali Y.
    Elfar M.
    Operations Research Forum, 4 (2)
  • [5] A Comprehensive Survey of Imbalance Correction Techniques for Hyperspectral Data Classification
    Paoletti, Mercedes E.
    Mogollon-Gutierrez, Oscar
    Moreno-Alvarez, Sergio
    Sancho, Jose Carlos
    Haut, Juan M.
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 5297 - 5314
  • [6] Random Forest Algorithm for the Classification of Neuroimaging Data in Alzheimer's Disease: A Systematic Review
    Sarica, Alessia
    Cerasa, Antonio
    Quattrone, Aldo
    FRONTIERS IN AGING NEUROSCIENCE, 2017, 9
  • [7] Random forest for gene selection and microarray data classification
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    BIOINFORMATION, 2011, 7 (03) : 142 - 146
  • [8] Random Forest for Gene Selection and Microarray Data Classification
    Moorthy, Kohbalan
    Mohamad, Mohd Saberi
    KNOWLEDGE TECHNOLOGY, 2012, 295 : 174 - 183
  • [9] Investigation of the random forest framework for classification of hyperspectral data
    Ham, J
    Chen, YC
    Crawford, MM
    Ghosh, J
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2005, 43 (03) : 492 - 501
  • [10] A review on longitudinal data analysis with random forest
    Hu, Jianchang
    Szymczak, Silke
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (02)