Efficient random subspace decision forests with a simple probability dimensionality setting scheme

Cited by: 3
Authors
Wang, Quan [1]
Wang, Fei [1]
Li, Zhongheng [2]
Jiang, Peilin [3]
Ren, Fuji [4]
Nie, Feiping [5, 6]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Shaanxi, Peoples R China
[2] Beijing Inst Spacecraft Syst Engn, Beijing Key Lab Intelligent Space Robot Syst Techn, Beijing 100094, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Shaanxi, Peoples R China
[4] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
[5] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[6] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Shaanxi, Peoples R China
Keywords
Decision forests; Ensemble classifiers; Random subspace method; CLASSIFIERS; ALGORITHM; ENSEMBLE;
DOI
10.1016/j.ins.2023.118993
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Random subspace decision forests are widely used machine learning methods across many application domains. How to set the random subspace dimensionality d(s) in a decision forest is an important question, because it affects both classification quality and efficiency, especially for high-dimensional data. To obtain effective and efficient decision forests that suit a variety of classification problems, this paper proposes a novel framework named the Efficient Random Subspace decision forest (ERS). A Half-Range Discrete Uniform distribution-based Varied Dimensionality setting (HRDUVD) method is provided for determining the random subspace dimensionality, and the ERS is built on the HRDUVD method. In more detail, the number of randomly selected features for each tree in a random subspace decision forest is drawn from a simple discrete uniform distribution over a specific range. The HRDUVD method removes the need to decide which d(s) value to preset for each dataset, while still achieving adequate classification performance with a relatively short running time. Setting d(s) with the discrete uniform distribution is therefore a useful strategy for the proposed ERS.
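The per-tree dimensionality idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_subspaces`, its parameters, and in particular the bounds of the "half range" (taken here as 1 to ceil(D/2) for D features) are assumptions, since the abstract does not state the exact range.

```python
import random


def sample_subspaces(n_features, n_trees, low=None, high=None, seed=0):
    """Draw one random feature subset per tree, each with its own size d_s.

    Hypothetical helper for illustration only. The default range
    1 .. ceil(n_features / 2) is an assumption; the abstract only says
    "a specific range".
    """
    if low is None:
        low = 1
    if high is None:
        high = max(1, (n_features + 1) // 2)  # ceil(n_features / 2)
    if not (1 <= low <= high <= n_features):
        raise ValueError("need 1 <= low <= high <= n_features")

    rng = random.Random(seed)
    subspaces = []
    for _ in range(n_trees):
        # Discrete uniform draw of the subspace dimensionality (inclusive ends).
        d_s = rng.randint(low, high)
        # Choose d_s distinct feature indices for this tree.
        features = sorted(rng.sample(range(n_features), d_s))
        subspaces.append(features)
    return subspaces


if __name__ == "__main__":
    # Each tree would then be trained on only its own feature subset.
    for tree_id, features in enumerate(sample_subspaces(10, 5)):
        print(tree_id, len(features), features)
```

Because each tree draws its own d(s), no single value has to be tuned per dataset; any decision tree learner could then be fitted on the columns listed in each subset.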
Pages: 16