Feature elimination approach based on random forest for cancer diagnosis

Cited by: 0
Authors
Nguyen, Ha-Nam [1]
Vu, Trung-Nghia [1]
Ohn, Syng-Yup [1]
Park, Young-Mee [2]
Han, Mi Young [3]
Kim, Chul Woo [4]
Affiliations
[1] Hankuk Aviat Univ, Dept Comp & Informat Engn, Seoul, South Korea
[2] Roswell Pk Cancer Inst, Dept Cell Stress Biol, Buffalo, NY USA
[3] Bioinfra Inc, Seoul, South Korea
[4] Seoul Natl Univ Coll Med, Tumor Immun Med Res Ctr, Dept Pathol, Seoul, South Korea
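Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract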
The performance of a learning task is very sensitive to the characteristics of the training data. There are several ways to improve learning performance, including standardization, normalization, signal enhancement, and linear or non-linear space embedding methods. Among these, determining the relevant and informative features is one of the key steps in the data analysis process: it helps to improve performance, reduce the generation of data, and understand the characteristics of the data. Researchers have developed various methods to extract the set of relevant features, but no single method prevails. Random Forest, an ensemble classifier built from a set of tree classifiers, yields good classification performance. Taking advantage of Random Forest and using the wrapper approach first introduced by Kohavi et al., we propose a new algorithm to find the optimal subset of features. Random Forest is used to obtain feature ranking values, and these values are used to decide which features are eliminated in each iteration of the algorithm. We conducted experiments with two public datasets: colon cancer and leukemia. The experimental results on real-world data showed that the proposed method achieves a higher prediction rate than a baseline method for certain datasets and shows comparable, and sometimes better, performance than widely used feature selection methods.
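The abstract does not give the elimination rule, the stopping criterion, or the forest settings, so the following is only a minimal sketch of the general idea (rank features with a Random Forest, drop the lowest-ranked ones, and score each candidate subset in wrapper fashion). It is not the authors' algorithm. It assumes scikit-learn is available and uses its impurity-based `feature_importances_` as a stand-in for the "feature ranking values". The function name `rf_feature_elimination`, the parameters `drop_fraction` and `min_features`, and the synthetic dataset are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def rf_feature_elimination(X, y, drop_fraction=0.2, min_features=2, seed=0):
    """Backward elimination guided by Random Forest importances (illustrative).

    Returns (indices of the best-scoring feature subset, its CV accuracy).
    """
    remaining = np.arange(X.shape[1])
    best_subset, best_score = remaining.copy(), -np.inf
    while True:
        forest = RandomForestClassifier(n_estimators=50, random_state=seed)
        # Wrapper step: score the current subset with cross-validation.
        score = cross_val_score(forest, X[:, remaining], y, cv=3).mean()
        if score >= best_score:  # on ties, prefer the smaller subset
            best_subset, best_score = remaining.copy(), score
        if len(remaining) <= min_features:
            break
        # Ranking step: fit on the current subset and read the importances.
        forest.fit(X[:, remaining], y)
        order = np.argsort(forest.feature_importances_)  # least important first
        # Assumed rule: drop a fixed fraction, at least one feature per pass.
        n_drop = max(1, int(drop_fraction * len(remaining)))
        n_drop = min(n_drop, len(remaining) - min_features)
        remaining = np.sort(remaining[order[n_drop:]])
    return best_subset, best_score


# Synthetic stand-in data; the paper used colon cancer and leukemia datasets.
X, y = make_classification(n_samples=120, n_features=30, n_informative=4,
                           n_redundant=0, random_state=0)
subset, score = rf_feature_elimination(X, y)
print(len(subset), round(score, 3))
```

Because the subset is chosen with the same cross-validation scores that are reported, the returned accuracy is optimistically biased; a held-out test set or nested cross-validation would be needed for a fair estimate.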
Pages: 532 / +
Page count: 3