A forest-based algorithm for selecting informative variables using Variable Depth Distribution

Cited by: 4
Authors
Voronov, Sergii [1]
Jung, Daniel [1]
Frisk, Erik [1]
Affiliations
[1] Linkoping Univ, Dept Elect Engn, S-58183 Linkoping, Sweden
Keywords
Variable selection; Random Survival Forest; Random Forest; Automotive; Misfire detection; Survival
DOI
10.1016/j.engappai.2020.104073
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Predictive maintenance of technical systems and their components is a promising approach to optimize system usage and reduce downtime. Various sensor data are logged during system operation for different purposes, but some of these data are not directly related to the degradation of a specific component. Variable selection algorithms are therefore necessary to reduce model complexity and improve the interpretability of diagnostic and prognostic algorithms. This paper presents a forest-based variable selection algorithm that measures the importance of a variable by analyzing its distribution in the decision tree structure, called the Variable Depth Distribution. The proposed algorithm is developed for datasets with correlated variables, which pose problems for existing forest-based variable selection methods. It is evaluated and analyzed using three case studies: survival analysis of lead-acid batteries in heavy-duty vehicles, engine misfire detection, and a simulated prognostics dataset. The results show the usefulness of the proposed algorithm relative to existing forest-based methods and its ability to identify important variables in different applications. As an example, the battery prognostics case study shows that similar predictive performance is achieved when only 17% of the variables are used compared to all measured signals.
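To make the idea of a depth distribution concrete, the following is a minimal sketch that tabulates, for each variable, the depths at which it is used for a split across a forest. It is not the authors' algorithm: the abstract does not say how the distribution is turned into an importance score, so the sketch stops at the tabulation. The toy trees, the variable names (`voltage`, `temperature`, `mileage`), and the helper functions are all hypothetical.

```python
from collections import Counter, defaultdict

# A tree is either None (a leaf) or a tuple
# (split_variable, left_subtree, right_subtree).
# These toy trees are hand-written for illustration only; they are
# assumptions and do not come from the paper.
toy_forest = [
    ("voltage", ("temperature", None, None), ("mileage", None, None)),
    ("voltage", ("mileage", ("temperature", None, None), None), None),
    ("mileage", ("voltage", None, None), None),
]


def split_depths(tree, depth=0):
    """Yield (variable, depth) for every split node; the root has depth 0."""
    if tree is None:
        return
    variable, left, right = tree
    yield variable, depth
    yield from split_depths(left, depth + 1)
    yield from split_depths(right, depth + 1)


def depth_distribution(forest):
    """Map each variable to {depth: number of splits at that depth}."""
    dist = defaultdict(Counter)
    for tree in forest:
        for variable, depth in split_depths(tree):
            dist[variable][depth] += 1
    return {variable: dict(counts) for variable, counts in dist.items()}


distribution = depth_distribution(toy_forest)
for variable, counts in sorted(distribution.items()):
    print(variable, sorted(counts.items()))
```

In this toy forest, `voltage` splits mostly at the root while `temperature` only appears deeper in the trees. Intuitively, a variable that tends to split near the root is a candidate for being informative, but how the paper scores or compares these distributions is not stated in the abstract.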
Pages: 11
Related papers
50 in total
  • [1] Selecting informative genes using a multiobjective evolutionary algorithm
    Liu, J
    Iba, H
    CEC'02: PROCEEDINGS OF THE 2002 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2002, : 297 - 302
  • [2] A strategy that iteratively retains informative variables for selecting optimal variable subset in multivariate calibration
    Yun, Yong-Huan
    Wang, Wei-Ting
    Tan, Min-Li
    Liang, Yi-Zeng
    Li, Hong-Dong
    Cao, Dong-Sheng
    Lu, Hong-Mei
    Xu, Qing-Song
    ANALYTICA CHIMICA ACTA, 2014, 807 : 36 - 43
  • [3] Causal inference in the presence of missing data using a random forest-based matching algorithm
    Hillis, Tristan
    Guarcello, Maureen A.
    Levine, Richard A.
    Fan, Juanjuan
    STAT, 2021, 10 (01)
  • [4] Random forest-based classsification and analysis of hemiplegia gait using low-cost depth cameras
    Luo, Guoliang
    Zhu, Yean
    Wang, Rui
    Tong, Yang
    Lu, Wei
    Wang, Haolun
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2020, 58 (02) : 373 - 382
  • [5] Selecting instrumental variables based on simulated annealing algorithm
    Hu, Yi
    Wang, Mei-Jin
    Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice, 2014, 34 (04) : 892 - 898
  • [6] Additive Manufacturing of Prostheses Using Forest-Based Composites
    Stenvall, Erik
    Flodberg, Goran
    Pettersson, Henrik
    Hellberg, Kennet
    Hermansson, Liselotte
    Wallin, Martin
    Yang, Li
    BIOENGINEERING-BASEL, 2020, 7 (03) : 1 - 18
  • [7] Random Forest-based Algorithm for Sleep Spindle Detection in Infant EEG
    Wei, Lan
    Ventura, Soraia
    Lowery, Madeleine
    Ryan, Mary Anne
    Mathieson, Sean
    Boylan, Geraldine B.
    Mooney, Catherine
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 58 - 61
  • [8] EVALUATION OF RANDOM FOREST-BASED ANALYSIS FOR THE GYPSUM DISTRIBUTION IN THE ATACAMA DESERT
    Hoffmeister, D.
    Herbrecht, M.
    Kramm, T.
    Schulte, P.
    2020 IEEE LATIN AMERICAN GRSS & ISPRS REMOTE SENSING CONFERENCE (LAGIRS), 2020, : 44 - 47
  • [9] A random forest-based approach for fault location detection in distribution systems
    Okumus, Hatice
    Nuroglu, Fatih M.
    ELECTRICAL ENGINEERING, 2021, 103 (01) : 257 - 264