Product failure prediction with missing data

Cited by: 19
Authors
Kang, Seokho [1]
Kim, Eunji [2, 3]
Shim, Jaewoong [2, 3]
Chang, Wonsang [4]
Cho, Sungzoon [2, 3]
Affiliations
[1] Sungkyunkwan Univ, Dept Syst Management Engn, Suwon, South Korea
[2] Seoul Natl Univ, Dept Ind Engn, Seoul, South Korea
[3] Seoul Natl Univ, Inst Ind Syst Innovat, Seoul, South Korea
[4] Samsung Elect Co Ltd, Global Technol Ctr, Suwon, South Korea
Funding
National Research Foundation, Singapore
Keywords
data mining; predictive modelling; failure prediction; production data; missing value; NEURAL-NETWORKS; FAULT-DETECTION; DATA IMPUTATION; ROC CURVE; VALUES; QUALITY; AREA; MAP;
DOI
10.1080/00207543.2017.1407883
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
In production data, missing values commonly appear for several reasons including changes in measurement and inspection items, sampling inspections, and unexpected process events. When applied to product failure prediction, the incompleteness of data should be properly addressed to avoid performance degradation in prediction models. Well-known approaches for missing data treatment, such as elimination and imputation, would not perform well under usual scenarios in production data, including high missing rate, systematic missing and class imbalance. To address these limitations, here we present a method for predictive modelling with missing data by considering the characteristics of production data. It builds multiple prediction models on different complete data subsets derived from the original data-set, each of which has different coverage of instances and input variables. These models are selectively used to make predictions for new instances with missing values. We demonstrate the effectiveness of the proposed method through a case study using actual data-sets from a home appliance manufacturer.
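The abstract's core idea (several models, each trained on a complete data subset with its own coverage of instances and variables, then chosen per instance) can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: the way subsets are derived (one per observed-variable pattern in the training data), the selection rule (largest usable variable set, ties broken by subset size), the nearest-centroid stand-in classifier, and the variable names `temp`, `pressure` and `torque` are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def observed(row):
    """Return the set of variable names that are not missing (None) in a row."""
    return frozenset(name for name, value in row.items() if value is not None)


def fit_centroids(rows, labels, variables):
    """Stand-in model (assumption): per-class mean vector over `variables`."""
    sums = defaultdict(lambda: [0.0] * len(variables))
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        counts[label] += 1
        for i, name in enumerate(variables):
            sums[label][i] += row[name]
    return {label: [s / counts[label] for s in sums[label]] for label in counts}


def fit_subset_models(rows, labels):
    """Train one model per observed-variable pattern found in the training data.

    Each model uses every training row that is complete on that pattern,
    so the subsets differ in both instance and variable coverage.
    """
    models = {}
    for pattern in {observed(row) for row in rows}:
        if not pattern:
            continue
        variables = sorted(pattern)
        index = [i for i, row in enumerate(rows) if pattern <= observed(row)]
        subset_labels = [labels[i] for i in index]
        if len(set(subset_labels)) < 2:
            continue  # skip subsets that contain only one class
        subset_rows = [rows[i] for i in index]
        centroids = fit_centroids(subset_rows, subset_labels, variables)
        models[pattern] = (variables, centroids, len(index))
    return models


def predict(models, row):
    """Use the model whose variables are all observed in `row` (assumed rule)."""
    available = observed(row)
    usable = [pattern for pattern in models if pattern <= available]
    if not usable:
        return None  # no trained model fits this missingness pattern
    best = max(usable, key=lambda p: (len(p), models[p][2]))
    variables, centroids, _ = models[best]
    x = [row[name] for name in variables]
    return min(centroids, key=lambda label: dist(x, centroids[label]))


# Hypothetical toy data: None marks a missing measurement, label 1 is a failure.
train_rows = [
    {"temp": 1.0, "pressure": 1.0, "torque": None},
    {"temp": 1.2, "pressure": 0.8, "torque": None},
    {"temp": 3.0, "pressure": 3.1, "torque": None},
    {"temp": 1.1, "pressure": None, "torque": 0.5},
    {"temp": 2.9, "pressure": None, "torque": 2.5},
    {"temp": 3.2, "pressure": 2.9, "torque": 2.4},
]
train_labels = [0, 0, 1, 0, 1, 1]

models = fit_subset_models(train_rows, train_labels)
new_instance = {"temp": 3.1, "pressure": None, "torque": 2.6}
prediction = predict(models, new_instance)
```

In this sketch no value is ever imputed and no training row is discarded outright: a row with a missing `torque` still contributes to the model built on `temp` and `pressure`. Whether the published method selects or combines models in this way is not stated in the abstract.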
Pages: 4849-4859
Page count: 11