Cost-Sensitive Learning and Threshold-Moving Approach to Improve Industrial Lots Release Process on Imbalanced Datasets

Cited by: 1
Authors
Lobo, Armindo [1]
Oliveira, Pedro [1]
Sampaio, Paulo [1]
Novais, Paulo [1]
Affiliations
[1] Univ Minho, ALGORITMI Ctr, Braga, Portugal
Keywords
Cost-sensitive learning; Imbalanced data; Machine learning; Threshold-moving; Lots release;
DOI
10.1007/978-3-031-20859-1_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With Industry 4.0, companies must manage massive and generally imbalanced datasets. In an automotive company, the lots release decision process must cope with this problem by combining data from different sources to determine whether a selected group of products can be released to customers. This work focuses on that process and aims to classify the occurrence of customer complaints by designing, tuning, and evaluating five ML algorithms, namely XGBoost (XGB), LightGBM (LGBM), CatBoost (CatB), Random Forest (RF), and a Decision Tree (DT), on an imbalanced dataset of automatic production tests. We used a non-sampling approach to deal with the class imbalance, analyzing two different methods: cost-sensitive learning and threshold-moving. Both methods improved the results of the boosting algorithms, whereas RF improved only with threshold-moving. Considering both approaches, the best overall results were achieved by the threshold-moving method, where RF obtained the best outcome with an F1-score of 76.2%.
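The two non-sampling methods named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline: the helper names (`f1_score`, `best_threshold`, `balanced_class_weights`), the toy labels and scores, and the choice of F1 as the tuning criterion are all assumptions made for illustration. Threshold-moving is shown as a sweep over cut-offs applied to predicted scores, and cost-sensitive learning is reduced to computing class weights inversely proportional to class frequency, which a classifier could then use to penalize minority-class errors more heavily.

```python
from typing import Dict, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int], scores: Sequence[float], steps: int = 99
) -> Tuple[float, float]:
    """Threshold-moving: sweep cut-offs in (0, 1), keep the first with the highest F1."""
    best_t, best_f1 = 0.5, -1.0
    for i in range(1, steps + 1):
        t = i / (steps + 1)
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def balanced_class_weights(y_true: Sequence[int]) -> Dict[int, float]:
    """Cost-sensitive weights: n_samples / (n_classes * class_count)."""
    counts: Dict[int, int] = {}
    for y in y_true:
        counts[y] = counts.get(y, 0) + 1
    n = len(y_true)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


# Hypothetical toy data: 8 negatives (no complaint), 2 positives (complaint).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.25, 0.30, 0.45, 0.35, 0.60]

default_f1 = f1_score(y_true, [1 if s >= 0.5 else 0 for s in scores])
threshold, tuned_f1 = best_threshold(y_true, scores)
weights = balanced_class_weights(y_true)

print(f"F1 at 0.5: {default_f1:.3f}; F1 at {threshold:.2f}: {tuned_f1:.3f}")
print(f"class weights: {weights}")
```

On this toy data, lowering the cut-off below the default 0.5 recovers the second positive at the price of one false positive, which raises F1. In practice the threshold should be chosen on a validation split rather than on the data used for the final evaluation.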
Pages: 280-290
Page count: 11
Related papers
10 items in total
  • [1] A Cost-Sensitive Based Approach for Improving Associative Classification on Imbalanced Datasets
    Waiyamai, Kitsana
    Suwannarattaphoom, Phoonperm
    [J]. MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, MLDM 2014, 2014, 8556 : 31 - 42
  • [2] Cost-Sensitive Learning for Imbalanced Bad Debt Datasets in Healthcare Industry
    Shi, Donghui
    Guan, Jian
    Zurada, Jozef
    [J]. 2015 ASIA-PACIFIC CONFERENCE ON COMPUTER-AIDED SYSTEM ENGINEERING - APCASE 2015, 2015, : 30 - 35
  • [3] Cost-Sensitive Learning from Imbalanced Datasets for Retail Credit Risk Assessment
    Oreski, Stjepan
    Oreski, Goran
    [J]. TEM JOURNAL-TECHNOLOGY EDUCATION MANAGEMENT INFORMATICS, 2018, 7 (01): : 59 - 73
  • [4] Cost-Sensitive Neural Network with ROC-Based Moving Threshold for Imbalanced Classification
    Krawczyk, Bartosz
    Wozniak, Michal
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2015, 2015, 9375 : 45 - 52
  • [5] MULTICLASS CLASSIFICATION WITH IMBALANCED DATASETS FOR CAR OWNERSHIP DEMAND MODEL - COST-SENSITIVE LEARNING
    Kaewwichian, Patiphan
    [J]. PROMET-TRAFFIC & TRANSPORTATION, 2021, 33 (03): : 361 - 371
  • [6] Cost-Sensitive Approach to Improve the HTTP Traffic Detection Performance on Imbalanced Data
    Li, Wenmin
    Sun, Sanqi
    Zhang, Shuo
    Zhang, Hua
    Shi, Yijie
    [J]. SECURITY AND COMMUNICATION NETWORKS, 2021, 2021
  • [7] Novel Cost-Sensitive Approach to Improve the Multilayer Perceptron Performance on Imbalanced Data
    Castro, Cristiano L.
    Braga, Antonio P.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2013, 24 (06) : 888 - 899
  • [8] Using Cost-Sensitive Learning and Feature Selection Algorithms to Improve the Performance of Imbalanced Classification
    Feng, Fang
    Li, Kuan-Ching
    Shen, Jun
    Zhou, Qingguo
    Yang, Xuhui
    [J]. IEEE ACCESS, 2020, 8 : 69979 - 69996
  • [9] A batch-adapted cost-sensitive contrastive feature learning network for industrial diagnosis with extremely imbalanced data
    Liu, Yijin
    Li, Zipeng
    Chen, Jinglong
    Zhang, Tianci
    Pan, Tongyang
    He, Shuilong
    [J]. Measurement: Journal of the International Measurement Confederation, 2025, 244
  • [10] Multiscale cost-sensitive learning-based assembly quality prediction approach under imbalanced data
    Wang, Tianyue
    Hu, Bingtao
    Feng, Yixiong
    Gong, Hao
    Zhong, Ruirui
    Yang, Chen
    Tan, Jianrong
    [J]. Advanced Engineering Informatics, 2024, 62