Forecasting the yield of wafer by using improved genetic algorithm, high dimensional alternating feature selection and SVM with uneven distribution and high-dimensional data

Cited by: 3

Authors:

Xu Q. [1,2]

Xu C. [3]

Wang J. [1,4]

Affiliations:

[1] Institute of Artificial Intelligence, Donghua University, Shanghai

[2] College of Mechanical Engineering, Donghua University, Shanghai

[3] School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai

[4] Shanghai Engineering Research Center of Industrial Big Data and Intelligent System, Shanghai

Source:

Autonomous Intelligent Systems | 2022, Vol. 2, No. 1

Keywords:

High dimension; Imbalance; Prediction; Wafer yield

DOI:

10.1007/s43684-022-00041-3

Abstract:

Wafer yield prediction, the basis of quality control, aims to predict quality indices of the wafer manufacturing process. In recent years, data-driven machine learning methods have received considerable attention for predicting quality indices because of their accuracy, robustness, and convenience. However, existing studies mainly focus on the model level to improve the accuracy of yield prediction and do not consider the impact of data characteristics on yield prediction. To address this issue, a novel wafer yield prediction method is proposed. An improved genetic algorithm (IGA) serves as an under-sampling method to handle the uneven distribution of the data, which comprises two problems: data overlap between finished and defective products, caused by the similarity of their manufacturing processes, and data imbalance, caused by too few defective samples. In addition, a high-dimensional alternating feature selection method (HAFS) is used to select the key influencing processes, that is, the key parameters, to avoid the overfitting caused by a large number of input parameters. Finally, a support vector machine (SVM) is used to predict the yield. Experiments are conducted on a public wafer yield prediction dataset collected from an actual wafer manufacturing system. IGA-HAFS-SVM achieves state-of-the-art results on this dataset, which confirms its effectiveness: compared with the conventional method, it improves the AUC score, G-Mean, and F1-score by 21.6%, 34.6%, and 0.6%, respectively. Moreover, the experimental results demonstrate the influence of data characteristics on wafer yield prediction. © 2022, The Author(s).
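
The abstract reports AUC, G-Mean, and F1-score, three metrics commonly used when one class is rare. As a rough illustration of what they measure, the following pure-Python sketch computes them under the standard textbook definitions. It is not taken from the paper: the function names, the toy labels, and the choice of the defective class as the positive class (label 1) are assumptions made for this example.

```python
import math


def g_mean_and_f1(y_true, y_pred, positive=1):
    """Return (G-Mean, F1) for binary labels.

    Assumption: `positive` marks the minority (defective) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    recall = tp / (tp + fn) if (tp + fn) else 0.0       # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # G-Mean: geometric mean of the per-class recalls.
    g_mean = math.sqrt(recall * specificity)
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return g_mean, f1


def auc(y_true, scores, positive=1):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 2 defective wafers among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.8]

g_mean, f1 = g_mean_and_f1(y_true, y_pred)
print(f"G-Mean={g_mean:.4f}  F1={f1:.4f}  AUC={auc(y_true, scores):.4f}")
```

On this toy data plain accuracy would be 80% even though only one of the two defective wafers is found. G-Mean and F1 both penalize the missed minority sample, which is why such metrics tend to be preferred for imbalanced yield data.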
