On the Data Quality and Imbalance in Machine Learning-based Design and Manufacturing-A Systematic Review

Cited by: 0
Authors
Xie, Jiarui [1]
Sun, Lijun [1,2]
Zhao, Yaoyao Fiona [1]
Affiliations
[1] McGill Univ, Dept Mech Engn, Addit Design & Mfg Lab, Montreal, PQ H3A 0G4, Canada
[2] McGill Univ, Dept Civil Engn, Smart Transportat Lab, Montreal, PQ H3A 0G4, Canada
Source
ENGINEERING | 2025 / Vol. 45
Keywords
Machine learning; Design and manufacturing; Data quality; Data augmentation; Active learning; CONVOLUTIONAL NEURAL-NETWORK; DATA GOVERNANCE; DEEP; FRAMEWORK; VISION; METHODOLOGY; INSPECTION; SELECTION; MODEL;
DOI
10.1016/j.eng.2024.04.024
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Machine learning (ML) has recently enabled many modeling tasks in design, manufacturing, and condition monitoring due to its unparalleled learning ability using existing data. Data have become the limiting factor when implementing ML in industry. However, there is no systematic investigation on how data quality can be assessed and improved for ML-based design and manufacturing. The aim of this survey is to uncover the data challenges in this domain and review the techniques used to resolve them. To establish the background for the subsequent analysis, crucial data terminologies in ML-based modeling are reviewed and categorized into data acquisition, management, analysis, and utilization. Thereafter, the concepts and frameworks established to evaluate data quality and imbalance, including data quality assessment, data readiness, information quality, data biases, fairness, and diversity, are further investigated. The root causes and types of data challenges, including human factors, complex systems, complicated relationships, lack of data quality, data heterogeneity, data imbalance, and data scarcity, are identified and summarized. Methods to improve data quality and mitigate data imbalance and their applications in this domain are reviewed. This literature review focuses on two promising methods: data augmentation and active learning. The strengths, limitations, and applicability of the surveyed techniques are illustrated. The trends of data augmentation and active learning are discussed with respect to their applications, data types, and approaches. Based on this discussion, future directions for data quality improvement and data imbalance mitigation in this domain are identified. (c) 2024 THE AUTHORS. Published by Elsevier LTD on behalf of Chinese Academy of Engineering and Higher Education Press Limited Company. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 105-131
Number of pages: 27