Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing

Cited by: 15
Authors
Tripathi, Shailesh [1]
Muhr, David [1]
Brunner, Manuel [1]
Jodlbauer, Herbert [1]
Dehmer, Matthias [2, 3, 4, 5]
Emmert-Streib, Frank [6, 7]
Affiliations
[1] Univ Appl Sci Upper Austria, Prod & Operat Management, Linz, Austria
[2] Swiss Distance Univ Appl Sci, Dept Comp Sci, Brig, Switzerland
[3] Xian Technol Univ, Sch Sci, Xian, Peoples R China
[4] UMIT The Hlth & Life Sci Univ, Dept Biomed Comp Sci & Mechatron, Hall In Tirol, Austria
[5] Nankai Univ, Coll Artificial Intelligence, Tianjin, Peoples R China
[6] Tampere Univ, Fac Informat Technol & Commun Sci, Predict Soc & Data Analyt Lab, Tampere, Finland
[7] Tampere Univ, Inst Biosci & Med Technol, Tampere, Finland
Funding
Austrian Science Fund;
Keywords
machine learning; robustness; Industry 4.0; smart manufacturing; industrial production; CRISP-DM; INTELLIGENT FAULT-DIAGNOSIS; SUPPLY CHAIN MANAGEMENT; FEATURE-SELECTION; MULTIPLE IMPUTATION; DATA QUALITY; REGRESSION; SHRINKAGE; CHALLENGES; WRAPPERS; INTERNET;
DOI
10.3389/frai.2021.576892
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model development-related issues. These issues need to be carefully addressed by allowing a flexible, customized, and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases to adequately address data- and model-related issues for achieving robustness. Furthermore, it also emphasizes the need for a detailed business understanding and the interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework provides an enhancement for model improvements and reusability by minimizing robustness issues.
Pages: 20
Related papers
50 in total
  • [1] Enhancing Data-Driven Models with Knowledge from Engineering Models in Manufacturing
    Auris, Felix
    Fisch, Jessica
    Brandl, Michael
    Suess, Sebastian
    Soubar, Abedalhameed
    Diedrich, Christian
    [J]. 2018 IEEE 14TH INTERNATIONAL CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2018, : 653 - 656
  • [2] Data-Driven Discovery of Closure Models
    Pan, Shaowu
    Duraisamy, Karthik
    [J]. SIAM JOURNAL ON APPLIED DYNAMICAL SYSTEMS, 2018, 17 (04): : 2381 - 2413
  • [3] Paleontology Knowledge Graph for Data-Driven Discovery
    Yiying Deng
    Sicun Song
    Junxuan Fan
    Mao Luo
    Le Yao
    Shaochun Dong
    Yukun Shi
    Linna Zhang
    Yue Wang
    Haipeng Xu
    Huiqing Xu
    Yingying Zhao
    Zhaohui Pan
    Zhangshuai Hou
    Xiaoming Li
    Boheng Shen
    Xinran Chen
    Shuhan Zhang
    Xuejin Wu
    Lida Xing
    Qingqing Liang
    Enze Wang
    [J]. Journal of Earth Science, 2024, 35 (03) : 1024 - 1034
  • [4] Paleontology Knowledge Graph for Data-Driven Discovery
    Deng, Yiying
    Song, Sicun
    Fan, Junxuan
    Luo, Mao
    Yao, Le
    Dong, Shaochun
    Shi, Yukun
    Zhang, Linna
    Wang, Yue
    Xu, Haipeng
    Xu, Huiqing
    Zhao, Yingying
    Pan, Zhaohui
    Hou, Zhangshuai
    Li, Xiaoming
    Shen, Boheng
    Chen, Xinran
    Zhang, Shuhan
    Wu, Xuejin
    Xing, Lida
    Liang, Qingqing
    Wang, Enze
    [J]. JOURNAL OF EARTH SCIENCE, 2024, 35 (03) : 1024 - 1034
  • [5] Data-driven Causal Association Discovery in Manufacturing Industries
    Li, Yiming
    Xu, Jia
    Li, Li
    Iung, Benoit
    [J]. 2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 5566 - 5571
  • [6] Integrative Systems Biology for Data-Driven Knowledge Discovery
    Greene, Casey S.
    Troyanskaya, Olga G.
    [J]. SEMINARS IN NEPHROLOGY, 2010, 30 (05) : 443 - 454
  • [7] Knowledge discovery of geochemical patterns from a data-driven perspective
    Yin, Bojun
    Zuo, Renguang
    Xiong, Yihui
    Li, Yongsheng
    Yang, Weigang
    [J]. JOURNAL OF GEOCHEMICAL EXPLORATION, 2021, 231
  • [8] KNOWLEDGE DISCOVERY AND ROBUSTNESS ANALYSIS IN MANUFACTURING SIMULATIONS
    Feldkamp, Niclas
    Bergmann, Soeren
    Strassburger, Steffen
    Schulze, Thomas
    [J]. 2017 WINTER SIMULATION CONFERENCE (WSC), 2017, : 3952 - 3963
  • [9] The geoscience knowledge system, ontology and knowledge graph for data-driven discovery: Preface
    Ma, Xiaogang
    Ma, Chao
    Lv, Hairong
    Hu, Xiumian
    [J]. GEOSCIENCE FRONTIERS, 2023, 14 (05)
  • [10] Data-driven Approach for Discovery of Energy Saving Potentials in Manufacturing Factory
    Song, Bin
    Ao, Yintai
    Xiang, Li
    Lionel, K. Y. Ng
    [J]. 25TH CIRP LIFE CYCLE ENGINEERING (LCE) CONFERENCE, 2018, 69 : 330 - 335