Debugging Machine Learning Pipelines

Cited by: 1
Authors
Lourenco, Raoni [1]
Freire, Juliana [1]
Shasha, Dennis [1]
Affiliations
[1] NYU, New York, NY 10003 USA
Funding
National Science Foundation (USA)
Keywords
EXPLANATIONS
DOI
10.1145/3329486.3329489
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Machine learning tasks entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous or uninformative outputs, the pipeline may fail or produce incorrect results. Inferring the root cause of failures and unexpected behavior is challenging: it usually requires considerable human effort and is both time-consuming and error-prone. We propose a new approach that uses iteration and provenance to automatically infer root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our source code and experimental data will be available for reproducibility and enhancement.
Pages: 10