Tool Support for Improving Software Quality in Machine Learning Programs

Cited by: 0
Authors
Cheng, Kwok Sun [1]
Huang, Pei-Chi [1]
Ahn, Tae-Hyuk [2]
Song, Myoungkyu [1]
Affiliations
[1] Univ Nebraska Omaha, Dept Comp Sci, Omaha, NE 68182 USA
[2] St Louis Univ, Dept Comp Sci, St Louis, MO 63103 USA
Keywords
software quality; anomaly detection; quality validation; machine learning applications; artificial intelligence (AI); cancer
DOI
10.3390/info14010053
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine learning (ML) techniques discover knowledge from large amounts of data, and ML modeling is becoming essential to software systems in practice. ML research communities have focused on the accuracy and efficiency of ML models, while less attention has been paid to validating their quality. Validating ML applications is a challenging and time-consuming process for developers because prediction accuracy relies heavily on the generated models. ML applications are written in a comparatively data-driven programming style on top of black-box ML frameworks, so each dataset and the ML application itself must be investigated individually. As a result, ML validation tasks take considerable time and effort. To address this limitation, we present MLVal, a novel quality validation technique that increases the reliability of ML models and applications. Our approach helps developers inspect the training data and the generated features for the ML model. Data validation is important and beneficial to software quality because the quality of the input data affects the speed and accuracy of training and inference. Inspired by software debugging and validation practices for reproducing reported bugs, MLVal takes as input an ML application and its training datasets to build the ML models, helping ML application developers easily reproduce and understand anomalies in the ML application. We have implemented MLVal as an Eclipse plugin that allows developers to validate the prediction behavior of their ML applications, the ML model, and the training data within the Eclipse IDE. In our evaluation, we used 23,500 documents from the bioengineering research domain. We assessed how effectively MLVal helps ML application developers (1) investigate the connection between the produced features and the labels in the training model, and (2) detect errors early to secure model quality through better data. Our approach reduces the engineering effort required to validate problems, improving the data-centric workflows of ML application development.
Pages: 20