Understanding Software-2.0: A Study of Machine Learning Library Usage and Evolution

Cited by: 39
Authors
Dilhara, Malinda [2]
Ketkar, Ameya [1]
Dig, Danny [2]
Affiliations
[1] Oregon State Univ, Corvallis, OR 97333 USA
[2] Univ Colorado, Boulder, CO 80301 USA
Keywords
Machine learning libraries; empirical studies; Software-2.0; SUPPORT; RECOMMENDATION
DOI
10.1145/3453478
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Enabled by a rich ecosystem of Machine Learning (ML) libraries, programming using learned models, i.e., Software-2.0, has gained substantial adoption. However, we do not know what challenges developers encounter when they use ML libraries. With this knowledge gap, researchers miss opportunities to contribute to new research directions, tool builders do not invest resources where automation is most needed, library designers cannot make informed decisions when releasing ML library versions, and developers fail to use common practices when using ML libraries. We present the first large-scale quantitative and qualitative empirical study to shed light on how developers in Software-2.0 use ML libraries, and how this evolution affects their code. Particularly, using static analysis we perform a longitudinal study of 3,340 top-rated open-source projects with 46,110 contributors. To further understand the challenges of ML library evolution, we survey 109 developers who introduce and evolve ML libraries. Using this rich dataset we reveal several novel findings. Among others, we found an increasing trend of using ML libraries: The ratio of new Python projects that use ML libraries increased from 2% in 2013 to 50% in 2018. We identify several usage patterns including the following: (i) 36% of the projects use multiple ML libraries to implement various stages of the ML workflows, (ii) developers update ML libraries more often than the traditional libraries, (iii) strict upgrades are the most popular for ML libraries among other update kinds, (iv) ML library updates often result in cascading library updates, and (v) ML libraries are often downgraded (22.04% of cases). We also observed unique challenges when evolving and maintaining Software-2.0 such as (i) binary incompatibility of trained ML models and (ii) benchmarking ML models. 
Finally, we present actionable implications of our findings for researchers, tool builders, developers, educators, library vendors, and hardware vendors.
Pages: 42