Library adoption in public software repositories

Cited by: 1
Authors
Krohn, Rachel [1 ]
Weninger, Tim [1 ]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
Keywords
Information adoption; Software libraries; GitHub; Python; StackOverflow; Classification; SVM; Modelling; Git; Repository; Commit; Software development; Cognitive science; Text mining; WORD-OF-MOUTH; ONLINE; DIFFUSION; CONSUMERS; NETWORK; MODELS
DOI
10.1186/s40537-019-0201-8
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
We study the spread and adoption of libraries within Python projects hosted in public software repositories on GitHub. By modelling the use of Git pull, merge, commit, and other actions as deliberate cognitive activities, we are able to better understand the dynamics of what happens when users adopt new and cognitively demanding information. For this task we introduce a large corpus containing all commits, diffs, messages, and source code from 259,690 Python repositories (about 13% of all Python projects on GitHub), including all Git activity data from 89,311 contributing users. In this initial work we ask two primary questions: (1) What kind of behavior change occurs near an adoption event? (2) Can we model future adoption activity of a user? Using a fine-grained analysis of user behavior, we show that library adoptions are followed by higher-than-normal activity within the first 6 hours, implying that a higher-than-normal cognitive effort is involved with an adoption. Further study is needed to understand the specific types of events that surround the adoption of new information, and the cause of these dynamics. We also show that a simple linear model is capable of classifying future commits as being an adoption or not, based on the commit contents and the preceding history of the user and repository. Additional work in this vein may be able to predict the content of future commits, or suggest new libraries to users.
Pages: 19
Related papers
50 in total
  • [21] Negotiating open source software adoption in the UK public sector
    Shaikh, Maha
    [J]. GOVERNMENT INFORMATION QUARTERLY, 2016, 33 (01) : 115 - 132
  • [22] A Survey on Mining Software Repositories
    Jung, Woosung
    Lee, Eunjoo
    Wu, Chisu
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (05) : 1384 - 1406
  • [23] Ethics in the mining of software repositories
    Gold, Nicolas E.
    Krinke, Jens
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2022, 27
  • [24] Research Friendly Software Repositories
    Herraiz, Israel
    Robles, Gregorio
    Gonzalez-Barahona, Jesus M.
    [J]. IWPSE-EVOL 09: ERCIM WORKSHOP ON SOFTWARE EVOLUTION (EVOL) AND INTERNATIONAL WORKSHOP ON PRINCIPLES OF SOFTWARE EVOLUTION (IWPSE), 2009, : 19 - 23
  • [25] Ethics in the mining of software repositories
    Gold, Nicolas E.
    Krinke, Jens
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2022, 27 (01)
  • [26] Software Repositories: A Strategic Asset
    Hassan, Ahmed E.
    [J]. IEEE SOFTWARE, 2009, 26 (01) : 67 - 68
  • [27] Tools in Mining Software Repositories
    Chaturvedi, K. K.
    Singh, V. B.
    Singh, Prashast
    [J]. PROCEEDINGS OF THE 2013 13TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ITS APPLICATIONS (ICCSA 2013), 2013, : 89 - 98
  • [28] Process mining software repositories
    Poncin, Wouter
    Serebrenik, Alexander
    van den Brand, Mark
    [J]. 2011 15TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING (CSMR), 2011, : 5 - 13
  • [29] MetricHunter: A software metric dataset generator utilizing SourceMonitor upon public GitHub repositories
    Ozcevik, Yusuf
    Altay, Osman
    [J]. SOFTWAREX, 2023, 23