Representation of Developer Expertise in Open Source Software

Cited by: 13
Authors
Dey, Tapajit [1]
Karnauch, Andrey [1]
Mockus, Audris [1]
Affiliations
[1] Univ Tennessee, Knoxville, TN 37996 USA
Funding
US National Science Foundation
Keywords
Expertise; Developer Expertise; Vector Embedding; Doc2Vec; API; API embedding; Project embedding; Developer embedding; Skill Space; Machine Learning; Open Source; World of Code; SYSTEMS;
DOI
10.1109/ICSE43902.2021.00094
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Background: Accurate representation of developer expertise has long been an important research problem. While a number of studies have proposed novel methods of representing expertise within individual projects, these methods are difficult to apply at an ecosystem level. However, as the focus of software development shifts from monolithic to modular, a method of representing developers' expertise in the context of OSS development as a whole becomes necessary when, for example, a project tries to find new maintainers and looks for developers with relevant skills.
Aim: We aim to address this knowledge gap by proposing and constructing the Skill Space, in which each API, developer, and project is represented, and we postulate how the topology of this space should reflect what developers know (and what projects need).
Method: We use the World of Code infrastructure to extract the complete set of APIs in the files changed by open source developers and, based on that data, employ Doc2Vec embeddings to obtain vector representations of APIs, developers, and projects. We then evaluate whether these embeddings reflect the postulated topology of the Skill Space by predicting which new APIs developers use, which new projects they join, and whether or not their pull requests get accepted. We also check how the developers' representations in the Skill Space align with their self-reported API expertise.
Result: Our results suggest that the proposed embeddings in the Skill Space appear to satisfy the postulated topology. We hope that such representations may aid in the construction of signals that increase trust in (and the efficiency of) open source ecosystems at large, and may aid investigations of other phenomena related to developer proficiency and learning.
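The idea in the Method step, placing APIs, developers, and projects in one vector space and comparing them by proximity, can be illustrated with a small sketch. This is not the authors' pipeline: the paper learns its vectors with Doc2Vec from World of Code data, whereas the sketch below assumes a handful of hand-written 3-dimensional API vectors, represents a developer or project as the mean of its API vectors (a simplifying assumption), and ranks projects by cosine similarity. All names (`API_VECTORS`, `mean_vector`, `rank_projects`, the project names) are hypothetical.

```python
import math

# Hypothetical toy API vectors. A real Skill Space would be learned from
# data (the paper uses Doc2Vec); these numbers are made up for illustration.
API_VECTORS = {
    "numpy": (0.9, 0.1, 0.0),
    "pandas": (0.8, 0.2, 0.1),
    "flask": (0.1, 0.9, 0.1),
    "django": (0.0, 0.8, 0.2),
}


def mean_vector(apis):
    """Average the vectors of the given APIs (assumed stand-in for a learned entity vector)."""
    vectors = [API_VECTORS[name] for name in apis]
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_projects(developer_apis, projects):
    """Return (project, similarity) pairs, most similar to the developer first."""
    developer = mean_vector(developer_apis)
    scored = [
        (name, cosine(developer, mean_vector(apis)))
        for name, apis in projects.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical projects described by the APIs they use.
projects = {
    "data_tool": ["numpy", "pandas"],
    "web_app": ["flask", "django"],
}

# A developer whose history only touches numpy.
ranking = rank_projects(["numpy"], projects)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Under these toy vectors, the numpy-only developer lands closer to the data-oriented project than to the web project, which is the kind of proximity relation the postulated topology is meant to capture.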
Pages: 995-1007
Page count: 13