Representation of Developer Expertise in Open Source Software

Cited by: 13
Authors
Dey, Tapajit [1]
Karnauch, Andrey [1]
Mockus, Audris [1]
Affiliations
[1] Univ Tennessee, Knoxville, TN 37996 USA
Funding
National Science Foundation (USA)
Keywords
Expertise; Developer Expertise; Vector Embedding; Doc2Vec; API; API embedding; Project embedding; Developer embedding; Skill Space; Machine Learning; Open Source; World of Code; SYSTEMS
DOI
10.1109/ICSE43902.2021.00094
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Background: Accurate representation of developer expertise has always been an important research problem. While a number of studies have proposed novel methods of representing expertise within individual projects, these methods are difficult to apply at an ecosystem level. However, with the focus of software development shifting from monolithic to modular, a method of representing developers' expertise in the context of the entire OSS development becomes necessary when, for example, a project tries to find new maintainers and looks for developers with relevant skills.
Aim: We aim to address this knowledge gap by proposing and constructing the Skill Space, in which each API, developer, and project is represented, and we postulate how the topology of this space should reflect what developers know (and what projects need).
Method: We use the World of Code infrastructure to extract the complete set of APIs in the files changed by open source developers and, based on that data, employ Doc2Vec embeddings for vector representations of APIs, developers, and projects. We then evaluate whether these embeddings reflect the postulated topology of the Skill Space by predicting which new APIs/projects developers use/join, and whether or not their pull requests get accepted. We also check how the developers' representations in the Skill Space align with their self-reported API expertise.
Result: Our results suggest that the proposed embeddings in the Skill Space satisfy the postulated topology. We hope that such representations may aid in the construction of signals that increase trust in (and efficiency of) open source ecosystems at large, and may aid investigations of other phenomena related to developer proficiency and learning.
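As a rough illustration of how a shared vector space could be queried once APIs, developers, and projects all have embeddings, the toy sketch below ranks developers by cosine similarity to a project. It is not the authors' pipeline: the three-dimensional vectors are hand-made rather than learned with Doc2Vec, averaging API vectors is only a stand-in for a learned developer or project embedding, and every name in it (`API_VECTORS`, `mean_vector`, `rank_developers`, `dev_data`, `dev_web`) is hypothetical.

```python
import math

# Hypothetical, hand-made 3-dimensional API vectors. Real embeddings would be
# learned from data (for example with Doc2Vec) and have many more dimensions.
API_VECTORS = {
    "numpy": [0.9, 0.1, 0.0],
    "pandas": [0.8, 0.2, 0.1],
    "flask": [0.1, 0.9, 0.2],
    "django": [0.0, 0.8, 0.3],
}


def mean_vector(apis):
    """Average the vectors of the given APIs (assumed stand-in for a learned embedding)."""
    vectors = [API_VECTORS[name] for name in apis]
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_developers(project_apis, developers):
    """Return developer names ordered from most to least similar to the project."""
    project_vec = mean_vector(project_apis)
    scored = [
        (cosine(mean_vector(apis), project_vec), name)
        for name, apis in developers.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical developers described by the APIs they have used.
developers = {
    "dev_data": ["numpy", "pandas"],
    "dev_web": ["flask", "django"],
}

ranking = rank_developers(["flask"], developers)
print(ranking)
```

In this toy setup a project that uses only `flask` sits closer to the web-oriented developer than to the data-oriented one, which is the kind of neighborhood structure the abstract's postulated topology refers to.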
Pages: 995-1007
Number of pages: 13