A Large-Scale Empirical Study on Java Library Migrations: Prevalence, Trends, and Rationales

Cited by: 22
Authors
He, Hao [1 ,2 ]
He, Runzhi [1 ,2 ]
Gu, Haiqiao [3 ]
Zhou, Minghui [1 ,2 ]
Affiliations
[1] Peking Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
[2] Minist Educ, Key Lab High Confidence Software Technol, Beijing, Peoples R China
[3] Tsinghua Univ, Dept Phys, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
library migration; mining software repositories; evolution and maintenance; empirical software engineering;
DOI
10.1145/3468264.3468571
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
With the rise of open-source software and package hosting platforms, reusing third-party libraries has become a common practice. Due to various failures during software evolution, a project may remove a used library and replace it with another library, which we call library migration. Despite substantial research on dependency management, the understanding of how and why library migrations occur is still lacking. Achieving this understanding may help practitioners optimize their library selection criteria, develop automated approaches to monitor dependencies, and provide migration suggestions for their libraries or software projects. In this paper, through a fine-grained commit-level analysis of 19,652 Java GitHub projects, we extract the largest migration dataset to date (1,194 migration rules, 3,163 migration commits). We show that 8,065 (41.04%) projects have at least one library removal, 1,564 (7.96%, lower bound) to 5,004 (25.46%, upper bound) projects have at least one migration, and a median project with migrations has 2 to 4 migrations in total. We discover that library migrations are dominated by several domains (logging, JSON, testing, and web service) presenting a long-tail distribution. Also, migrations are highly unidirectional in that libraries are either mostly abandoned or mostly chosen in our project corpus. A thematic analysis of related commit messages, issues, and pull requests identifies 14 frequently mentioned migration reasons (e.g., lack of maintenance, usability, integration), 7 of which are not discussed in previous work. Our findings can be operationalized into actionable insights for package hosting platforms, project maintainers, and library developers. We provide a replication package at https://doi.org/10.5281/zenodo.4816752.
Pages: 478 - 490
Page count: 13
Related papers
50 in total
  • [1] A large-scale empirical study on Java library migrations: Prevalence, trends, and rationales
    He, Hao
    He, Runzhi
    Gu, Haiqiao
    Zhou, Minghui
    [J]. ESEC/FSE 2021 - Proceedings of the 29th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021 : 478 - 490
  • [2] A study of library migrations in Java
    Teyton, Cedric
    Falleri, Jean-Remy
    Palyart, Marc
    Blanc, Xavier
    [J]. JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2014, 26 (11) : 1030 - 1052
  • [3] Characteristics of method extractions in Java: a large scale empirical study
    Hora, Andre
    Robbes, Romain
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2020, 25 (03) : 1798 - 1833
  • [4] A large-scale empirical study of code smells in JavaScript projects
    Johannes, David
    Khomh, Foutse
    Antoniol, Giuliano
    [J]. SOFTWARE QUALITY JOURNAL, 2019, 27 (03) : 1271 - 1314
  • [5] Large-scale image deblurring in Java
    Wendykier, Piotr
    Nagy, James G.
    [J]. COMPUTATIONAL SCIENCE - ICCS 2008, PT 1, 2008, 5101 : 721 - 730
  • [6] Large-scale characterization of Java streams
    Rosales, Eduardo
    Basso, Matteo
    Rosa, Andrea
    Binder, Walter
    [J]. SOFTWARE-PRACTICE & EXPERIENCE, 2023, 53 (09) : 1763 - 1792
  • [7] Java for large-scale scientific computations?
    Krall, A
    Tomsich, P
    [J]. LARGE-SCALE SCIENTIFIC COMPUTING, 2001, 2179 : 228 - 235
  • [8] Large-scale parallel geophysical algorithms in Java: a feasibility study
    Jacob, M
    Philippsen, M
    Karrenbach, M
    [J]. CONCURRENCY-PRACTICE AND EXPERIENCE, 1998, 10 (11-13) : 1143 - 1153
  • [9] Java communications for large-scale parallel computing
    Getov, V
    Philippsen, M
    [J]. LARGE-SCALE SCIENTIFIC COMPUTING, 2001, 2179 : 33 - 45
  • [10] A large-scale study on the usage of Java's concurrent programming constructs
    Pinto, Gustavo
    Torres, Weslley
    Fernandes, Benito
    Castor, Fernando
    Barros, Roberto S. M.
    [J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2015, 106 : 59 - 81