Cheating Death: A Statistical Survival Analysis of Publicly Available Python Projects

Cited by: 2
Authors
Ali, Rao Hamza [1]
Parlett-Pelleriti, Chelsea [1]
Linstead, Erik [1]
Affiliations
[1] Chapman Univ, Machine Learning & Assist Technol Lab, Orange, CA 92866 USA
Keywords
open source software projects; survival analysis; software repository health; hazard ratios
DOI
10.1145/3379597.3387511
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We apply survival analysis methods to a dataset of publicly available software projects in order to examine the attributes that might lead to their inactivity over time. We ran a Kaplan-Meier analysis and fit a Cox Proportional-Hazards model to a subset of the Software Heritage Graph Dataset, consisting of 3052 popular Python projects hosted on GitLab/GitHub, Debian, and PyPI, over a period of 165 months. We show that projects with repositories on multiple hosting services, a timeline of publishing major releases, and a good network of developers remain healthy over time and should be worthy of the effort put in by developers and contributors.
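To make the Kaplan-Meier step mentioned in the abstract concrete, here is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' code: the function name `kaplan_meier` and the toy durations are hypothetical, and "event" is assumed to mean a project becoming inactive, with projects still active at the end of the observation window treated as censored.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for every time t at which an event was observed.

    durations: time each project was followed (for example, in months).
    observed:  True if the project became inactive at that time,
               False if it was still active when observation ended (censored).
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events plus censorings leaving the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit update: S(t) = S(t-) * (1 - d / n)
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data, NOT taken from the paper.
durations = [6, 6, 12, 20, 20, 30]
observed = [True, False, True, True, False, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Censored projects lower the number at risk without lowering the survival estimate, which is why the curve only steps down at months where an inactivity event was actually observed.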
Pages: 6-10
Page count: 5