An Empirical Study of Type-Related Defects in Python Projects

Cited by: 6
Authors
Khan, Faizan [1]
Chen, Boqi [1]
Varro, Daniel [1]
McIntosh, Shane [2]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ H3A 0G4, Canada
[2] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
Keywords
Python; Annotations; Tools; Task analysis; Ecosystems; Software systems; Software measurement; Software defects; static type checkers; dynamic type systems; empirical study
DOI
10.1109/TSE.2021.3082068
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
In recent years, Python has experienced explosive growth in adoption, particularly among open source projects. While Python's dynamically typed nature provides developers with powerful programming abstractions, that same dynamic type system allows type-related defects to accumulate in code bases. To aid in the early detection of type-related defects, type annotations were introduced into the Python ecosystem (i.e., PEP 484) and static type checkers like mypy have appeared on the market. While applying a type checker like mypy can in theory help to catch type-related defects before they impact users, little is known about the real impact of adopting a type checker to reveal defects in Python projects. In this paper, we study the extent to which Python projects benefit from such type checking features. For this purpose, we mine the issue tracking and version control repositories of 210 Python projects on GitHub. Inspired by the work of Gao et al. on type-related defects in JavaScript, we add type annotations to test whether mypy detects an error that would have helped developers to avoid real defects. We observe that 15 percent of the defects could have been prevented by mypy. Moreover, we find that there is no significant difference between the experience level of developers committing type-related defects and the experience of developers committing defects that are not type-related. In addition, a manual analysis of the anti-patterns that most commonly lead to type-checking faults reveals that the redefinition of Python references, dynamic attribute initialization, and incorrectly handled Null objects are the most common causes of type-related faults. Since our study is conducted on fixed public defects that have gone through code reviews and multiple test cycles, these results represent a lower bound on the benefits of adopting a type checker. Therefore, we recommend incorporating a static type checker like mypy into the development workflow, as it will not only prevent type-related defects but also mitigate certain anti-patterns during development.
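The three anti-patterns named in the abstract can be illustrated with a minimal Python sketch. All function and class names below are hypothetical and are not taken from the studied projects. The comments describe what a PEP 484 checker such as mypy would typically be expected to report on annotated code, without quoting exact messages.

```python
import re
from typing import Optional


# 1. Redefinition of a reference: `value` is declared str, then rebound to int.
def parse_port_redefined(value: str) -> int:
    value = int(value)  # a checker is expected to flag the str -> int rebinding
    return value


def parse_port(value: str) -> int:
    port = int(value)  # a separate name keeps one type per reference
    return port


# 2. Dynamic attribute initialization: an attribute is created outside the class.
class Job:
    def __init__(self, name: str) -> None:
        self.name = name


def attach_result_dynamic(job: Job) -> None:
    job.result = 0  # runs, but Job never declares `result`; a checker should object


class TypedJob:
    def __init__(self, name: str) -> None:
        self.name = name
        self.result: Optional[int] = None  # declared up front (hypothetical fix)


# 3. Incorrectly handled None: re.match returns None when nothing matches.
def major_version_unchecked(tag: str) -> str:
    return re.match(r"v(\d+)", tag).group(1)  # raises AttributeError on no match


def major_version(tag: str) -> Optional[str]:
    match = re.match(r"v(\d+)", tag)
    return match.group(1) if match is not None else None
```

The unchecked variants run without complaint on well-formed input, which is why such defects can survive testing; the annotated alternatives make the intended types explicit so a static checker has something to verify.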
Pages: 3145-3158
Page count: 14