A Study of Potential Code Borrowing and License Violations in Java Projects on GitHub

Cited by: 15
Authors
Golubev, Yaroslav [1, 2]
Eliseeva, Maria [3]
Povarov, Nikita [1]
Bryksin, Timofey [1, 4]
Affiliations
[1] JetBrains Res, Belgrade, Serbia
[2] ITMO Univ, St Petersburg, Russia
[3] Higher Sch Econ, St Petersburg, Russia
[4] St Petersburg State Univ, St Petersburg, Russia
Keywords
CLONE DETECTION; SOFTWARE
DOI
10.1145/3379597.3387455
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
With an ever-increasing amount of open-source software, the popularity of services like GitHub that facilitate code reuse, and common misconceptions about the licensing of open-source software, the problem of license violations in the code is getting more and more prominent. In this study, we compile an extensive corpus of popular Java projects from GitHub, search it for code clones, and perform an original analysis of possible code borrowing and license violations on the level of code fragments. We chose Java as a language because of its popularity in industry, where the plagiarism problem is especially relevant because of possible legal action. We analyze and discuss distribution of 94 different discovered and manually evaluated licenses in files and projects, differences in the licensing of files, distribution of potential code borrowing between licenses, various types of possible license violations, most violated licenses, etc. Studying possible license violations in specific blocks of code, we have discovered that 29.6% of them might be involved in potential code borrowing and 9.4% of them could potentially violate original licenses.
Pages: 54-64
Page count: 11