Repositories with public data about software development

Cited by: 1
Authors
Gonzalez-Barahona J.M. [1]
Izquierdo-Cortazar D. [1]
Squire M. [2]
Affiliations
[1] Universidad Rey Juan Carlos, Spain
[2] Elon University, United States
Funding
National Science Foundation (United States)
Keywords
Code forges; Meta-repositories; Project repositories; Repository of repositories; Software engineering research
DOI
10.4018/jossp.2010040101
Abstract
Empirical research on software development based on data obtained from project repositories and code forges is gaining attention in the software engineering research community. Studies in this area typically start by retrieving or monitoring some subset of the data found in a repository or forge, and then analyze that data to find interesting patterns. However, retrieving information from these locations can be challenging. Meta-repositories that provide public information about software development can simplify and streamline the research process. Public data repositories that collect and clean the data from other project repositories or code forges can help ensure that research studies are based on good-quality data. This paper offers some insight into how these meta-repositories of data about open source projects (each sometimes called a "repository of repositories", RoR) should be used to help researchers. It describes in detail two of the most widely used collections of data about software development: FLOSSmole and FLOSSMetrics. © 2010, IGI Global.
Pages: 1 - 13
Number of pages: 12