Predicting OSS Development Success: A Data Mining Approach

Cited by: 1
Authors
Raja, Uzma [1]
Tretter, Marietta J. [2]
Affiliations
[1] Univ Alabama, Management Informat Syst, Tuscaloosa, AL 35487 USA
[2] Texas A&M Univ, Dept Informat & Operat Management, College Stn, TX 77843 USA
Keywords
Data Mining; Data Models; Decision Trees; Logistic Regression; Neural Networks; Open Source Software; Software Development
DOI
10.4018/jismd.2011100102
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Open Source Software (OSS) has reached new levels of sophistication and acceptance by users and commercial software vendors. This research creates, tests, and validates a model for predicting the successful development of OSS projects. Widely available archival data on OSS projects from SourceForge.net is analyzed with multiple Data Mining techniques. Initially, three competing models are created using Logistic Regression, Decision Trees, and Neural Networks. These models are compared for precision and refined in several phases. Text Mining is used to create new variables that improve the predictive power of the models. The final model is chosen based on best fit to separate training and validation data sets and on its ability to explain the relationships among variables. Model robustness is determined by testing the model on a new data set extracted from the SourceForge repository. The results indicate that end-user involvement, project age, functionality, usage, project management techniques, project type, and team communication methods have a significant impact on the development of OSS projects.
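The model-comparison step described in the abstract (score competing models by precision on held-out validation data, then keep the best) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the records, labels, feature names, and the three rule-based "models" are all made up, and simple threshold rules stand in for fitted Logistic Regression, Decision Tree, and Neural Network classifiers.

```python
# Illustrative sketch only: toy data and toy "models", not the paper's data or method.

def precision(y_true, y_pred):
    """Precision = true positives / predicted positives (0.0 if nothing is predicted positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    predicted_pos = sum(1 for p in y_pred if p == 1)
    return tp / predicted_pos if predicted_pos else 0.0


def compare_models(models, X_val, y_val):
    """Score each candidate on validation data; return (best_name, scores_by_name)."""
    scores = {}
    for name, predict in models.items():
        y_pred = [predict(x) for x in X_val]
        scores[name] = precision(y_val, y_pred)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical validation records: (project_age_years, monthly_downloads).
X_val = [(1, 50), (4, 900), (6, 20), (2, 1500), (5, 700), (3, 10)]
y_val = [0, 1, 0, 1, 1, 0]  # 1 = "successful" (made-up labels)

# Stand-ins for fitted classifiers; each maps one record to 0 or 1.
models = {
    "age_rule": lambda x: 1 if x[0] >= 4 else 0,
    "usage_rule": lambda x: 1 if x[1] >= 500 else 0,
    "always_success": lambda x: 1,
}

best, scores = compare_models(models, X_val, y_val)
print(best, scores)
```

In practice the candidates would be classifiers fitted on a separate training set, and the chosen model would then be checked once more on a fresh data set, as the abstract describes for the robustness test.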
Pages: 27-48
Page count: 22