Predicting OSS Development Success: A Data Mining Approach

Cited by: 1
Authors
Raja, Uzma [1]
Tretter, Marietta J. [2]
Affiliations
[1] Univ Alabama, Management Informat Syst, Tuscaloosa, AL 35487 USA
[2] Texas A&M Univ, Dept Informat & Operat Management, College Stn, TX 77843 USA
Keywords
Data Mining; Data Models; Decision Trees; Logistic Regression; Neural Networks; Open Source Software; Software Development
DOI
10.4018/jismd.2011100102
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Open Source Software (OSS) has reached new levels of sophistication and acceptance by users and commercial software vendors. This research creates, tests, and validates a model for predicting successful development of OSS projects. Widely available archival data on OSS projects from SourceForge.net (SF) was used. The data is analyzed with multiple Data Mining techniques. Initially, three competing models are created using Logistic Regression, Decision Trees, and Neural Networks. These models are compared for precision and are refined in several phases. Text Mining is used to create new variables that improve the predictive power of the models. The final model is chosen based on best fit to separate training and validation data sets and the ability to explain the relationship among variables. Model robustness is determined by testing it on a new dataset extracted from the SF repository. The results indicate that end-user involvement, project age, functionality, usage, project management techniques, project type, and team communication methods have a significant impact on the development of OSS projects.
Pages: 27-48
Page count: 22
Related papers
50 items in total
  • [21] An Approach for Predicting River Water Quality Using Data Mining Technique
    Gulyani, Bharat B.
    Mangai, J. Alamelu
    Fathima, Arshia
    ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS, ICDM 2015, 2015, 9165 : 233 - 243
  • [22] Research on predicting prosodic parameters for Chinese synthesis by data mining approach
    WANG Wei, CAI Lianhong (Department of Computer Science and Technology
    Chinese Journal of Acoustics, 2003, (02) : 184 - 192
  • [23] Ensemble Vote Approach for Predicting Primary Tumors Using Data Mining
    Naib, Mehak
    Chhabra, Amit
    2014 5TH INTERNATIONAL CONFERENCE CONFLUENCE THE NEXT GENERATION INFORMATION TECHNOLOGY SUMMIT (CONFLUENCE), 2014, : 97 - 102
  • [24] Data Mining Approach For Predicting Student and Institution's Placement Percentage
    Ashok, M. V.
    Apoorva, A.
    2016 INTERNATIONAL CONFERENCE ON COMPUTATION SYSTEM AND INFORMATION TECHNOLOGY FOR SUSTAINABLE SOLUTIONS (CSITSS), 2016, : 336 - 340
  • [25] Predicting the total suspended solids in wastewater: A data-mining approach
    Verma, Anoop
    Wei, Xiupeng
    Kusiak, Andrew
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (04) : 1366 - 1372
  • [26] Predicting Academic Performance of Students Using a Hybrid Data Mining Approach
    Francis, Bindhia K.
    Babu, Suvanam Sasidhar
    JOURNAL OF MEDICAL SYSTEMS, 2019, 43 (06)
  • [27] Predicting customer demand for remanufactured products: A data-mining approach
    Truong Van Nguyen
    Zhou, Li
    Chong, Alain Yee Loong
    Li, Boying
    Pu, Xiaodie
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2020, 281 (03) : 543 - 558
  • [28] Predicting Academic Performance of Students Using a Hybrid Data Mining Approach
    Bindhia K. Francis
    Suvanam Sasidhar Babu
    Journal of Medical Systems, 2019, 43
  • [29] Predicting survival time for kidney dialysis patients: a data mining approach
    Kusiak, A
    Dixon, B
    Shah, S
    COMPUTERS IN BIOLOGY AND MEDICINE, 2005, 35 (04) : 311 - 327
  • [30] PREDICTING RETINOPATHY RISK AMONG DIABETIC PATIENTS: A DATA MINING APPROACH
    Foshati, Saghar
    Sabeti, Malihe
    Zamani, Ali
    BIOMEDICAL ENGINEERING-APPLICATIONS BASIS COMMUNICATIONS, 2019, 31 (02)