A Machine Learning Based Ensemble Method for Automatic Multiclass Classification of Decisions

Cited by: 6
Authors
Fu, Liming [1]
Liang, Peng [1]
Li, Xueying [1]
Yang, Chen [2]
Affiliations
[1] Wuhan University, School of Computer Science, Wuhan, People's Republic of China
[2] IBO Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China
Funding
National Key Research and Development Program of China
Keywords
Decision; Automatic Classification; Ensemble Classifier; Software Development; Hibernate
DOI
10.1145/3463274.3463325
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Stakeholders make various types of decisions with respect to requirements, design, management, and so on during the software development life cycle. Nevertheless, these decisions are typically not well documented and classified due to limited human resources, time, and budget. Automatic approaches offer a promising way to address this problem. In this paper, we aim to automatically classify decisions into five types to help stakeholders better document and understand decisions. First, we collected a dataset from the Hibernate developer mailing list. We then experimented with and evaluated 270 configurations combining feature selection, feature extraction techniques, and machine learning classifiers to find the best configuration for classifying decisions. In particular, we applied an ensemble learning method and constructed ensemble classifiers to compare their performance with that of base classifiers. Our experimental results show that (1) feature selection can yield a decent improvement in the classification results; (2) ensemble classifiers can outperform base classifiers provided that the ensemble classifiers are well constructed; (3) Bag-of-Words (BoW) features, with 50% of the features retained by feature selection, used with an ensemble classifier that combines Naive Bayes (NB), Logistic Regression (LR), and Support Vector Machine (SVM), achieve the best classification result among all the configurations (a weighted precision of 0.750, a weighted recall of 0.739, and a weighted F1-score of 0.727). Our work can benefit various types of stakeholders in software development by providing an automatic approach for effectively classifying decisions into specific types that are relevant to their interests.
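The abstract does not say how the three base classifiers are combined or how the weighted scores are computed. The following is a minimal sketch under two assumptions: the ensemble uses hard majority voting, and "weighted" means averaging per-class scores weighted by class support. The labels `A`, `B`, `C`, the toy predictions, and the function names `majority_vote` and `weighted_scores` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers by hard majority voting.

    predictions: list of equal-length label lists, one per base classifier.
    Ties go to the label proposed by the earliest classifier in the list.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote (in classifier order) that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def weighted_scores(y_true, y_pred):
    """Return support-weighted (precision, recall, F1) over classes in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total  # share of true instances belonging to this class
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Toy data: hypothetical outputs of three base classifiers (e.g. NB, LR, SVM).
y_true = ["A", "A", "B", "B", "C", "C"]
nb_pred = ["A", "B", "B", "B", "C", "A"]
lr_pred = ["A", "A", "B", "C", "C", "C"]
svm_pred = ["A", "A", "A", "B", "C", "A"]

ensemble_pred = majority_vote([nb_pred, lr_pred, svm_pred])
w_precision, w_recall, w_f1 = weighted_scores(y_true, ensemble_pred)
print(ensemble_pred)
print(round(w_precision, 3), round(w_recall, 3), round(w_f1, 3))
```

Under this definition the weighted recall equals plain accuracy, which is one reason weighted precision and F1 are usually reported alongside it for imbalanced classes.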
Pages: 40-49
Number of pages: 10