A Comparative Analysis of Data Mining Techniques on Breast Cancer Diagnosis Data using WEKA Toolbox

Cited by: 0
Authors
Alshammari, Majdah [1 ]
Mezher, Mohammad [1 ]
Affiliations
[1] Fahad Bin Sultan Univ, Dept Comp Sci, Tabuk, Saudi Arabia
Keywords
Data mining; breast cancer; data mining techniques; classification; WEKA toolbox
DOI
10.14569/IJACSA.2020.0110829
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Breast cancer is considered the second most common cancer in women. It is fatal in less than half of all cases and is the main cause of mortality in women. It accounts for 16% of all cancer mortalities worldwide. Early diagnosis of breast cancer increases the chance of recovery, and data mining techniques can support such early diagnosis. In this paper, an academic experimental breast cancer dataset is used to perform a practical data mining experiment with the Waikato Environment for Knowledge Analysis (WEKA) tool. The WEKA Java application provides a rich set of performance metrics for evaluating experiments. Pre-processing and feature extraction are used to optimize the data. The classification process used in this study is summarized through thirteen experiments. Additionally, 10 experiments using different classification algorithms were conducted. The algorithms used were: Naive Bayes, Logistic Regression, Lazy IBK (Instance-Based learning with parameter K), Lazy Kstar, Lazy Locally Weighted Learner, Rules ZeroR, Decision Stump, Decision Trees J48, Random Forest and Random Trees. The process of producing a predictive model was automated using classification accuracy. Further, several experiments on classifying the Wisconsin Diagnostic Breast Cancer and Wisconsin Breast Cancer datasets were conducted to compare the success rates of the different methods. The results show that the Lazy IBK (k-NN) classifier can achieve 98% accuracy in comparison with the other classifiers. The main advantages of the study are its compact use of 13 different data mining models and 10 different performance measurements, and its plots of classification errors.
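To make the instance-based (k-NN) idea behind the Lazy IBK classifier concrete, the following is a minimal sketch in plain Python. It is not the authors' WEKA setup and does not use the Wisconsin datasets: the feature values in `TRAIN` and `TEST` are invented toy points, the function names `knn_predict` and `accuracy` are hypothetical, and Euclidean distance with a simple majority vote is assumed.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

# Toy, made-up two-feature data (NOT the Wisconsin breast cancer data).
TRAIN = [
    ((1.0, 1.0), "benign"),
    ((1.5, 2.0), "benign"),
    ((2.0, 1.5), "benign"),
    ((6.0, 6.5), "malignant"),
    ((7.0, 6.0), "malignant"),
    ((6.5, 7.5), "malignant"),
]
TEST = [((1.2, 1.4), "benign"), ((6.8, 6.9), "malignant")]

if __name__ == "__main__":
    print("accuracy:", accuracy(TRAIN, TEST, k=3))
```

Classification accuracy here is simply the share of correctly labelled test points, which is the quantity the abstract reports for each classifier; a real evaluation would use held-out data or cross-validation rather than a two-point test set.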
Pages: 224-229
Page count: 6