A Novel Leukemia Gene Features Extraction and Selection Technique for Robust Type Prediction Using Machine Learning

Cited by: 0
Authors
Ilyas, Mahwish [1]
Aamir, Khalid Mahmood [1]
Jaleel, Abdul [2]
Deriche, Mohamed [3]
Affiliations
[1] Univ Sargodha, Dept Comp Sci & Informat Technol, Sargodha 40162, Punjab, Pakistan
[2] Univ Engn & Technol, Dept Comp Sci, GRW, RCET, Lahore, Pakistan
[3] Ajman Univ, Coll Engn & Informat Technol, Artificial Intelligence Res Ctr AIRC, Ajman, U Arab Emirates
Keywords
Leukemia prediction; Gene features extraction; Linear discriminant analysis; Dimensionality reduction; LINEAR DISCRIMINANT-ANALYSIS; EXPRESSION DATA; CLASSIFICATION; ALGORITHM; HYBRID
DOI
10.1007/s13369-024-09254-5
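Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract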
The broad term 'leukemia' refers to several types of cancer that affect blood cells. Detecting and identifying the specific type of leukemia remains a major challenge in medicine. Machine learning techniques can play an important role in analyzing gene expression data from microarray experiments in leukemia research. This work uses the Leukemia Gene Expression data from the Curated Microarray Database (CuMiDa). Determining expression patterns from microarray data can be challenging. We combine Fisher's linear discriminant analysis, a popular dimensionality reduction technique, with a new feature selection approach to predict the leukemia type from microarray data. Our model predicts five types of leukemia, including AML, PBSC CD34, Bone Marrow, and CD34 from the bone marrow. The data features are first rescaled. A feature selection technique then extracts the 25 most significant of the dataset's 22,283 features, and the dimension is further reduced to only 5 features to lower computational complexity. These features are fed into a Fisher's linear discriminant module and a likelihood-based index for classification. We examine the results using 2, 4, 5, 6, and 7 selected features. The best classification accuracies are 89.6%, 96.92%, and 96.15% for 2, 5, and 7 selected features, respectively. Our results outperform the state of the art by about 4%, with a task completion time of less than 100 ms.
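The first two stages described in the abstract (rescaling, then ranking features and keeping the top few) can be illustrated with a minimal sketch. The abstract does not specify the authors' new selection technique, so this example substitutes a plain per-feature Fisher score (between-class scatter divided by within-class scatter) as a stand-in. The function names, the toy values, and the two class labels are all assumptions made for illustration, not taken from the paper or from CuMiDa.

```python
from statistics import mean


def min_max_rescale(rows):
    """Rescale each feature (column) to [0, 1]; constant columns become 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def fisher_scores(rows, labels):
    """Per-feature ratio of between-class to within-class scatter."""
    classes = sorted(set(labels))
    scores = []
    for col in zip(*rows):
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, y in zip(col, labels) if y == c]
            mu = mean(vals)
            between += len(vals) * (mu - overall) ** 2
            within += sum((v - mu) ** 2 for v in vals)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfect separation or a constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_k(rows, labels, k):
    """Indices of the k highest-scoring features, best first."""
    scores = fisher_scores(rows, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Hypothetical toy data: 6 samples x 4 features, 2 classes.
X = [
    [1.0, 5.0, 0.2, 9.0],
    [1.2, 4.0, 0.9, 9.5],
    [0.9, 6.0, 0.5, 8.5],
    [3.0, 5.5, 0.4, 9.2],
    [3.1, 4.5, 0.8, 8.8],
    [2.9, 5.0, 0.3, 9.1],
]
y = ["AML", "AML", "AML", "Bone_Marrow", "Bone_Marrow", "Bone_Marrow"]

X_scaled = min_max_rescale(X)
top = select_top_k(X_scaled, y, k=2)
X_reduced = [[row[j] for j in top] for row in X_scaled]
print(top)
```

In a pipeline like the one the abstract outlines, the reduced matrix would then be passed to a linear discriminant classifier. On real data with 22,283 features, a vectorized numerical library would likely be preferable to pure Python loops.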
Pages: 16845-16863
Page count: 19