Feature Selection Techniques for Bioinformatics Data Analysis

Cited by: 2
Authors
Theng, Dipti [1]
Bhoyar, K. K. [2]
Affiliations
[1] YCCE, Dept Comp Technol, Nagpur, Maharashtra, India
[2] YCCE, Dept Comp Sci & Engn, Nagpur, Maharashtra, India
Keywords
feature selection; machine learning; bioinformatics; biomarker selection
DOI
10.1109/GECOST55694.2022.10010541
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The emergence of microarray datasets sparked a new branch of bioinformatics research. An important problem in microarray research is the selection of true biomarkers (the most relevant feature subset) for disease identification and classification from gene expression data. To date, this problem has received the most attention in the context of cancer research. In this work, we experiment with various machine learning techniques for selecting true biomarkers for cancer classification. The experiments were carried out on two bioinformatics datasets using six feature selection algorithms. The major contribution of this work is to identify the top-performing feature selection algorithms through an exhaustive survey and then to implement an ensemble learning technique using these algorithms. A novel contribution is the use of feature weighting as an ensemble operator, whose efficiency in selecting feature sets is compared with that of the other approaches. The experimental results show that the proposed feature-weighting ensemble operator can select feature sets efficiently.
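The abstract does not say how the feature-weighting ensemble operator is defined, so the following is only a minimal sketch of one plausible reading: several filter-style scorers each rate every feature, the scores are rescaled to a common range, and a weighted sum decides the final ranking. The three scorers, the weights, the function names, and the synthetic data are all assumptions made for illustration; they are not the six algorithms or the datasets used in the paper.

```python
import numpy as np

def min_max(scores):
    """Rescale scores to [0, 1]; a constant vector maps to all zeros."""
    span = scores.max() - scores.min()
    return (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)

def variance_score(X, y):
    """Unsupervised filter: per-feature variance (y is ignored)."""
    return X.var(axis=0)

def correlation_score(X, y):
    """Absolute Pearson correlation between each feature and the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / np.where(denom > 0, denom, 1.0)

def mean_gap_score(X, y):
    """Absolute difference of class means, scaled by the feature's std."""
    gap = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
    return gap / (X.std(axis=0) + 1e-12)

def weighted_ensemble_select(X, y, scorers, weights, k):
    """Combine normalized scorer outputs by a weighted sum; return top-k indices."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    combined = sum(w * min_max(f(X, y)) for f, w in zip(scorers, weights))
    top_k = np.argsort(combined)[::-1][:k]
    return top_k, combined

# Synthetic stand-in for gene expression data (hypothetical, not from the paper):
# 60 samples, 200 features, of which features 0, 1 and 2 depend on the class label.
rng = np.random.default_rng(0)
y = np.array([0, 1] * 30)
X = rng.normal(size=(60, 200))
X[:, :3] += 3.0 * y[:, None]

selected, combined = weighted_ensemble_select(
    X, y,
    scorers=[variance_score, correlation_score, mean_gap_score],
    weights=[0.2, 0.4, 0.4],  # illustrative weights, chosen arbitrarily
    k=3,
)
print(sorted(selected.tolist()))
```

Rescaling each scorer's output before the weighted sum keeps a scorer with a large numeric range from dominating the ensemble; in practice the weights could be tuned, for example by cross-validated classifier accuracy, though the abstract does not state how the authors set them.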
Pages: 46-50
Number of pages: 5