Feature Selection and Classification of Protein Subfamilies Using Rough Sets

Cited by: 3
Authors
Rahman, Shuzlina Abdul [1]
Abu Bakar, Azuraliza [1]
Hussein, Zeti Azura Mohamed [2]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Syst & Technol, Dept Management Syst & Sci, Bangi 43600, Selangor, Malaysia
[2] Univ Kebangsaan Malaysia, Fac Sci & Technol, Sch Biosci & Biotechnol, Bangi 43600, Selangor, Malaysia
Keywords
Feature Selection; Protein Function Classification; Rough Sets; PREDICTION; SEQUENCE;
DOI
10.1109/ICEEI.2009.5254822
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning methods are known to be inefficient when faced with many features that are unnecessary for rule discovery. To cope with this issue, many methods have been proposed for selecting important features. Among them is feature selection, which selects a subset of discriminative features or attributes for model building; it helps avoid overfitting, improves model performance, and yields faster and more reliable models. This paper proposes a new method based on Rough Set algorithms, a rule-based data mining approach, to select the important features in bioinformatics datasets. Amino acid compositions are used as conditional features for the classification task. However, our results indicate that all amino acid composition features are equally important, so selecting among them is unnecessary. We do confirm the need for balanced classes when classifying protein function, demonstrating an increase of more than 15% in accuracy.
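The abstract does not say how the amino acid composition features were computed. As an illustration only, the sketch below assumes the common definition: the fraction of each of the 20 standard residues in a sequence. The function name `amino_acid_composition` and the example sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes (assumed feature set).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in a protein sequence.

    Non-standard symbols (e.g. X, B, Z) are ignored, which is an assumption
    of this sketch rather than a documented choice of the paper.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One conditional feature per residue; the 20 values sum to 1.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical toy sequence, used only to show the output shape.
features = amino_acid_composition("MKVLAAGIVGLLLA")
```

Each sequence then becomes a 20-dimensional row of conditional features, with the protein subfamily as the decision attribute.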
Pages: 32-35
Page count: 4