Medical data mining by fuzzy modeling with selected features

Cited by: 90
Authors
Ghazavi, Sean N. [1]
Liao, Thunshun W. [1]
Affiliations
[1] Louisiana State Univ, Dept Ind Engn, Baton Rouge, LA 70803 USA
Keywords
feature selection; fuzzy models; data mining; medical data; diagnosis
DOI
10.1016/j.artmed.2008.04.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Objective: Medical data is often very high dimensional. Depending on the use, some data dimensions may be more relevant than others. In processing medical data, choosing the optimal subset of features is important, not only to reduce the processing cost but also to improve the usefulness of the model built from the selected data. This paper presents a data mining study of medical data with fuzzy modeling methods that use feature subsets selected by various indices/methods. Methods: Specifically, three fuzzy modeling methods are employed: the fuzzy k-nearest neighbor algorithm, a fuzzy clustering-based modeling method, and the adaptive network-based fuzzy inference system. For feature selection, a total of 11 indices/methods are used. The medical data mined include the Wisconsin breast cancer dataset and the Pima Indians diabetes dataset. The classification accuracy and computational time are reported. To show how good the best performer is, the globally optimal solution was also found by exhaustively testing all possible feature subsets of three features. Results: For the Wisconsin breast cancer dataset, the best accuracy of 97.17% was obtained, which is only 0.25% lower than that obtained by exhaustive testing. For the Pima Indians diabetes dataset, the best accuracy of 77.65% was obtained, which is only 0.13% lower than that obtained by exhaustive testing. Conclusion: This paper has shown that feature selection is important in mining medical data, both for reducing processing time and for increasing classification accuracy. However, not all combinations of feature selection and modeling methods are equally effective, and the best combination is often data-dependent, as supported by the breast cancer and diabetes data analyzed in this paper. © 2008 Elsevier B.V. All rights reserved.
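The exhaustive baseline described in the abstract (scoring every three-feature subset with a fuzzy classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the fuzzy k-nearest neighbor variant shown here uses crisp training labels with inverse-distance weights, and the single hold-out split, the values of `k` and `m`, the function names, and the synthetic data are all assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def fuzzy_knn_predict(X_train, y_train, X_test, k=5, m=2.0):
    """Fuzzy k-NN with crisp training labels (illustrative variant).

    Each of the k nearest neighbours votes for its class with weight
    1 / d^(2 / (m - 1)); the normalised votes are class memberships.
    """
    classes = np.unique(y_train)
    # Pairwise Euclidean distances, shape (n_test, n_train).
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]            # indices of k nearest
    nn_d = np.take_along_axis(d, nn, axis=1)     # their distances
    w = 1.0 / np.maximum(nn_d, 1e-12) ** (2.0 / (m - 1.0))
    nn_y = y_train[nn]                           # neighbour labels
    # Membership of every test point in every class, rows sum to 1.
    u = np.stack([(w * (nn_y == c)).sum(axis=1) for c in classes], axis=1)
    u /= u.sum(axis=1, keepdims=True)
    return classes[np.argmax(u, axis=1)], u


def holdout_accuracy(X, y, features, k=5, train_frac=0.7, seed=0):
    """Accuracy on one fixed hold-out split (an assumed protocol)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_train = int(train_frac * len(y))
    tr, te = idx[:n_train], idx[n_train:]
    cols = list(features)
    pred, _ = fuzzy_knn_predict(X[tr][:, cols], y[tr], X[te][:, cols], k=k)
    return float(np.mean(pred == y[te]))


def best_subset_exhaustive(X, y, subset_size=3, **kwargs):
    """Score every feature subset of the given size and keep the best."""
    scores = {
        subset: holdout_accuracy(X, y, subset, **kwargs)
        for subset in combinations(range(X.shape[1]), subset_size)
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic stand-in data (NOT the Wisconsin or Pima datasets):
# 6 features, of which columns 0, 2 and 4 carry the class signal.
rng = np.random.default_rng(42)
y_demo = rng.integers(0, 2, size=200)
X_demo = rng.normal(size=(200, 6))
X_demo[:, [0, 2, 4]] += 3.0 * y_demo[:, None]

best, acc = best_subset_exhaustive(X_demo, y_demo, subset_size=3, k=5)
print("best subset:", best, "hold-out accuracy:", round(acc, 3))
```

With d features the search evaluates C(d, 3) subsets, which stays small for datasets with under ten features but grows quickly with d, which is why the abstract treats exhaustive testing as a reference point rather than a practical selection method.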
Pages: 195-206
Page count: 12
Related papers
50 in total
  • [1] Data mining and fuzzy modeling
    Pedrycz, W
    [J]. 1996 BIENNIAL CONFERENCE OF THE NORTH AMERICAN FUZZY INFORMATION PROCESSING SOCIETY - NAFIPS, 1996, : 263 - 267
  • [2] Fuzzy Modeling Method Based on Data Mining
    Wang, Yongfu
    Zhao, Hong
    Liu, Jiren
    Chai, Tianyou
    [J]. 2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 5223 - +
  • [3] Fuzzy modeling method based on data mining
    Chen, XQ
    Wang, YF
    Huang, XL
    [J]. Proceedings of 2005 International Conference on Machine Learning and Cybernetics, Vols 1-9, 2005, : 1924 - 1930
  • [4] Computer Modeling and Data Mining of Medical Images
    Hilbelink, Don R.
    [J]. FASEB JOURNAL, 2011, 25
  • [5] MEDICAL DATA MINING USING BGA AND RGA FOR WEIGHTING OF FEATURES IN FUZZY K-NN CLASSIFICATION
    Tang, Ping-Hung
    Tseng, Ming-Hseng
    [J]. PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 3070 - +
  • [6] Fuzzy Modeling Built Through a Data Mining Process
    Wilges, B.
    Mateus, G. P.
    Nassar, S. M.
    Bastos, R. C.
    [J]. IEEE LATIN AMERICA TRANSACTIONS, 2012, 10 (02) : 1622 - 1626
  • [7] Application of fuzzy cluster analysis for medical image data mining
    Wang, Shuyan
    Zhou, Mingquan
    Geng, Guohua
    [J]. 2005 IEEE International Conference on Mechatronics and Automations, Vols 1-4, Conference Proceedings, 2005, : 631 - 636
  • [8] Mining Temporal Medical Data Using Adaptive Fuzzy Cognitive Maps
    Froelich, Wojciech
    Wakulicz-Deja, Alicja
    [J]. HSI: 2009 2ND CONFERENCE ON HUMAN SYSTEM INTERACTIONS, 2009, : 13 - 20
  • [9] Modeling with Fuzzy Transforms - a New Tool of Data Mining and Quantitative Finance
    Perfilieva, Irina
    [J]. 2017 6TH INTERNATIONAL CONFERENCE ON RELIABILITY, INFOCOM TECHNOLOGIES AND OPTIMIZATION (TRENDS AND FUTURE DIRECTIONS) (ICRITO), 2017, : 16 - 21
  • [10] Selected data mining concepts
    Abello, James
    Cormode, Graham
    Fradkin, Dmitriy
    Madigan, David
    Melnik, Ofer
    Muchnik, Ilya
    [J]. DISCRETE METHODS IN EPIDEMIOLOGY, 2006, 70 : 1 - 40