Increasing Data Set Incompleteness May Improve Rule Set Quality

Cited by: 0
Authors
Grzymala-Busse, Jerzy W. [1 ,2 ]
Grzymala-Busse, Witold J. [3 ]
Affiliations
[1] Univ Kansas, Dept Elect Engn & Comp Sci, Lawrence, KS 66045 USA
[2] Polish Acad Sci, Inst Comp Sci, PL-01237 Warsaw, Poland
[3] Touchnet Informat Syst Inc, Lenexa, KS 66129 USA
Source
Keywords
Rough set theory; Rule induction; MLEM2 algorithm; Missing attribute values; Lost values; Attribute-concept values; "do not care" conditions; ROUGH; APPROXIMATIONS;
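DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract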
This paper presents a new methodology to improve the quality of rule sets. We performed a series of data mining experiments on completely specified data sets. In these experiments we removed some specified attribute values, in other words, replaced them with symbols of missing attribute values. The resulting incomplete data were used for rule induction, while the original, complete data sets were used for testing. For rule induction we used the MLEM2 algorithm of the LERS data mining system, which is based on rough sets. Our approach to missing attribute values was based on rough set theory as well. The results show that for some data sets and some interpretations of missing attribute values, the error rate was smaller than for the original, complete data sets. Thus, rule sets induced from some data sets may be improved by increasing the incompleteness of those data sets. It appears that, after some attribute values are removed, the rule induction system is forced to induce rules from the remaining information and may induce better rule sets.
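The data preparation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not implement MLEM2 or LERS. The function name `inject_missing`, the toy data, the 25% removal rate, and the use of "?" as the missing-value symbol are all assumptions made for illustration; the last column is assumed to hold the decision and is never altered.

```python
import random


def inject_missing(rows, fraction, symbol="?", seed=0):
    """Return a copy of `rows` with a fraction of attribute cells replaced.

    Hypothetical helper: the last column of each row is treated as the
    decision and left untouched; `symbol` stands in for a missing value.
    """
    rng = random.Random(seed)
    # Every (row, column) position that holds an attribute value.
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row) - 1)]
    count = round(fraction * len(cells))
    out = [list(row) for row in rows]  # copy, so the complete data survive
    for i, j in rng.sample(cells, count):
        out[i][j] = symbol
    return out


# Toy, made-up complete data set: three attributes plus a decision.
complete = [
    ["high", "yes", "no", "flu"],
    ["normal", "no", "no", "healthy"],
    ["high", "no", "yes", "flu"],
    ["normal", "yes", "yes", "healthy"],
]

# Incomplete copy for rule induction; `complete` is kept for testing.
training = inject_missing(complete, fraction=0.25)
```

In this sketch, `training` would be passed to a rule induction system and the induced rules would then be evaluated on `complete`, mirroring the train-on-incomplete, test-on-complete setup of the abstract.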
Pages: 200 / +
Page count: 3
Related papers
50 in total
  • [21] Universal perturbative explicitly correlated basis set incompleteness correction
    Torheyden, Martin
    Valeev, Edward F.
    JOURNAL OF CHEMICAL PHYSICS, 2009, 131 (17):
  • [22] On increasing sequences of topologies on a set
    Bose, Manoj Kumar
    Tiwari, Rupesh
    RIVISTA DI MATEMATICA DELLA UNIVERSITA DI PARMA, 2007, 7 : 173 - 183
  • [23] Core data set anaesthesia 3.0/2010-Updated data set for external quality control in anaesthesia
    Heinrichs, W.
    Blumrich, W.
    Deil, S.
    Freitag, M.
    Kutz, N.
    Luedtke, I.
    Roehrig, R.
    Streuf, R.
    ANASTHESIOLOGIE & INTENSIVMEDIZIN, 2010, 51 : S33 - S55
  • [24] Compactness rate as a rule selection index based on Rough Set Theory to improve data analysis for personal investment portfolios
    Shyng, Jhieh-Yu
    Shieh, How-Ming
    Tzeng, Gwo-Hshiung
    APPLIED SOFT COMPUTING, 2011, 11 (04) : 3671 - 3679
  • [25] Building Endgame Data Set to Improve Opponent Modeling Approach
    Zhang Jiajia
    Liu Hong
    2017 IEEE SECOND INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC), 2017, : 255 - 260
  • [26] Interest Set Mechanism to Improve the Transport of Named Data Networking
    Jiang, Xiaoke
    Bi, Jun
    ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2013, 43 (04) : 515 - 516
  • [27] Modifications of Classification Strategies in Rule Set Based Bagging for Imbalanced Data
    Napierala, Krystyna
    Stefanowski, Jerzy
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, PT II, 2012, 7209 : 514 - 525
  • [28] Quality Aware Network for Set to Set Recognition
    Liu, Yu
    Yan, Junjie
    Ouyang, Wanli
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4694 - 4703
  • [29] Application and Design of MEDS Based on Rough Set Data Mining Rule
    Luo, Hai-chang
    Zhong, Yu-bin
    FUZZY INFORMATION AND ENGINEERING 2010, VOL 1, 2010, 78 : 683 - +
  • [30] Communications: Intramolecular basis set superposition error as a measure of basis set incompleteness: Can one reach the basis set limit without extrapolation?
    Balabin, Roman M.
    JOURNAL OF CHEMICAL PHYSICS, 2010, 132 (21):