Increasing Data Set Incompleteness May Improve Rule Set Quality

Cited by: 0
Authors
Grzymala-Busse, Jerzy W. [1, 2]
Grzymala-Busse, Witold J. [3]
Affiliations
[1] Univ Kansas, Dept Elect Engn & Comp Sci, Lawrence, KS 66045 USA
[2] Polish Acad Sci, Inst Comp Sci, PL-01237 Warsaw, Poland
[3] Touchnet Informat Syst Inc, Lenexa, KS 66129 USA
Source
Keywords
Rough set theory; Rule induction; MLEM2 algorithm; Missing attribute values; Lost values; Attribute-concept values; "do not care" conditions; ROUGH; APPROXIMATIONS
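DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract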
This paper presents a new methodology to improve the quality of rule sets. We performed a series of data mining experiments on completely specified data sets. In these experiments we removed some specified attribute values, that is, replaced them with symbols denoting missing attribute values, and used the resulting incomplete data for rule induction, while the original, complete data sets were used for testing. We used the MLEM2 rule induction algorithm of the LERS data mining system, which is based on rough sets. Our approach to missing attribute values was based on rough set theory as well. The results show that for some data sets and some interpretations of missing attribute values, the error rate was smaller than for the original, complete data sets. Thus, rule sets induced from some data sets may be improved by increasing the incompleteness of those data sets. It appears that when some attribute values are removed, the rule induction system, forced to induce rules from the remaining information, may induce better rule sets.
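The data-preparation step described in the abstract (replacing some specified attribute values with a missing-value symbol before rule induction, while keeping the complete data for testing) might be sketched as follows. This is only an illustration, not the authors' implementation: the function name `inject_missing`, the toy data, and the uniform random choice of cells are all assumptions, and the symbols `?`, `*` and `-` follow the convention commonly used in the rough set literature for lost values, "do not care" conditions and attribute-concept values. MLEM2 itself is not implemented here.

```python
import random

# Assumed symbols, following common usage in the rough set literature.
LOST = "?"               # lost value
DO_NOT_CARE = "*"        # "do not care" condition
ATTRIBUTE_CONCEPT = "-"  # attribute-concept value


def inject_missing(rows, fraction, symbol=LOST, seed=0):
    """Return a copy of `rows` with about `fraction` of the attribute values
    replaced by `symbol`.

    Each row is a pair (list_of_attribute_values, decision). Decision values
    are never altered, and the input is left unchanged.
    """
    rng = random.Random(seed)
    # Enumerate every (row, attribute) cell so each is equally likely to be masked.
    cells = [(i, j) for i, (attrs, _) in enumerate(rows) for j in range(len(attrs))]
    chosen = set(rng.sample(cells, round(fraction * len(cells))))
    result = []
    for i, (attrs, decision) in enumerate(rows):
        masked = [symbol if (i, j) in chosen else value for j, value in enumerate(attrs)]
        result.append((masked, decision))
    return result


# Hypothetical toy data: three attributes and a decision per case.
complete = [
    (["high", "yes", "no"], "flu"),
    (["normal", "no", "no"], "healthy"),
    (["high", "no", "yes"], "flu"),
    (["normal", "yes", "yes"], "healthy"),
]

# Rules would be induced from `incomplete` and tested on `complete`.
incomplete = inject_missing(complete, fraction=0.25, symbol=LOST, seed=1)
```

Passing `DO_NOT_CARE` or `ATTRIBUTE_CONCEPT` as `symbol` would produce the other interpretations of missing values named in the keywords. How the rule induction step then treats each symbol is outside the scope of this sketch.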
Pages: 200 / +
Page count: 3