Direct mining of rules from data with missing values

Authors
Gorodetsky, V [1]
Karsaev, O [1]
Samoilov, V [1]
Affiliation
[1] St Petersburg Inst Informat & Automat, St Petersburg 199178, Russia
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405
Abstract
The paper presents an approach to, and a technique for, direct mining of binary data with missing values, aimed at extracting classification rules whose premises are in conjunctive form. The approach does not impute missing values. The idea is (1) to generate two sets of rules that serve as the upper and lower bounds for every rule set corresponding to an arbitrary assignment of the missing values, and then (2) to select a subset of rules for classification, based on these upper and lower bounds, on a testing procedure, and on a classification criterion. The approach is primarily oriented to application domains where imputation either cannot be theoretically justified or is impossible altogether. Examples are domains where the information used for classification is composed of asynchronous data streams of various frequencies, which therefore have different "lifetimes", or where information is missing due to peculiarities of the information collection system. Instead of imputing missing values, the proposed approach uses the training dataset to narrow the set of potential rules by forming its lower and upper bounds, then tests the rules of the upper bound against a new dataset with missing values and selects the most appropriate rules. The approach was applied to learning intrusion detection in a computer network from asynchronous data streams arriving from multiple data sources. Experimental results confirm that the proposed approach to direct mining of data with missing values can yield good results.
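The abstract does not give the construction of the two bounds, so the following is only a minimal sketch of one possible reading, not the authors' algorithm. It assumes that a rule belongs to the lower bound when it is consistent with the training data under every assignment of the missing values, and to the upper bound when it is consistent under at least one assignment. The names `MISSING`, `certainly_matches`, `possibly_matches`, `in_lower_bound`, and `in_upper_bound`, as well as the toy data, are hypothetical and introduced purely for illustration.

```python
MISSING = None  # assumed marker for a missing binary value

def certainly_matches(premise, row):
    """Premise holds for the row under every completion of its missing values."""
    return all(row[i] == v for i, v in premise.items())

def possibly_matches(premise, row):
    """Premise holds for the row under at least one completion."""
    return all(row[i] is MISSING or row[i] == v for i, v in premise.items())

def in_upper_bound(premise, positives, negatives):
    """Rule 'premise -> positive class' is valid under SOME assignment:
    no negative example is forced to match, and some positive can match."""
    if any(certainly_matches(premise, r) for r in negatives):
        return False
    return any(possibly_matches(premise, r) for r in positives)

def in_lower_bound(premise, positives, negatives):
    """Rule is valid under EVERY assignment:
    no negative example can match, and some positive always matches."""
    if any(possibly_matches(premise, r) for r in negatives):
        return False
    return any(certainly_matches(premise, r) for r in positives)

# Toy binary training data (illustrative only); a premise maps
# attribute index -> required value, read as a conjunction.
positives = [(1, 1, MISSING), (1, MISSING, 0)]
negatives = [(0, 1, 1), (MISSING, 0, 1)]

for premise in ({0: 1}, {2: 0}, {1: 0}):
    print(premise,
          "lower:", in_lower_bound(premise, positives, negatives),
          "upper:", in_upper_bound(premise, positives, negatives))
```

Under these assumptions every lower-bound rule is also an upper-bound rule, so the two sets bracket the rule set that any particular filling-in of the missing values would produce. The subsequent testing and selection step described in the abstract is not modelled here.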
Pages: 233-264
Number of pages: 32
Related papers
50 in total
  • [21] A comparison of traditional and rough set approaches to missing attribute values in data mining
    Grzymala-Busse, J. W.
    [J]. DATA MINING X: DATA MINING, PROTECTION, DETECTION AND OTHER SECURITY TECHNOLOGIES, 2009, 42 : 155 - 163
  • [22] MINING DATA WITH MISSING ATTRIBUTE VALUES: A COMPARISON OF PROBABILISTIC AND ROUGH SET APPROACHES
    Grzymala-Busse, J. W.
    [J]. INTELLIGENT DECISION MAKING SYSTEMS, VOL. 2, 2010, : 153 - +
  • [23] Combined association rules for dealing with missing values
    Shen, Jau-Ji
    Chang, Chin-Chen
    Li, Yu-Chiang
    [J]. JOURNAL OF INFORMATION SCIENCE, 2007, 33 (04) : 468 - 480
  • [24] Replacing missing values using trustworthy data values from web data sources
    Jaya, M. Izham
    Sidi, Fatimah
    Yusof, Sharmila Mat
    Affendey, Lilly Suriani
    Ishak, Iskandar
    Jabar, Marzanah A.
    [J]. 6TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL MATHEMATICS (ICCSCM 2017), 2017, 892
  • [25] Data mining of missing persons data
    Blackmore, K
    Bossomaier, T
    Foy, S
    Thomson, D
    [J]. CLASSIFICATION AND CLUSTERING FOR KNOWLEDGE DISCOVERY, 2005, 4 : 305 - 314
  • [26] Data mining and the impact of missing data
    Brown, ML
    Kros, JF
    [J]. INDUSTRIAL MANAGEMENT & DATA SYSTEMS, 2003, 103 (8-9) : 611 - 621
  • [27] Computing Bayes Factors From Data With Missing Values
    Hoijtink, Herbert
    Gu, Xin
    Mulder, Doris
    Rosseel, Yves
    [J]. PSYCHOLOGICAL METHODS, 2019, 24 (02) : 253 - 268
  • [28] Missing Data in Collaborative Data Mining
    Anton, Carmen Ana
    Matei, Oliviu
    Avram, Anca
    [J]. COMPUTATIONAL STATISTICS AND MATHEMATICAL MODELING METHODS IN INTELLIGENT SYSTEMS, VOL. 2, 2019, 1047 : 100 - 109
  • [29] Mining customer value: From association rules to direct marketing
    Wang, K
    Zhou, SQ
    Yeung, JMS
    Yang, Q
    [J]. 19TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2003, : 738 - 740
  • [30] Mining Customer Value: From Association Rules to Direct Marketing
    Ke Wang Wong
    Senqiang Zhou
    Qiang Yang
    Jack Man Shun Yeung
    [J]. Data Mining and Knowledge Discovery, 2005, 11 : 57 - 79