Direct mining of rules from data with missing values

Authors
Gorodetsky, V [1]
Karsaev, O [1]
Samoilov, V [1]
Affiliations
[1] St Petersburg Inst Informat & Automat, St Petersburg 199178, Russia
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper presents an approach to, and a technique for, direct mining of binary data with missing values, aimed at extracting classification rules whose premises are in conjunctive form. The approach does not assume imputation of missing values. The idea is (1) to generate two sets of rules that serve as the upper and lower bounds for any rule set corresponding to an arbitrary assignment of the missing values, and then (2) to select a subset of rules for classification based on these upper and lower bounds, a testing procedure, and a classification criterion. The approach is primarily oriented toward application domains where imputation either cannot be theoretically justified or is impossible. Examples are domains where the information used for classification is composed of asynchronous data streams of various frequencies, and thus with different "lifetimes", or where information is missing due to peculiarities of the information collection system. Instead of imputing missing values, the proposed approach uses the training dataset to cut down the set of potential rules by forming its lower and upper bounds, then tests the rules of the upper bound against a new dataset with missing values and selects the most appropriate rules. The approach was applied to learning intrusion detection in a computer network based on asynchronous data streams arriving from multiple data sources. Experimental results confirm that the proposed approach to direct mining of data with missing values can yield good results.
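The abstract does not spell out how the two bounding rule sets are built, so the following is only a minimal sketch of one possible reading of step (1), not the authors' algorithm. It assumes that a conjunctive premise "certainly" covers a row when every literal matches a known value, and "possibly" covers it when no literal contradicts a known value. Under that assumption, the lower bound holds rules that stay valid for every assignment of the missing values, and the upper bound holds rules that are valid for at least one assignment. The names `rule_bounds`, `certainly_covers`, `possibly_covers`, and `candidate_premises`, and the exhaustive enumeration of short premises, are hypothetical choices made for illustration.

```python
from itertools import combinations, product

MISSING = None  # assumed marker for a missing binary value


def certainly_covers(premise, row):
    """True if every literal (attribute index, value) matches a known value."""
    return all(row[i] == v for i, v in premise)


def possibly_covers(premise, row):
    """True if no literal contradicts a known value (missing values may match)."""
    return all(row[i] is MISSING or row[i] == v for i, v in premise)


def candidate_premises(n_attrs, max_len):
    """Enumerate all conjunctions of at most max_len binary literals."""
    for k in range(1, max_len + 1):
        for attrs in combinations(range(n_attrs), k):
            for values in product((0, 1), repeat=k):
                yield tuple(zip(attrs, values))


def rule_bounds(positives, negatives, n_attrs, max_len=2):
    """Return (lower, upper) premise lists for rules 'premise -> positive class'.

    lower: covers a positive row and no negative row under EVERY assignment.
    upper: covers a positive row and no negative row under SOME assignment.
    """
    lower, upper = [], []
    for premise in candidate_premises(n_attrs, max_len):
        if (any(possibly_covers(premise, r) for r in positives)
                and not any(certainly_covers(premise, r) for r in negatives)):
            upper.append(premise)
        if (any(certainly_covers(premise, r) for r in positives)
                and not any(possibly_covers(premise, r) for r in negatives)):
            lower.append(premise)
    return lower, upper


# Toy data (invented for illustration): three binary attributes, None = missing.
positives = [(1, None, 0), (1, 1, None)]
negatives = [(0, 1, 0), (0, None, 1)]
lower, upper = rule_bounds(positives, negatives, n_attrs=3, max_len=1)
print("lower bound:", lower)
print("upper bound:", upper)
```

In this reading every lower-bound rule is also an upper-bound rule, so step (2) of the abstract would only need to test the upper-bound rules against new data; the testing procedure and classification criterion are not described in the abstract and are left out here.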
Pages: 233-264
Page count: 32