Fuzzy Discretization for Data Mining

Authors
Berka, Petr [1]
Affiliation
[1] Univ Econ, Dept Informat & Knowledge Engn, W Churchill Sq 4, Prague 13067 3, Czech Republic
Keywords
data mining; discretization; decision rules; fuzzy intervals
DOI
Not available
Chinese Library Classification
F [Economics]
Discipline classification code
02
Abstract
Data preprocessing, or data preparation, is the most time-consuming and laborious step in the whole data mining process. The reason for data preprocessing is twofold: it is necessary to select (or create) from the available data the characteristics relevant for a given data mining task, and to represent these characteristics in a form suitable for the selected data mining algorithm. Among the typical operations performed in this step, discretization of numeric attributes plays an important role, because algorithms for creating association or decision rules cannot handle numeric attributes directly. Different approaches to discretization can be used. Equidistant and equifrequent discretization are typical examples of so-called "class-blind" methods, since they consider only the attribute being discretized. Another group consists of "class-sensitive" methods, which take into account the fact that the examples (objects) belong to different classes. Discretization procedures typically generate sharp boundaries (thresholds) between intervals. The paper describes a class-sensitive discretization method in which the boundary between intervals is defined using a fuzzy membership function. The paper presents an experimental evaluation of the proposed method on some benchmark data and compares it with more standard (i.e., crisp) discretization. A naive Bayesian classifier and a tree learning algorithm are used in the experiments.
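The difference between sharp and fuzzy interval boundaries can be illustrated with a minimal Python sketch. It is not the method proposed in the paper, which the abstract does not specify: the function names, the linear ramp shape, and the `spread` parameter are all assumptions chosen for illustration. The two class-blind schemes named in the abstract supply the thresholds, and the thresholds are then applied either crisply or with a linear membership ramp around each one.

```python
from bisect import bisect_right


def equidistant_cuts(values, k):
    """Class-blind: k intervals of equal width; returns k - 1 thresholds."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equifrequent_cuts(values, k):
    """Class-blind: k intervals holding roughly equal numbers of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def crisp_interval(x, cuts):
    """Index of the interval containing x when boundaries are sharp."""
    return bisect_right(cuts, x)


def fuzzy_memberships(x, cuts, spread):
    """Degree of membership of x in each interval.

    Assumption for illustration: each threshold c is replaced by a linear
    ramp over [c - spread, c + spread]; `spread` must be positive and
    `cuts` must be sorted in ascending order.
    """
    # Degree (in [0, 1]) to which x lies above each threshold.
    above = [min(1.0, max(0.0, (x - c + spread) / (2 * spread))) for c in cuts]
    bounds = [1.0] + above + [0.0]
    # Membership in interval i is "above threshold i - 1 but not above threshold i".
    return [bounds[i] - bounds[i + 1] for i in range(len(cuts) + 1)]


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cuts = equidistant_cuts(values, 3)          # [4.0, 7.0]
print(crisp_interval(4.0, cuts))            # 1
print(fuzzy_memberships(4.0, cuts, 1.0))    # [0.5, 0.5, 0.0]
```

With a crisp boundary the value 4.0 falls entirely into the second interval, whereas under the assumed ramp it belongs equally to the first and second intervals. The memberships always sum to 1, so they can be used as fractional counts by a learner such as a naive Bayesian classifier.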
Pages: 54-59
Number of pages: 6