Discovering knowledge from noisy databases using genetic programming

Cited by: 0
Authors
Wong, ML [1]
Leung, KS
Cheng, JCY
Affiliations
[1] Lingnan Univ, Dept Informat Syst, Tuen Mun, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
[3] Chinese Univ Hong Kong, Dept Orthopaed & Traumatol, Hong Kong, Hong Kong, Peoples R China
Keywords
DOI
10.1002/(SICI)1097-4571(2000)51:9<870::AID-ASI90>3.0.CO;2-R
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In data mining, we emphasize the need for learning from huge, incomplete, and imperfect data sets. To handle noise in the problem domain, existing learning systems avoid overfitting the imperfect training examples by excluding insignificant patterns. The problem is that these systems use a limiting attribute-value language for representing the training examples and the induced knowledge. Moreover, some important patterns are ignored because they are statistically insignificant. In this article, we present a framework that combines Genetic Programming and Inductive Logic Programming to induce knowledge represented in various knowledge representation formalisms from noisy databases. The framework is based on a formalism of logic grammars, and it can specify the search space declaratively. An implementation of the framework, LOGENPRO (The LOgic grammar based GENetic PROgramming system), has been developed. The performance of LOGENPRO is evaluated on the chess end-game domain. We compare LOGENPRO with FOIL and other learning systems in detail, and find its performance is significantly better than that of the others. This result indicates that the Darwinian principle of natural selection is a plausible noise-handling method that can avoid overfitting and identify important patterns at the same time. Moreover, the system is applied to one real-life medical database. The knowledge discovered provides insights into the medical domain and allows a better understanding of it.
Pages: 870-881
Number of pages: 12