A new rule-based knowledge extraction approach for imbalanced datasets

Cited by: 6
Authors
Mahani, Aouatef [1]
Baba-Ali, Ahmed Riadh [1]
Affiliation
[1] Univ Sci & Technol Houari Boumediene, Dept Comp Sci, POB 32, Algiers 16111, Algeria
Keywords
Classification; Class imbalance problem; Data mining; Genetic algorithms; Imbalanced datasets sampling; NEURAL-NETWORKS; SOFTWARE TOOL; CLASSIFICATION; ALGORITHMS; CLASSIFIERS; PERFORMANCE; RECALL; KEEL;
DOI
10.1007/s10115-019-01330-9
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classification consists of extracting a classifier from large datasets. A dataset is imbalanced if it contains more instances in one class than in the others; such a dataset contains majority instances and minority ones. It is worth noting that classical learning algorithms are biased toward majority instances. Classification applied to imbalanced datasets is called partial classification. Approaches to it are generally based on sampling methods or algorithmic methods. In this paper, we propose a new hybrid approach using a three-phase, rule-based extraction process. Initially, the first classifier is extracted; it contains classification rules representing only majority instances. Then, we delete the majority instances that are well classified by these rules to produce a balanced dataset. The deleted majority instances are replaced by the extracted classification rules, which prevents any information loss. Subsequently, our algorithm is applied to the resulting balanced dataset to produce the second classifier, which contains rules that represent both majority and minority instances. Finally, we add the rules of the first classifier to the second classifier to obtain the final classifier, which is then post-processed. Our approach has been tested on several imbalanced binary datasets. The results obtained show that it is efficient compared with other results.
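The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not describe their rule learner or the post-processing step, so the threshold-rule learner `learn_pure_rules`, the `Rule` type, the toy dataset, the choice to check phase-1 rules first, and the use of the majority class as the default prediction are all assumptions made for illustration. In particular, this toy learner does not guarantee that the reduced dataset is balanced.

```python
from collections import Counter
from typing import NamedTuple


class Rule(NamedTuple):
    feature: int      # index of the attribute tested
    op: str           # "<" or ">"
    threshold: float
    label: str        # class predicted when the rule fires

    def covers(self, x):
        value = x[self.feature]
        return value < self.threshold if self.op == "<" else value > self.threshold


def learn_pure_rules(data, target):
    """Toy learner (an assumption, not the paper's algorithm): emit threshold
    rules for regions that contain only `target` instances."""
    mine = [x for x, y in data if y == target]
    others = [x for x, y in data if y != target]
    rules = []
    if not mine or not others:
        return rules
    for f in range(len(mine[0])):
        lo = min(x[f] for x in others)
        hi = max(x[f] for x in others)
        if any(x[f] < lo for x in mine):
            rules.append(Rule(f, "<", lo, target))
        if any(x[f] > hi for x in mine):
            rules.append(Rule(f, ">", hi, target))
    return rules


def three_phase(data):
    counts = Counter(y for _, y in data)
    majority = counts.most_common(1)[0][0]
    # Phase 1: rules that describe majority instances only.
    first = learn_pure_rules(data, majority)
    # Phase 2: drop majority instances already covered by those rules.
    reduced = [(x, y) for x, y in data
               if y != majority or not any(r.covers(x) for r in first)]
    # Phase 3: learn rules for every class on the reduced dataset.
    second = [r for label in sorted(counts)
              for r in learn_pure_rules(reduced, label)]
    # Final classifier: phase-1 rules are checked first (assumed ordering).
    return first + second, majority, reduced


def predict(rules, default, x):
    """Return the label of the first rule that fires, else the default class."""
    for rule in rules:
        if rule.covers(x):
            return rule.label
    return default


# Hypothetical toy dataset: 8 majority ("neg") and 3 minority ("pos") instances.
data = [((1, 1), "neg"), ((2, 1), "neg"), ((1, 2), "neg"), ((2, 2), "neg"),
        ((3, 1), "neg"), ((9, 9), "neg"), ((5, 5), "neg"), ((6, 5), "neg"),
        ((5, 4), "pos"), ((6, 6), "pos"), ((7, 5), "pos")]

rules, default, reduced = three_phase(data)
print(Counter(y for _, y in reduced))
print(all(predict(rules, default, x) == y for x, y in data))
```

Keeping the phase-1 rules in the final rule list is what stands in for the deleted majority instances: the instances are removed from the training data, but the regions they occupied are still described by rules.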
Pages: 1303-1329
Number of pages: 27
Related papers
50 in total
  • [1] A new rule-based knowledge extraction approach for imbalanced datasets
    Aouatef Mahani
    Ahmed Riadh Baba-Ali
    [J]. Knowledge and Information Systems, 2019, 61 : 1303 - 1329
  • [2] An approach to rule-based knowledge extraction
    Jin, YC
    von Seelen, W
    Sendhoff, B
    [J]. 1998 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AT THE IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE - PROCEEDINGS, VOL 1-2, 1998, : 1188 - 1193
  • [3] Agent-based evolutionary approach for interpretable rule-based knowledge extraction
    Wang, HL
    Kwong, S
    Jin, YC
    Wei, W
    Man, KF
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2005, 35 (02): : 143 - 155
  • [4] A NOVEL RULE-BASED OVERSAMPLING APPROACH FOR IMBALANCED DATA CLASSIFICATION
    Zhang, Xiao
    Paz, Ivan
    Nebot, Angela
    [J]. 37TH ANNUAL EUROPEAN SIMULATION AND MODELLING CONFERENCE 2023, ESM 2023, 2023, : 208 - 212
  • [5] An algebraic approach to rule-based information extraction
    Reiss, Frederick
    Raghavan, Sriram
    Krishnamurthy, Rajasekar
    Zhu, Huaiyu
    Vaithyanathan, Shivakumar
    [J]. 2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2008, : 933 - +
  • [6] Rule-based Text Extraction for Multimodal Knowledge Graph
    Norabid, Idza Aisara
    Fauzi, Fariza
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (05) : 295 - 304
  • [7] Fuzzy rule extraction from very-imbalanced datasets
    Soler, V
    Roig, J
    Prim, M
    [J]. DMIN '05: PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON DATA MINING, 2005, : 222 - 225
  • [8] Imbalanced datasets classification by fuzzy rule extraction and genetic algorithms
    Soler, Vicenc
    Cerquides, Jesus
    Sabria, Josep
    Roig, Jordi
    Prim, Marta
    [J]. ICDM 2006: SIXTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, WORKSHOPS, 2006, : 330 - 334
  • [9] A Rule-based Approach for Arabic Temporal Expression Extraction
    Lhioui, Chahira
    Zouaghi, Anis
    Zrigui, Mounir
    [J]. 2017 INTERNATIONAL CONFERENCE ON ENGINEERING & MIS (ICEMIS), 2017,
  • [10] A GENETIC RULE LEARNING APPROACH TO DEAL WITH IMBALANCED DATASETS
    Mahani, Aouatef
    Benkhider, Sadjia
    Baba-Ali, Ahmed Riadh
    [J]. PROCEEDINGS OF THE EUROPEAN CONFERENCE ON DATA MINING 2015 AND INTERNATIONAL CONFERENCES ON INTELLIGENT SYSTEMS AND AGENTS 2015 AND THEORY AND PRACTICE IN MODERN COMPUTING 2015, 2015, : 151 - 156