Missing information in imbalanced data stream: fuzzy adaptive imputation approach

Authors
Bohnishikha Halder
Md Manjur Ahmed
Toshiyuki Amagasa
Nor Ashidi Mat Isa
Rahat Hossain Faisal
Md. Mostafijur Rahman
Affiliations
[1] University of Barishal, Department of Computer Science and Engineering
[2] University of Tsukuba, Center for Computational Sciences
[3] Universiti Sains Malaysia, School of Electrical and Electronic Engineering, Engineering Campus
[4] Daffodil International University, Department of Software Engineering
Source
Applied Intelligence | 2022 / Vol. 52
Keywords
Data imputation; Missing information; Fuzzy adaptive approach; Pattern recognition; Imbalanced data; Data stream
DOI
Not available
Abstract
In real-world settings, missing information is common in data streams. Missing data cause a range of problems for pattern recognition tasks such as clustering and classification, and handling them in a data stream is especially challenging. When the data are also imbalanced, missing values degrade pattern recognition considerably. To address these issues, this study proposes an adaptive technique built on a fuzzy-based information decomposition method that handles incomplete data and class imbalance in a data stream at the same time. The proposed fuzzy adaptive imputation approach (FAIA) aims to characterize the effect of missing values in an imbalanced data stream and to handle the missing data problem in that setting. FAIA is a single-pass method. It adaptively selects intervals based on all observed instances, using the interrelationship of attributes to identify the correct interval for computing missing values. Here, the interrelationship of two attributes means that one attribute's value in an instance depends on another attribute's value in the same instance. For a given missing value, FAIA measures the distance to every interval and selects the interval with the least distance. To balance the data, synthetic minority-class instances are generated with the same process used for missing value imputation; this is oversampling. In the data stream, instances are divided into chunks, and each chunk is balanced without any ensemble of previous chunks, because missing values may mislead future chunks. To demonstrate the performance of FAIA, the experiments are divided into three parts: missing data imputation, imbalanced information for offline data and for data streams, and imbalanced information with missing values for offline data. Eleven numerical datasets of different dimensions from various repositories are used to measure performance on missing data imputation and on imbalanced data without a data stream. Four different datasets are also used to measure performance on imbalanced data streams. In most of the measured cases, the proposed method outperforms the methods it is compared against.
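The interval-selection step described in the abstract can be illustrated with a much simplified sketch. This is not the authors' FAIA: it assumes equal-width intervals over one fully observed related attribute, uses the absolute difference to each interval's mean as the distance, and fills a gap with the mean of the nearest interval. The function name `impute_by_nearest_interval` and the parameter `n_intervals` are hypothetical and do not come from the paper.

```python
import numpy as np

def impute_by_nearest_interval(x, y, n_intervals=4):
    """Fill NaNs in y using a related, fully observed attribute x.

    Simplified illustration only (assumptions, not the paper's method):
    equal-width intervals over x, absolute distance to each interval's
    mean x value, and imputation by the mean of y in the nearest interval.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float).copy()
    observed = ~np.isnan(y)
    x_obs, y_obs = x[observed], y[observed]

    # Build equal-width intervals from the rows where y is observed.
    edges = np.linspace(x_obs.min(), x_obs.max(), n_intervals + 1)
    interval_of = np.clip(np.digitize(x_obs, edges[1:-1]), 0, n_intervals - 1)

    # Summarize each non-empty interval by its mean x and mean y.
    centers, means = [], []
    for k in range(n_intervals):
        members = interval_of == k
        if members.any():
            centers.append(x_obs[members].mean())
            means.append(y_obs[members].mean())
    centers, means = np.array(centers), np.array(means)

    # For each missing value, measure the distance to every interval
    # and take the estimate from the interval with the least distance.
    for i in np.flatnonzero(~observed):
        nearest = np.argmin(np.abs(centers - x[i]))
        y[i] = means[nearest]
    return y

x = [1, 2, 3, 10, 11, 12]
y = [10, 12, np.nan, 100, np.nan, 104]
print(impute_by_nearest_interval(x, y, n_intervals=2).tolist())
# [10.0, 12.0, 11.0, 100.0, 102.0, 104.0]
```

Under the same assumptions, synthetic minority-class instances could be produced by treating the attribute values of a new instance as missing and filling them with this routine, which mirrors the abstract's statement that oversampling reuses the imputation process.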
Pages: 5561-5583
Page count: 22