Missing information in imbalanced data stream: fuzzy adaptive imputation approach

Cited by: 0
Authors
Bohnishikha Halder
Md Manjur Ahmed
Toshiyuki Amagasa
Nor Ashidi Mat Isa
Rahat Hossain Faisal
Md. Mostafijur Rahman
Affiliations
[1] University of Barishal, Department of Computer Science and Engineering
[2] University of Tsukuba, Center for Computational Sciences
[3] Universiti Sains Malaysia, School of Electrical and Electronic Engineering, Engineering Campus
[4] Daffodil International University, Department of Software Engineering
Source
Applied Intelligence | 2022 / Volume 52
Keywords
Data imputation; Missing information; Fuzzy adaptive approach; Pattern recognition; Imbalanced data; Data stream
DOI
Not available
Abstract
From a real-world perspective, missing information is a common occurrence in data streams. In general, missing data cause various problems in recognizing patterns in data (i.e., clustering and classification), and missing data in a data stream are particularly challenging. When the data are also imbalanced, missing data greatly affect pattern recognition. To address these issues, this study proposes an adaptive technique with a fuzzy-based information decomposition method that simultaneously handles incomplete data and class imbalance in a data stream. The main purpose of the proposed fuzzy adaptive imputation approach (FAIA) is to represent the effect of missing values in an imbalanced data stream and to handle the missing data problem in that setting. FAIA is a single-pass method. It adaptively selects intervals based on all observed instances, using the interrelationship of attributes to identify the correct interval for computing missing values. Here, the interrelationship of two attributes means that the value of one attribute of an instance depends on the value of another attribute of the same instance. In FAIA, the distances from a given missing value to all intervals are measured, and the interval with the least distance is selected for that missing value. Synthetic minority-class data are generated with the same process used for missing value imputation in order to balance the data; this is called oversampling. In the data stream, the instances of a dataset are divided into chunks, and the data are balanced without any ensemble of previous chunks because missing values may mislead future chunks. To demonstrate the performance of FAIA, the experiment is divided into three parts: missing data imputation, imbalanced information for offline data for data stream, and imbalanced information with missing value for offline data. Eleven numerical datasets with different dimensions from various repositories are used to evaluate the performance of missing data imputation and of imbalanced data handling without a data stream. Four different datasets are also used to measure performance on imbalanced data streams. In most of the measured cases, the proposed method outperforms the compared methods.
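The interval-selection idea described in the abstract can be illustrated with a small sketch. This is not the authors' FAIA implementation: the abstract gives no formulas, so the equal-width intervals, the use of interval means, the absence of any fuzzy weighting, and the names `impute_by_nearest_interval`, `chunks`, `target`, and `helper` are all assumptions made for illustration only.

```python
def chunks(stream, size):
    """Split a stream of rows into consecutive chunks of at most `size` rows."""
    chunk = []
    for row in stream:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def impute_by_nearest_interval(rows, target, helper, n_intervals=3):
    """Fill missing `target` values from the nearest interval of `helper`.

    Assumption: `helper` is an attribute related to `target`, and each
    interval is summarised by the mean of its observed values.
    """
    observed = [r for r in rows
                if r[target] is not None and r[helper] is not None]
    if not observed:
        return list(rows)

    lo = min(r[helper] for r in observed)
    hi = max(r[helper] for r in observed)
    width = (hi - lo) / n_intervals or 1.0  # guard against a zero-width range

    # Group observed rows into equal-width intervals over the helper attribute.
    buckets = {}
    for r in observed:
        idx = min(int((r[helper] - lo) / width), n_intervals - 1)
        buckets.setdefault(idx, []).append(r)

    # Summarise each non-empty interval as (helper centre, target mean).
    summaries = [
        (sum(r[helper] for r in b) / len(b), sum(r[target] for r in b) / len(b))
        for b in buckets.values()
    ]

    filled = []
    for r in rows:
        if r[target] is None and r[helper] is not None:
            # Select the interval with the least distance to this row.
            _, mean = min(summaries, key=lambda s: abs(s[0] - r[helper]))
            r = {**r, target: mean}
        filled.append(r)
    return filled


rows = [
    {"x": 1.0, "y": 10.0},
    {"x": 2.0, "y": 12.0},
    {"x": 9.0, "y": 50.0},
    {"x": 10.0, "y": 54.0},
    {"x": 9.5, "y": None},
]

# Each chunk is processed on its own, without reusing earlier chunks.
for chunk in chunks(rows, size=5):
    result = impute_by_nearest_interval(chunk, target="y", helper="x")
```

In this sketch each chunk is imputed independently, which mirrors the abstract's remark that previous chunks are not ensembled; an oversampling step that reuses the same routine to synthesise minority-class rows is omitted for brevity.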
Pages: 5561-5583
Page count: 22
Related papers
  • [1] Missing information in imbalanced data stream: fuzzy adaptive imputation approach
    Halder, Bohnishikha
    Ahmed, Md Manjur
    Amagasa, Toshiyuki
    Isa, Nor Ashidi Mat
    Faisal, Rahat Hossain
    Rahman, Md Mostafijur
    APPLIED INTELLIGENCE, 2022, 52 (05) : 5561 - 5583
  • [2] A Probabilistic Approach for Missing Data Imputation
    Arefin, Muhammed Nazmul
    Masum, Abdul Kadar Muhammad
    COMPLEXITY, 2024, 2024
  • [3] MissII: Missing Information Imputation for Traffic Data
    Hou, Mingliang
    Tang, Tao
    Xia, Feng
    Sultan, Ibrahim
    Kaur, Roopdeep
    Kong, Xiangjie
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2024, 12 (03) : 752 - 765
  • [4] Imputation of missing information in worldwide patent data
    de Rassenfosse, Gaetan
    Seliger, Florian
    DATA IN BRIEF, 2021, 34
  • [5] Missing data imputation for paired stream and air temperature sensor data
    Li, Han
    Deng, Xinwei
    Smith, Eric
    ENVIRONMETRICS, 2017, 28 (01)
  • [6] Adaptive Missing Data Imputation with Incremental Neuro-Fuzzy Gaussian Mixture Network (INFGMN)
    Mazzutti, Tiago
    Roisenberg, Mauro
    de Freitas Filho, Paulo Jose
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018, : 713 - 720
  • [7] An Imputation for Missing Data Features Based on Fuzzy Swarm Approach in Heart Disease Classification
    Salleh, Mohd Najib Mohd
    Samat, Nurul Ashikin
    ADVANCES IN SWARM INTELLIGENCE, ICSI 2017, PT II, 2017, 10386 : 285 - 292
  • [8] PCA-based missing information imputation for real-time crash likelihood prediction under imbalanced data
    Ke, Jintao
    Zhang, Shuaichao
    Yang, Hai
    Chen, Xiqun
    TRANSPORTMETRICA A-TRANSPORT SCIENCE, 2018, 15 (02) : 872 - 895
  • [9] Missing data imputation using fuzzy-rough methods
    Amiri, Mehran
    Jensen, Richard
    NEUROCOMPUTING, 2016, 205 : 152 - 164
  • [10] Missing data imputation with fuzzy feature selection for diabetes dataset
    Dzulkalnine, Mohamad Faiz
    Sallehuddin, Roselina
    SN Applied Sciences, 2019, 1