Outlier detection for set-valued data based on rough set theory and granular computing

Cited by: 6
Authors
Lin, Hai [1]
Li, Zhaowen [2]
Affiliations
[1] Guangxi Univ, Coll Math & Informat Sci, Nanning, Guangxi, Peoples R China
[2] Yulin Normal Univ, Key Lab Complex Syst Optimizat & Big Data Proc, Dept Guangxi Educ, Yulin, Guangxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
RST; GrC; SVIS; outlier detection; outlier factor; INFORMATION GRANULATION; ATTRIBUTE REDUCTION; FUZZY; ALGORITHMS;
DOI
10.1080/03081079.2022.2132491
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Outlier detection is widely used in industrial practice, for example in public security and fraud detection. Outlier detection methods have been proposed from various perspectives and against different backgrounds. However, most of them consider categorical or numerical data. There is little research on outlier detection for set-valued data, even though a set-valued information system (SVIS) is a suitable way of handling missing values in data sets. This paper investigates outlier detection for set-valued data based on rough set theory (RST) and granular computing (GrC). First, the similarity between two information values in an SVIS is introduced, together with a variable parameter that controls the similarity. Then, tolerance relations on the object set are defined, and based on these relations, the theta-lower and theta-upper approximations in an SVIS are put forward. Next, the outlier factor in an SVIS is presented and applied to various data sets. Finally, an outlier detection method for set-valued data based on RST and GrC is proposed, and the corresponding algorithms are designed. In numerical experiments on UCI data sets, the designed algorithm is compared with six other detection algorithms. The experimental results show that the designed algorithm is arguably the best choice in the context of an SVIS. For a comprehensive comparison, we use two criteria, the AUC value and the F1 measure, to show the superiority of the designed algorithm.
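The abstract names the steps of the method without giving its formulas, so the following is only a rough sketch of the general idea and not the authors' algorithm. It assumes Jaccard similarity between two set values, a threshold `theta` that defines the tolerance relation, and a simple stand-in score (one minus the average relative size of an object's per-attribute tolerance classes) in place of the paper's outlier factor. The toy table and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical toy SVIS: one dict per object, attribute name -> set of values.
Table = List[Dict[str, FrozenSet[str]]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Assumed similarity between two set values (not taken from the paper)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tolerance_class(table: Table, i: int, attrs: List[str], theta: float) -> Set[int]:
    """Objects that are at least theta-similar to object i on every attribute."""
    return {
        j
        for j, row in enumerate(table)
        if all(jaccard(table[i][a], row[a]) >= theta for a in attrs)
    }


def approximations(
    table: Table, target: Set[int], attrs: List[str], theta: float
) -> Tuple[Set[int], Set[int]]:
    """Lower and upper approximations of `target` under the tolerance relation."""
    classes = [tolerance_class(table, i, attrs, theta) for i in range(len(table))]
    lower = {i for i, c in enumerate(classes) if c <= target}  # class inside target
    upper = {i for i, c in enumerate(classes) if c & target}   # class meets target
    return lower, upper


def outlier_scores(table: Table, attrs: List[str], theta: float) -> List[float]:
    """Stand-in score: objects with small tolerance classes score close to 1."""
    n = len(table)
    scores = []
    for i in range(n):
        sizes = [len(tolerance_class(table, i, [a], theta)) for a in attrs]
        scores.append(1.0 - sum(sizes) / (len(attrs) * n))
    return scores


table: Table = [
    {"lang": frozenset({"en", "fr"}), "skill": frozenset({"py"})},
    {"lang": frozenset({"en", "fr"}), "skill": frozenset({"py", "c"})},
    {"lang": frozenset({"en"}), "skill": frozenset({"py"})},
    {"lang": frozenset({"zh"}), "skill": frozenset({"go"})},
]
attrs = ["lang", "skill"]
scores = outlier_scores(table, attrs, theta=0.5)
lower, upper = approximations(table, {0, 1}, attrs, theta=0.5)
```

In this toy table the last object shares no values with the others, so its tolerance classes contain only itself and it receives the highest score. Raising `theta` makes the relation stricter and shrinks every tolerance class, which is the role the abstract assigns to the variable similarity parameter.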
Pages: 385 - 413
Number of pages: 29
Related papers
50 in total
  • [31] On set-valued Itô's integrals and set-valued martingales
    Kisielewicz, Michal
    Michta, Mariusz
    STOCHASTIC ANALYSIS AND APPLICATIONS, 2025,
  • [32] SET-VALUED MEASURE AND FUZZY SET-VALUED MEASURE
    ZHANG, WX
    LI, T
    MA, JF
    LI, AJ
    FUZZY SETS AND SYSTEMS, 1990, 36 (01) : 181 - 188
  • [33] ORTHOGONAL THEORY OF A SET-VALUED BIFUNCTOR
    JAMBOR, P
    CZECHOSLOVAK MATHEMATICAL JOURNAL, 1973, 23 (03) : 447 - 454
  • [34] Matrix-Based Rough Set Approach for Dynamic Probabilistic Set-Valued Information Systems
    Huang, Yanyong
    Li, Tianrui
    Luo, Chuan
    Horng, Shi-jinn
    ROUGH SETS, (IJCRS 2016), 2016, 9920 : 197 - 206
  • [35] Feature selection for set-valued data based on D–S evidence theory
    Yini Wang
    Sichun Wang
    Artificial Intelligence Review, 2023, 56 : 2667 - 2696
  • [36] Multivariate Microaggregation of Set-Valued Data
    Imran-Daud, Malik
    Shaheen, Muhammad
    Ahmed, Abbas
    INFORMATION TECHNOLOGY AND CONTROL, 2022, 51 (01) : 104 - 125
  • [37] ON THE THEORY OF BANACH-SPACE VALUED MULTIFUNCTIONS .2. SET-VALUED MARTINGALES AND SET-VALUED MEASURES
    PAPAGEORGIOU, NS
    JOURNAL OF MULTIVARIATE ANALYSIS, 1985, 17 (02) : 207 - 227
  • [38] Rough Fuzzy Set Model for Set-Valued Ordered Fuzzy Decision System
    Bao, Zhongkui
    Yang, Shanlin
    Zhao, Ju
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, RSKT 2014, 2014, 8818 : 673 - 682
  • [39] ROUGH SEMI-CONTINUOUS SET-VALUED MAPS
    Akcay, Fatma Gecit
    Aytar, Salih
    TRANSACTIONS OF A RAZMADZE MATHEMATICAL INSTITUTE, 2021, 175 (03) : 313 - 317
  • [40] On a Hybridization of Deep Learning and Rough Set Based Granular Computing
    Ropiak, Krzysztof
    Artiemjew, Piotr
    ALGORITHMS, 2020, 13 (03)