Possibilistic Similarity Measures for Data Science and Machine Learning Applications

Cited by: 4
Authors
Charfi, Amal [1]
Bouhamed, Sonda Ammar [1, 2]
Bosse, Eloi [2, 3]
Kallel, Imene Khanfir [1, 2]
Bouchaala, Wassim [4]
Solaiman, Basel [2]
Derbel, Nabil [1]
Affiliations
[1] Univ Sfax, Natl Sch Engineers Sfax, Control & Energy Management CEM Lab, Sfax 3038, Tunisia
[2] IMT Atlantique, Image & Informat Proc Dept iTi, F-838182923 Brest, France
[3] Expertises Parafuse Inc, Quebec City, PQ G1W 4N1, Canada
[4] Tunisian Profess Training Agcy, Sfax 3000, Tunisia
Keywords
Uncertainty; Possibility theory; Measurement uncertainty; Machine learning; Atmospheric measurements; Particle measurements; Indexes; Classification; distance; entropy; learning; measures of specificity; possibility distributions; similarity; uncertainty; INFORMATION; UNCERTAINTY; NOTION;
DOI
10.1109/ACCESS.2020.2979553
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Measuring similarity is of great interest in many research areas, such as data science, machine learning, pattern recognition, text analysis and information retrieval, to name a few. The literature has shown that possibility is an attractive notion in the context of distinguishability assessment and can lead to very efficient and computationally inexpensive learning schemes. This paper focuses on determining the similarity between two possibility distributions. A review of existing similarity measures within the possibilistic framework is presented first. These measures are then analyzed with respect to their capacity to satisfy a set of properties that a similarity measure should have. Most of the existing possibilistic similarity measures produce undesirable outcomes since they generally depend on the application context. A new similarity measure, called InfoSpecificity, is introduced, and the similarity measures are categorized into three main methods: morphic-based, amorphic-based and hybrid. Two experiments are conducted using four benchmark databases. The aim of the experiments is to compare the efficiency of the possibilistic similarity measures when applied to real data. The experiments show good results for the hybrid methods, particularly with the InfoSpecificity measure. In general, the hybrid methods outperform the other two categories when evaluated on small-size samples, i.e., a poor-data context (or poorly informed environment), where possibility theory can be used to the greatest benefit.
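To make the idea of comparing two possibility distributions concrete, here is a minimal Python sketch. It is not the InfoSpecificity measure from the paper, whose definition is not given in this record. It assumes each distribution is a list of possibility degrees in [0, 1] over the same finite set of alternatives, and it combines two generic ingredients: a normalized Manhattan distance and an inconsistency degree (one minus the height of the pointwise minimum). The function names and the `weight` parameter are hypothetical choices made for illustration.

```python
from typing import Sequence


def normalized_manhattan(pi1: Sequence[float], pi2: Sequence[float]) -> float:
    """Mean absolute difference between two possibility distributions."""
    if len(pi1) != len(pi2) or len(pi1) == 0:
        raise ValueError("distributions must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(pi1, pi2)) / len(pi1)


def inconsistency(pi1: Sequence[float], pi2: Sequence[float]) -> float:
    """One minus the height of the pointwise minimum of the two distributions."""
    if len(pi1) != len(pi2) or len(pi1) == 0:
        raise ValueError("distributions must be non-empty and of equal length")
    return 1.0 - max(min(a, b) for a, b in zip(pi1, pi2))


def illustrative_similarity(
    pi1: Sequence[float], pi2: Sequence[float], weight: float = 0.5
) -> float:
    """Similarity in [0, 1]; `weight` (an assumed parameter) balances
    the distance term against the inconsistency term."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    penalty = weight * normalized_manhattan(pi1, pi2) + (1.0 - weight) * inconsistency(pi1, pi2)
    return 1.0 - penalty


if __name__ == "__main__":
    same = illustrative_similarity([1.0, 0.5, 0.0], [1.0, 0.5, 0.0])
    opposite = illustrative_similarity([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
    print(same, opposite)
```

Identical distributions score 1.0 under this sketch, while distributions that fully support different alternatives score much lower, since both the distance and the inconsistency terms penalize them.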
Pages: 49198-49211
Number of pages: 14