Overcoming Key Weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure

Cited by: 46
Authors
Ting, Kai Ming [1]
Zhu, Ye [2]
Carman, Mark [2]
Zhu, Yue [3]
Zhou, Zhi-Hua [3]
Affiliations
[1] Federation University, Churchill, Vic 3842, Australia
[2] Monash University, Clayton, Vic 3800, Australia
[3] Nanjing University, Nanjing 210023, Jiangsu, People's Republic of China
Keywords
Data dependent dissimilarity; distance measure; distance-based neighbourhood; probability-mass-based neighbourhood; k nearest neighbours; SIMILARITY; MASS
DOI
10.1145/2939672.2939779
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces the first generic version of data dependent dissimilarity and shows that it provides a better closest match than distance measures for three existing algorithms in clustering, anomaly detection and multi-label classification. For each algorithm, we show that by simply replacing the distance measure with the data dependent dissimilarity measure, it overcomes a key weakness of the otherwise unchanged algorithm.
Pages: 1205-1214
Page count: 10