Lowest probability mass neighbour algorithms: relaxing the metric constraint in distance-based neighbourhood algorithms

Cited by: 0
Authors
Kai Ming Ting
Ye Zhu
Mark Carman
Yue Zhu
Takashi Washio
Zhi-Hua Zhou
Affiliations
[1] Federation University,School of Engineering and Information Technology
[2] Deakin University,School of Information Technology
[3] Monash University,Faculty of Information Technology
[4] Nanjing University,National Key Laboratory for Novel Software Technology
[5] Osaka University,The Institute of Scientific and Industrial Research
Source
Machine Learning | 2019 / Vol. 108
Keywords
Nearest neighbour; Distance metric; Lowest probability mass neighbour; Mass-based dissimilarity; Classification; Clustering;
DOI
Not available
Abstract
The use of distance metrics such as the Euclidean or Manhattan distance for nearest neighbour algorithms allows for interpretation as a geometric model, and it has been widely assumed that the metric axioms are a necessary condition for many data mining tasks. We show that this assumption can in fact be an impediment to producing effective models. We propose to use mass-based dissimilarity, which employs estimates of the probability mass to measure dissimilarity, to replace the distance metric. This substitution effectively converts nearest neighbour (NN) algorithms into lowest probability mass neighbour (LMN) algorithms. Both types of algorithms employ exactly the same algorithmic procedures, except for the substitution of the dissimilarity measure. We show that LMN algorithms overcome key shortcomings of NN algorithms in classification and clustering tasks. Unlike existing generalised data independent metrics (e.g., quasi-metric, meta-metric, semi-metric, peri-metric) and data dependent metrics, the proposed mass-based dissimilarity is unique because its self-dissimilarity is data dependent and non-constant.
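The substitution described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the probability mass is estimated, so the estimator below is an assumption made purely for illustration: the data are partitioned by random axis-parallel trees, and the dissimilarity of x and y is the average fraction of data points in the smallest tree region that covers both. All function names (`build_tree`, `build_forest`, `covering_mass`, `mass_dissimilarity`, `lowest_mass_neighbour`) and parameters (`n_trees`, `max_depth`, `seed`) are hypothetical and are not taken from the paper.

```python
import random


def build_tree(points, n_total, rng, depth=0, max_depth=4):
    """Randomly partition `points`; each node stores its share of the data."""
    node = {"mass": len(points) / n_total, "split": None}
    if depth >= max_depth or len(points) <= 1:
        return node
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # nothing to split on along this dimension
        return node
    cut = rng.uniform(lo, hi)
    node["split"] = (dim, cut)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    node["left"] = build_tree(left, n_total, rng, depth + 1, max_depth)
    node["right"] = build_tree(right, n_total, rng, depth + 1, max_depth)
    return node


def build_forest(data, n_trees=100, max_depth=4, seed=0):
    """Build several random trees over the same data (assumed estimator)."""
    rng = random.Random(seed)
    return [build_tree(data, len(data), rng, 0, max_depth) for _ in range(n_trees)]


def covering_mass(node, x, y):
    """Mass of the deepest node whose region contains both x and y."""
    while node["split"] is not None:
        dim, cut = node["split"]
        x_left, y_left = x[dim] < cut, y[dim] < cut
        if x_left != y_left:  # x and y are separated below this node
            break
        node = node["left"] if x_left else node["right"]
    return node["mass"]


def mass_dissimilarity(trees, x, y):
    """Average covering mass over all trees; lower means more similar."""
    return sum(covering_mass(t, x, y) for t in trees) / len(trees)


def lowest_mass_neighbour(trees, data, query):
    """Same procedure as 1-NN, with the distance replaced by the mass estimate."""
    candidates = [p for p in data if p != query]
    return min(candidates, key=lambda p: mass_dissimilarity(trees, query, p))


# Toy data: a dense cluster of 64 points plus one isolated point.
data = [(i * 0.001,) for i in range(64)] + [(10.0,)]
trees = build_forest(data)
dense, sparse = data[32], data[-1]
print("self-dissimilarity, dense point: ", mass_dissimilarity(trees, dense, dense))
print("self-dissimilarity, sparse point:", mass_dissimilarity(trees, sparse, sparse))
print("lowest mass neighbour of dense point:", lowest_mass_neighbour(trees, data, dense))
```

Under these assumptions the neighbour search is unchanged from nearest neighbour search except for the dissimilarity function, and the self-dissimilarity of a point depends on how much data surrounds it, so it is not a constant as it would be for a metric.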
Pages: 331-376
Number of pages: 45