Multiplicative distance: a method to alleviate distance instability for high-dimensional data

Cited by: 0
Authors
Jafar Mansouri
Morteza Khademi
Affiliation
[1] Ferdowsi University of Mashhad, Department of Electrical Engineering
Source
Keywords
Distance instability; High-dimensional data; Minkowski and fractional norms; Multiplicative and additive distances
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
Recent work has shown that, under a broad set of conditions, commonly used distance functions become unstable in high-dimensional data spaces: as dimensionality increases, the distance from a given query point to its farthest data point approaches the distance to its nearest data point. In particular, instability has been shown to occur when the dimensions are independently distributed and normalized to zero mean and unit variance. This paper shows that the normalization condition is not necessary, although all appropriate moments must be finite. Furthermore, a new distance function, the multiplicative distance, is introduced and proved theoretically to be stable for data with independent dimensions, whether identically or non-identically distributed. In contrast to the usual distance functions, which sum the per-dimension distances (distance components), the multiplicative distance multiplies the distance components. Experimental results show that the multiplicative distance is stable for high-dimensional data with independent and with correlated dimensions, and that it outperforms the norm distances on high-dimensional data.
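The instability described in the abstract can be observed numerically. The sketch below is only an illustration under stated assumptions, not the authors' method: the abstract does not give the formula of the multiplicative distance, so the product of (1 + |x_i - y_i|) over all dimensions is used here as a hypothetical stand-in, and the function names, the uniform data, and the sample sizes are likewise chosen for the example. It compares the relative contrast (Dmax - Dmin) / Dmin of the Euclidean distance with the logarithm of the ratio Dmax / Dmin of the assumed product-form distance as the dimensionality grows.

```python
import math
import random


def relative_contrast(dists):
    """Return (Dmax - Dmin) / Dmin for a list of positive distances."""
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


def euclidean(x, y):
    """Additive distance: square root of the summed squared components."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def log_multiplicative(x, y):
    """Log of an ASSUMED product-form distance, prod(1 + |x_i - y_i|).

    The exact definition in the paper is not stated in the abstract; this
    form is a hypothetical stand-in. The log is returned so that the
    product does not overflow in high dimensions.
    """
    return sum(math.log1p(abs(a - b)) for a, b in zip(x, y))


def contrast_by_dimension(dims, n_points=200, seed=0):
    """For each dimensionality d, draw a query and n_points data points
    uniformly from [0, 1]^d with independent dimensions and return
    (d, Euclidean relative contrast, log(Dmax / Dmin) of the product form).
    """
    rng = random.Random(seed)
    rows = []
    for d in dims:
        query = [rng.random() for _ in range(d)]
        data = [[rng.random() for _ in range(d)] for _ in range(n_points)]
        euc = [euclidean(query, p) for p in data]
        logm = [log_multiplicative(query, p) for p in data]
        # log(Dmax / Dmin) = log Dmax - log Dmin for the product form
        rows.append((d, relative_contrast(euc), max(logm) - min(logm)))
    return rows


if __name__ == "__main__":
    for d, rc, log_ratio in contrast_by_dimension([2, 10, 100, 1000]):
        print(f"d={d:5d}  euclidean contrast={rc:8.3f}  "
              f"log(Dmax/Dmin) product form={log_ratio:8.3f}")
```

On independent uniform data, the Euclidean relative contrast should shrink as d grows, while the log-ratio of the assumed product form should widen. This is consistent with the qualitative claim of the abstract but does not reproduce the paper's proof or experiments.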
Pages: 783-805
Number of pages: 22
Related papers
50 in total
  • [1] Multiplicative distance: a method to alleviate distance instability for high-dimensional data
    Mansouri, Jafar
    Khademi, Morteza
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 45 (03) : 783 - 805
  • [2] Improved visualization of high-dimensional data using the distance-of-distance transformation
    Liu, Jinke
    Vinck, Martin
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (12)
  • [3] Three estimators of the Mahalanobis distance in high-dimensional data
    Holgersson, H. E. T.
    Karlsson, Peter S.
    [J]. JOURNAL OF APPLIED STATISTICS, 2012, 39 (12) : 2713 - 2720
  • [4] Mapping high-dimensional data onto a relative distance plane - an exact method for visualizing and characterizing high-dimensional patterns
    Somorjai, RL
    Dolenko, B
    Demko, A
    Mandelzweig, M
    Nikulin, AE
    Baumgartner, R
    Pizzi, NJ
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2004, 37 (05) : 366 - 379
  • [5] Feature selection based on geometric distance for high-dimensional data
    Lee, J. -H.
    Oh, S. -Y.
    [J]. ELECTRONICS LETTERS, 2016, 52 (06) : 473 - 474
  • [6] Asymptotic distribution of the maximum interpoint distance for high-dimensional data
    Tang, Ping
    Lu, Rongrong
    Xie, Junshan
    [J]. STATISTICS & PROBABILITY LETTERS, 2022, 190
  • [7] On the orthogonal distance to class subspaces for high-dimensional data classification
    Zhu, Rui
    Xue, Jing-Hao
    [J]. INFORMATION SCIENCES, 2017, 417 : 262 - 273
  • [8] On the Design and Applicability of Distance Functions in High-Dimensional Data Space
    Hsu, Chih-Ming
    Chen, Ming-Syan
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (04) : 523 - 536
  • [9] An effective method for approximating the Euclidean distance in high-dimensional space
    Jeong, Seungdo
    Kim, Sang-Wook
    Kim, Kidong
    Choi, Byung-Uk
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2006, 4080 : 863 - 872
  • [10] Dependence maps, a dimensionality reduction with dependence distance for high-dimensional data
    Lee, Kichun
    Gray, Alexander
    Kim, Heeyoung
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 26 (03) : 512 - 532