Balanced compact clustering for efficient range queries in metric spaces

Cited by: 2
Authors
Ceselli, Alberto [1]
Colombo, Fabio [2]
Cordone, Roberto [2]
Affiliations
[1] Univ Milan, Dipartimento Informat, I-26013 Crema, Italy
[2] Univ Milan, Dipartimento Informat, I-20135 Milan, Italy
Keywords
Similarity search; Clustering; Information retrieval; Integer programming; Tabu search
DOI
10.1016/j.dam.2013.12.019
Chinese Library Classification number
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
Given a set of points in a metric space, an additional query point and a positive threshold, a range query determines the subset of points whose distance from the query point does not exceed the given threshold. This paper tackles the problem of clustering the set of points so as to minimize the number of distance evaluations required by a range query. This problem models the efficient extraction of information from a database when the user is interested not in exact-match retrieval but in a search for similar items. Since this need has become widespread in the management of text, image, audio and video databases, several data structures have been proposed to support such queries. Their optimization, however, is still left to extremely simple heuristic rules, if not to random choices. We propose the Balanced Compact Clustering Problem (BCCP) as a combinatorial model of this problem. We discuss its approximation properties and the complexity of special cases. Then, we present two Integer Programming formulations, prove their equivalence and introduce valid inequalities and variable fixing procedures. We discuss the application of a general-purpose solver to the more efficient formulation. Finally, we describe a Tabu Search algorithm and discuss its application to randomly generated and real-world benchmark instances of up to one hundred thousand points. (C) 2013 Elsevier B.V. All rights reserved.
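The range query defined in the abstract, and the way a clustering can reduce the number of distance evaluations, can be illustrated with a minimal sketch. It is not the BCCP model or the Tabu Search algorithm of the paper: it assumes a generic center-and-radius clustering with triangle-inequality pruning, and the names `euclidean`, `build_clusters` and `range_query` are hypothetical, introduced only for this illustration.

```python
import math


def euclidean(p, q):
    """Example metric; any function satisfying the triangle inequality works."""
    return math.dist(p, q)


def build_clusters(points, centers, dist):
    """Assign each point to its nearest center and track each cluster's radius.

    The centers are given as input (an assumption of this sketch); choosing
    them well is the optimization problem the paper addresses.
    """
    clusters = [{"center": c, "radius": 0.0, "members": []} for c in centers]
    for p in points:
        best = min(clusters, key=lambda cl: dist(p, cl["center"]))
        best["members"].append(p)
        best["radius"] = max(best["radius"], dist(p, best["center"]))
    return clusters


def range_query(clusters, q, r, dist):
    """Return the points within distance r of q and the number of distance
    evaluations performed at query time."""
    evaluations = 0
    matches = []
    for cl in clusters:
        evaluations += 1
        if dist(q, cl["center"]) > r + cl["radius"]:
            # Triangle inequality: every member p satisfies
            # dist(q, p) >= dist(q, center) - radius > r, so skip the cluster.
            continue
        for p in cl["members"]:
            evaluations += 1
            if dist(q, p) <= r:
                matches.append(p)
    return matches, evaluations


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
clusters = build_clusters(points, [(0, 0), (10, 10)], euclidean)
matches, evaluations = range_query(clusters, (0.5, 0.0), 1.0, euclidean)
```

In this toy instance the far cluster is discarded after a single evaluation against its center, so the query costs fewer evaluations than comparing the query point with all six points. How much is saved depends on how balanced and compact the clusters are.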
Pages: 43-67
Number of pages: 25
Related papers
50 in total
  • [21] Indexing Metric Uncertain Data for Range Queries
    Chen, Lu
    Gao, Yunjun
    Li, Xinhan
    Jensen, Christian S.
    Chen, Gang
    Zheng, Baihua
    SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2015, : 951 - 965
  • [22] Indexing metric uncertain data for range queries and range joins
    Chen, Lu
    Gao, Yunjun
    Zhong, Aoxiao
    Jensen, Christian S.
    Chen, Gang
    Zheng, Baihua
    VLDB JOURNAL, 2017, 26 (04): : 585 - 610
  • [23] Indexing metric uncertain data for range queries and range joins
    Lu Chen
    Yunjun Gao
    Aoxiao Zhong
    Christian S. Jensen
    Gang Chen
    Baihua Zheng
    The VLDB Journal, 2017, 26 : 585 - 610
  • [24] Images of locally compact metric spaces
    Li Zhaowen
    Acta Mathematica Hungarica, 2003, 99 : 81 - 88
  • [25] θ-Deformations as Compact Quantum Metric Spaces
    Hanfeng Li
    Communications in Mathematical Physics, 2005, 256 : 213 - 238
  • [26] Regular cycles of compact metric spaces
    Steenrod, NE
    ANNALS OF MATHEMATICS, 1940, 41 : 833 - 851
  • [27] Quantum locally compact metric spaces
    Latremoliere, Frederic
    JOURNAL OF FUNCTIONAL ANALYSIS, 2013, 264 (01) : 362 - 402
  • [28] POINT DISTRIBUTIONS IN COMPACT METRIC SPACES
    Skriganov, M. M.
    MATHEMATIKA, 2017, 63 (03) : 1152 - 1171
  • [29] Dynamics of compact quantum metric spaces
    Kaad, Jens
    Kyed, David
    ERGODIC THEORY AND DYNAMICAL SYSTEMS, 2021, 41 (07) : 2069 - 2109
  • [30] Chains and forms in compact metric spaces
    Pfeffer, Washek F.
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2019, 475 (01) : 51 - 93