ULISSE: ULtra compact Index for Variable-Length Similarity SEarch in Data Series

Cited by: 9
Authors
Linardi, Michele [1]
Palpanas, Themis [1]
Affiliations
[1] Paris Descartes Univ, LIPADE, Paris, France
DOI
10.1109/ICDE.2018.00149
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data series similarity search is an important operation at the core of several analysis tasks and applications related to data series collections. Although data series indexes enable fast similarity search, all existing indexes can only answer queries of a single length (fixed at index construction time), which is a severe limitation. In this work, we propose ULISSE, the first data series index structure designed for answering similarity search queries of variable length. Our contribution is two-fold. First, we introduce a novel representation technique, which effectively and succinctly summarizes multiple sequences of different lengths. Second, based on the proposed index, we describe efficient algorithms for approximate and exact similarity search, combining disk-based index visits and in-memory sequential scans. We experimentally evaluate our approach using several synthetic and real datasets. The results show that ULISSE is several times (and up to orders of magnitude) more efficient in terms of both space and time cost than competing approaches.
Pages: 1356-1359
Page count: 4
Related papers
  • [1] Scalable, Variable-Length Similarity Search in Data Series: The ULISSE Approach
    Linardi, Michele
    Palpanas, Themis
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2018, 11 (13): 2236-2248
  • [2] CIVET: Exploring Compact Index for Variable-Length Subsequence Matching on Time Series
    Xiong, Haoran
    Zhang, Hang
    Wang, Zeyu
    He, Zhenying
    Wang, Peng
    Wang, X. Sean
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (09): 2123-2135
  • [3] Compact Dictionaries for Variable-Length Keys and Data with Applications
    Blandford, Daniel K.
    Blelloch, Guy E.
    ACM TRANSACTIONS ON ALGORITHMS, 2008, 4 (02)
  • [4] Manifold Learning for Multivariate Variable-Length Sequences With an Application to Similarity Search
    Ho, Shen-Shyang
    Dai, Peng
    Rudzicz, Frank
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (06): 1333-1344
  • [5] BINARY SEARCH WITH VARIABLE-LENGTH KEYS WITHIN AN INDEX PAGE
    ERKIO, H
    TERKKI, R
    INFORMATION SYSTEMS, 1983, 8 (02): 137-140
  • [6] Ranking and significance of variable-length similarity-based time series motifs
    Serra, Joan
    Serra, Isabel
    Corral, Alvaro
    Lluis Arcos, Josep
    EXPERT SYSTEMS WITH APPLICATIONS, 2016, 55: 452-460
  • [7] Variable-Length Subsequence Clustering in Time Series
    Duan, Jiangyong
    Guo, Lili
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (02): 983-995
  • [8] VARIABLE-LENGTH DATA ASSEMBLER.
    Dishon, Y.
    Gindi, A.M.
    Martin, R.W.
    IBM Technical Disclosure Bulletin, 1976, 19 (05): 1892-1895
  • [9] Beyond Information Distortion: Imaging Variable-Length Time Series Data for Classification
    Lee, Hyeonsu
    Shin, Dongmin
    SENSORS, 2025, 25 (03)
  • [10] Matrix Profile X: VALMOD - Scalable Discovery of Variable-Length Motifs in Data Series
    Linardi, Michele
    Zhu, Yan
    Palpanas, Themis
    Keogh, Eamonn
    SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2018: 1053-1066