Efficient Data Structures for Range Shortest Unique Substring Queries

Cited by: 3
Authors
Abedin, Paniz [1]
Ganguly, Arnab [2]
Pissis, Solon P. [3, 4]
Thankachan, Sharma V. [1]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
[2] Univ Wisconsin, Dept Comp Sci, Whitewater, WI 53190 USA
[3] CWI, Life Sci & Hlth, NL-1098 XG Amsterdam, Netherlands
[4] Vrije Univ, Ctr Integrat Bioinformat, NL-1081 HV Amsterdam, Netherlands
Funding
National Science Foundation (USA);
Keywords
shortest unique substring; suffix tree; heavy-light decomposition; range queries; geometric data structures; ALGORITHMS;
DOI
10.3390/a13110276
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Let T[1, n] be a string of length n and T[i, j] be the substring of T starting at position i and ending at position j. A substring T[i, j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given a string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T that efficiently answers online queries of the following type: given a range [alpha, beta], return a shortest substring T[i, j] of T with exactly one occurrence in [alpha, beta]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Omega(log n) is the word size. Our construction is based on a non-trivial reduction that allows us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an O(n)-word data structure with O(sqrt(n) log^epsilon n) query time, where epsilon > 0 is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
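To make the query in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the data structure from the paper and comes nowhere near the stated query times; it only illustrates what a query is asked to return. Two points are assumptions, since the abstract does not spell them out: an occurrence is counted as being "in [alpha, beta]" when its starting position lies in that range, and the returned substring may therefore end after beta. The function names `count_starts` and `range_sus_bruteforce` are hypothetical.

```python
def count_starts(text, pattern, lo, hi):
    """Count occurrences of pattern whose 1-based start position lies in [lo, hi].

    Assumption: an occurrence belongs to the range if it *starts* in it.
    """
    m = len(pattern)
    count = 0
    for start in range(lo, hi + 1):
        # A slice that runs past the end of text is shorter than pattern,
        # so it can never compare equal.
        if text[start - 1:start - 1 + m] == pattern:
            count += 1
    return count


def range_sus_bruteforce(text, alpha, beta):
    """Return (i, j), 1-based and inclusive, such that T[i, j] is a shortest
    substring with exactly one occurrence starting in [alpha, beta].

    Naive illustration only: tries lengths in increasing order and checks
    every candidate starting inside the range.
    """
    n = len(text)
    if not (1 <= alpha <= beta <= n):
        raise ValueError("expected 1 <= alpha <= beta <= len(text)")
    # The suffix starting at alpha always qualifies, so the search terminates.
    for length in range(1, n - alpha + 2):
        for i in range(alpha, beta + 1):
            j = i + length - 1
            if j > n:
                break  # later start positions would overrun the text too
            if count_starts(text, text[i - 1:j], alpha, beta) == 1:
                return (i, j)
    return None
```

For example, under these assumptions `range_sus_bruteforce("aaaa", 2, 3)` returns `(2, 4)`: the only substring with a single start in [2, 3] is "aaa", and it ends after beta. Setting alpha = 1 and beta = n recovers the (non-range) Shortest Unique Substring problem.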
Pages: 1 / 9
Number of pages: 9