Space-efficient algorithms for computing minimal/shortest unique substrings

Cited by: 0
Authors
Mieno, Takuya [1]
Köppl, Dominik [1,2]
Nakashima, Yuto [1]
Inenaga, Shunsuke [1,3]
Bannai, Hideo [1,4]
Takeda, Masayuki [1]
Affiliations
[1] Kyushu Univ, Dept Informat, Fukuoka, Japan
[2] Japan Soc Promot Sci, Tokyo, Japan
[3] Japan Sci & Technol Agcy, PRESTO, Saitama, Japan
[4] Tokyo Med & Dent Univ, Tokyo, Japan
Keywords
String processing algorithm; Shortest unique substring; Minimal unique substring; Compact data structure; Succinct representations; Suffix arrays; Shortest
DOI
10.1016/j.tcs.2020.09.017
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Given a string T of length n, a substring u = T[i..j] of T is called a shortest unique substring (SUS) for an interval [s, t] if (a) u occurs exactly once in T, (b) u contains the interval [s, t] (i.e., i ≤ s ≤ t ≤ j), and (c) every substring v of T with |v| < |u| containing [s, t] occurs at least twice in T. Given a query interval [s, t] ⊆ [1, n], the interval SUS problem is to output all the SUSs for the interval [s, t]. In this article, we propose a 4n + o(n) bits data structure answering an interval SUS query in output-sensitive O(occ) time, where occ is the number of returned SUSs. Additionally, we focus on the point SUS problem, which is the interval SUS problem for s = t. Here, we propose a ⌈(log_2 3 + 1)n⌉ + o(n) bits data structure answering a point SUS query in the same output-sensitive time. We also propose space-efficient algorithms for computing the minimal unique substrings of T. © 2020 Elsevier B.V. All rights reserved.
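The SUS definition in the abstract can be illustrated with a brute-force sketch. This is not the compact data structure the article proposes: it simply checks conditions (a) to (c) directly, in polynomial time and with no space guarantees. The function names `count_occurrences` and `interval_sus` are hypothetical and chosen only for this illustration; positions are 1-based and inclusive, as in the abstract.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def interval_sus(text: str, s: int, t: int) -> list[tuple[int, int]]:
    """Return every SUS for the interval [s, t] as 1-based inclusive (i, j) pairs.

    Naive illustration of the definition only (an assumption of this sketch,
    not the article's algorithm): candidate lengths are tried in increasing
    order, so the first length that yields a unique substring is the SUS length.
    """
    n = len(text)
    if not (1 <= s <= t <= n):
        raise ValueError("query interval must satisfy 1 <= s <= t <= n")

    for length in range(t - s + 1, n + 1):
        # Start positions i with i <= s and j = i + length - 1 >= t, inside T.
        lowest_start = max(1, t - length + 1)
        highest_start = min(s, n - length + 1)
        hits = []
        for i in range(lowest_start, highest_start + 1):
            j = i + length - 1
            candidate = text[i - 1:j]  # T[i..j] in 0-based slicing
            if count_occurrences(text, candidate) == 1:  # condition (a)
                hits.append((i, j))
        if hits:
            # Condition (c) holds because no shorter length produced a hit.
            return hits
    return []  # not reached: T itself occurs exactly once and contains [s, t]
```

For example, in T = "banana" the point query s = t = 4 returns the single SUS T[3..5] = "nan", while in T = "aab" the point query s = t = 2 returns two SUSs, T[1..2] = "aa" and T[2..3] = "ab", so occ = 2.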
Pages: 230-242
Page count: 13