A DICTIONARY RETRIEVAL ALGORITHM USING 2 TRIE STRUCTURES

Cited by: 1
Authors
MORIMOTO, K
IRIGUCHI, H
AOE, J
Affiliation
[1] Faculty of Engineering, Tokushima University, Tokushima
Keywords
NATURAL LANGUAGE PROCESSING; TRIE STRUCTURE; SEARCHING TECHNIQUE; DICTIONARY SEARCHING
DOI
10.1002/scj.4690260209
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A trie retrieves a key one character at a time, so retrieval is fast and its cost is independent of the total number of keys. Consequently, it is frequently used for searching natural language dictionaries and for other problems. A drawback, however, is that the number of trie states grows as the key set is enlarged, which requires more memory. To remedy this, the DAWG (Directed Acyclic Word Graph) has been proposed, in which the common suffixes of the trie are compressed. This raises a new problem: the record information can no longer be determined uniquely for a key. To address it, this paper introduces a new structure that reduces the number of states by merging the common suffixes of the trie while still determining the record information uniquely for each key. Algorithms for retrieval, insertion and deletion of keys are proposed for the structure. In the proposed method, the set of keys is represented using two tries. One trie stores, for each key, the shortest prefix that distinguishes it from all other keys. The other trie stores the remaining suffixes of the keys so that common suffixes can be merged. Simulations on various key sets, including Chinese characters (Kanji), alphabetic words and Japanese Katakana, show that the number of states is reduced by approximately 30 to 65 percent for a key set of 50,000 words, compared with an ordinary trie.
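The two-trie idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only builds a static structure and supports lookup, leaving out insertion and deletion. Several details are assumptions made for illustration: the names `Node`, `add_path`, `TwoTrie` and `ordinary_trie_states`, the end marker `#` (assumed never to occur in a key), and the choice to merge common suffixes by storing each suffix reversed in the second trie. The shortest unique prefix is found by a naive quadratic scan to keep the sketch short.

```python
class Node:
    def __init__(self, parent=None, char=""):
        self.children = {}
        self.parent = parent
        self.char = char      # label on the edge from the parent
        self.link = None      # front-trie leaf only: entry node in the rear trie
        self.record = None    # front-trie leaf only: record for the key


def add_path(root, text):
    """Insert text below root; return (last node, number of new states)."""
    node, created = root, 0
    for ch in text:
        if ch not in node.children:
            node.children[ch] = Node(node, ch)
            created += 1
        node = node.children[ch]
    return node, created


class TwoTrie:
    END = "#"  # assumed end marker; must not occur inside any key

    def __init__(self, records):
        keys = [k + self.END for k in records]
        self.front, self.rear = Node(), Node()
        self.states = 2  # the two roots
        for key in keys:
            # Shortest prefix shared with no other key (naive scan).
            n = next(i for i in range(1, len(key) + 1)
                     if sum(k.startswith(key[:i]) for k in keys) == 1)
            leaf, a = add_path(self.front, key[:n])
            # Reversed suffixes share states wherever keys end alike.
            leaf.link, b = add_path(self.rear, key[n:][::-1])
            leaf.record = records[key[:-1]]
            self.states += a + b

    def lookup(self, key):
        key += self.END
        node, i = self.front, 0
        while node.link is None:          # walk the front trie to a leaf
            if i == len(key) or key[i] not in node.children:
                return None
            node = node.children[key[i]]
            i += 1
        rear = node.link
        for ch in key[i:]:                # walk the rear trie up to its root
            if rear.parent is None or rear.char != ch:
                return None
            rear = rear.parent
        return node.record if rear.parent is None else None


def ordinary_trie_states(records):
    """States of a single trie over the same keys, for comparison."""
    root = Node()
    return 1 + sum(add_path(root, k + TwoTrie.END)[1] for k in records)


words = {"running": 1, "jumping": 2, "walking": 3, "talking": 4}
two = TwoTrie(words)
print(two.lookup("walking"), two.lookup("walkin"), two.lookup("walked"))
print(two.states, ordinary_trie_states(words))
```

Because each record hangs off a leaf of the first trie, it stays uniquely tied to its key even though the suffix states are shared. That is the property the abstract says a plain DAWG loses.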
Pages: 85-97
Number of pages: 13