Study for the Double-array Trie Tree Based Algorithm in Word Segmentation

Cited by: 0
Authors
Yang, Wenchuan [1 ]
Fang, Zeyang [1 ]
Li, Pengfei [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
Keywords
double-array; trie tree; time complexity; word segmentation dictionary
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper presents iDAT, an improved algorithm based on the Double-Array Trie for Chinese word segmentation dictionaries. The algorithm is applied after the original dictionary has been initialized. A Chinese word segmentation dictionary based on the Double-Array Trie offers efficient search, but dynamic insertion consumes a lot of time. We apply a hash process to the index values of the empty sequences in the base array. The final hash table stores, for each empty sequence, the sum of the empty sequences that precede it. The algorithm also adopts the jump strategy of the Sunday single-pattern matching algorithm. At the cost of a slight and reasonable increase in space, iDAT reduces the average time complexity of dynamic insertion into the trie. Experimental results show that it performs well in practice.
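For readers unfamiliar with the underlying structure, the following is a minimal sketch of a plain double-array trie, not the iDAT algorithm itself. It shows the standard `base`/`check` transition test and the linear scan for free slots that makes insertion expensive, which appears to be the step the abstract's hash over empty sequences is meant to speed up. The class name `DoubleArrayTrie`, the compact character codes, the `END` marker, and the forward maximum matching routine `segment` are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque

END = 1  # code reserved for the end-of-word marker (assumption of this sketch)


class DoubleArrayTrie:
    """Static double-array trie: child t of state s via code c satisfies
    t = base[s] + c and check[t] == s."""

    def __init__(self, words):
        # Compact character codes 2, 3, ... in order of first appearance.
        self.code = {}
        for w in words:
            for ch in w:
                self.code.setdefault(ch, len(self.code) + 2)
        self.base = [0, 0]
        self.check = [-1, 0]  # -1 marks a free slot; state 1 is the root
        self._build(words)

    def _build(self, words):
        # Build an ordinary nested-dict trie first, then lay it out breadth-first.
        root = {}
        for w in words:
            node = root
            for ch in w:
                node = node.setdefault(self.code[ch], {})
            node[END] = {}
        queue = deque([(1, root)])
        while queue:
            state, children = queue.popleft()
            if not children:
                continue
            b = self._find_base(children)
            self.base[state] = b
            for c, sub in children.items():
                self.check[b + c] = state
                queue.append((b + c, sub))

    def _find_base(self, codes):
        # Naive linear scan for a base value whose child slots are all free.
        # This scan is the costly part of insertion in a plain double array.
        b = 1
        while True:
            need = b + max(codes)
            while len(self.check) <= need:
                self.base.append(0)
                self.check.append(-1)
            if all(self.check[b + c] == -1 for c in codes):
                return b
            b += 1

    def _step(self, s, c):
        t = self.base[s] + c
        if t < len(self.check) and self.check[t] == s:
            return t
        return None

    def contains(self, word):
        s = 1
        for ch in word:
            c = self.code.get(ch)
            s = self._step(s, c) if c is not None else None
            if s is None:
                return False
        return self._step(s, END) is not None

    def longest_match(self, text, start):
        """Length of the longest dictionary word starting at text[start], or 0."""
        s, best = 1, 0
        for i in range(start, len(text)):
            c = self.code.get(text[i])
            s = self._step(s, c) if c is not None else None
            if s is None:
                break
            if self._step(s, END) is not None:
                best = i - start + 1
        return best


def segment(trie, text):
    """Forward maximum matching; unknown characters become single tokens."""
    out, i = [], 0
    while i < len(text):
        n = trie.longest_match(text, i) or 1
        out.append(text[i:i + n])
        i += n
    return out


trie = DoubleArrayTrie(["北京", "北京大学", "大学", "学生"])
print(trie.contains("北京"), trie.contains("北京大"))
print(segment(trie, "北京大学生"))
```

Lookup costs one array access per character, which is why search is fast. The cost sits in `_find_base`, which may probe many occupied slots before finding a fit, so a structure that tracks free slots can plausibly reduce that work.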
Pages: 440-446
Number of pages: 7