Research of an Improved Algorithm for Chinese Word Segmentation Dictionary Based on Double-Array Trie Tree

Authors
Yang, Wenchuan [1]
Liu, Jian [1]
Yu, Miao [1]
Affiliation
[1] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
Keywords
Double-Array; Trie Tree; Time Complexity; Word Segmentation Dictionary;
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
A Chinese word segmentation dictionary based on the Double-Array Trie Tree offers high search efficiency, but dynamic insertion consumes a lot of time. This paper presents an improved algorithm, iDAT, which is based on the Double-Array Trie Tree for Chinese word segmentation dictionaries. After initializing the original dictionary, we apply a hash process to the empty-sequence index values of the base array. The final hash table stores the sum of the empty sequences preceding the current empty sequence. The algorithm also adopts the Sunday jump algorithm from single-pattern matching. With a slight and reasonable increase in space cost, iDAT reduces the average time complexity of the dynamic insertion process in the Trie Tree. Practical results show that it has good operational performance.
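For readers unfamiliar with the underlying structure, the following is a minimal sketch of a plain double-array trie (a `base` array and a `check` array), not of iDAT itself. The class name `DoubleArrayTrie`, the character-code mapping, and the static build from a word list are all assumptions made for illustration; the record above does not give the paper's implementation. The naive scan in `_find_base` looks for a run of free slots, which is the kind of search the abstract says becomes expensive during insertion.

```python
class DoubleArrayTrie:
    """Static double-array trie built from a word list (illustrative only)."""

    def __init__(self, words):
        # Build an ordinary nested-dict trie first; the key "" marks end of word.
        root = {}
        for w in words:
            node = root
            for ch in w:
                node = node.setdefault(ch, {})
            node[""] = {}
        # Map every character to a positive code; code 1 is the end marker.
        chars = sorted({ch for w in words for ch in w})
        self.code = {"": 1}
        self.code.update({ch: i + 2 for i, ch in enumerate(chars)})
        # Index 0 is unused and index 1 is the root; check == -1 means "free".
        self.base = [0, 0]
        self.check = [0, 0]
        self._place(root, 1)

    def _grow(self, size):
        while len(self.base) < size:
            self.base.append(0)
            self.check.append(-1)

    def _find_base(self, codes):
        # Naive linear scan for a base value whose child slots are all free.
        b = 1
        while True:
            self._grow(b + max(codes) + 1)
            if all(self.check[b + c] == -1 for c in codes):
                return b
            b += 1

    def _place(self, node, state):
        if not node:
            return  # end-marker leaf: nothing to place
        codes = [self.code[ch] for ch in node]
        b = self._find_base(codes)
        self.base[state] = b
        for c in codes:
            self.check[b + c] = state  # claim all child slots before recursing
        for ch, child in node.items():
            self._place(child, b + self.code[ch])

    def contains(self, word):
        state = 1
        for ch in list(word) + [""]:
            c = self.code.get(ch)
            if c is None:
                return False
            nxt = self.base[state] + c
            if nxt >= len(self.check) or self.check[nxt] != state:
                return False
            state = nxt
        return True


dat = DoubleArrayTrie(["中国", "中国人", "中文", "分词"])
print(dat.contains("中国"), dat.contains("中"))  # True False
```

Each lookup step costs one addition and one comparison, which is why search is fast. The sketch avoids dynamic insertion altogether by building the arrays once; an approach like the one the abstract describes would instead need to track free slots so that new words can be placed without rescanning the arrays.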
Pages: 355-362
Number of pages: 8
Related papers
  • [1] Research of Chinese word segmentation based on Double-Array Trie
    School of Computer and Communication, Hunan Univ., Changsha 410082, China
    [J]. Hunan Daxue Xuebao, 2009, 5 (77-80):
  • [2] Study for the Double-array Trie Tree Based Algorithm in Word Segmentation
    Yang, Wenchuan
    Fang, Zeyang
    Li, Pengfei
    [J]. INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENVIRONMENTAL ENGINEERING (CSEE 2015), 2015, : 440 - 446
  • [3] Research of Chinese Segmentation Based on MMSeg and Double Array TRIE
    Xu, Lin
    Zhang, Qin
    Wang, Dandong
    Zhang, Jian
    [J]. ADVANCED RESEARCH ON AUTOMATION, COMMUNICATION, ARCHITECTONICS AND MATERIALS, PTS 1 AND 2, 2011, 225-226 (1-2): : 945 - +
  • [4] An Improved Chinese Segmentation Algorithm Based on Segmentation Dictionary
    Niu, Yan
    Li, Lala
    [J]. PROCEEDINGS OF THE 2009 INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT, VOL 1, 2009, : 184 - 187
  • [5] An Optimization Algorithm of Chinese Word Segmentation Based on Dictionary
    Tang, Jun
    Wu, Qing
    Li, Yinghong
    [J]. 2015 INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC), 2015, : 259 - 262
  • [6] Research on Improved Algorithm for Chinese Word Segmentation Based on Markov Chain
    Pang Baomao
    Shi Haoshan
    [J]. FIFTH INTERNATIONAL CONFERENCE ON INFORMATION ASSURANCE AND SECURITY, VOL 1, PROCEEDINGS, 2009, : 236 - 238
  • [7] The Design and Application of an Improved Word Segmentation Algorithm Based on PATRICIA Tree Dictionary in Communication Material
    Yang, Xiaomei
    [J]. ADVANCED RESEARCH ON AUTOMATION, COMMUNICATION, ARCHITECTONICS AND MATERIALS, III, 2013, 738 : 264 - 267
  • [8] An ambiguity discovery algorithm on Chinese word segmentation based dictionary
    Sun, Tieli
    Liu, Yanji
    Yang, Lehua
    Li, Zhiying
    Liu, Zhenghong
    [J]. PROCEEDINGS OF THE 2009 SECOND PACIFIC-ASIA CONFERENCE ON WEB MINING AND WEB-BASED APPLICATION, 2009, : 39 - 42
  • [9] The Research of Chinese Automatic Word Segmentation In Hierarchical Model Dictionary Binary Tree
    Luo XianGang
    Luo Jin
    Xie Zhong
    [J]. FIRST INTERNATIONAL WORKSHOP ON DATABASE TECHNOLOGY AND APPLICATIONS, PROCEEDINGS, 2009, : 321 - 324
  • [10] Chinese Word Segmentation Based on Improved Double Hashtable
    Shao, Hong
    Sun, Huayu
    Cui, Wencheng
    [J]. FIFTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2012): COMPUTER VISION, IMAGE ANALYSIS AND PROCESSING, 2013, 8783